The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team using operations @ egi.eu.

Adding Custom Service to Availability Monitoring
Latest revision as of 12:34, 13 August 2021

This article is deprecated and should no longer be used, but is still available for reference.


Introduction

The Service Availability Monitoring (SAM) system is used to monitor the resources within the production infrastructure. SAM monitoring data is used to calculate the availability and reliability of grid sites. SAM includes the following components:

  • test execution framework based on the open source monitoring framework Nagios and the Nagios Configuration Generator (NCG)
  • database components which contain topology (gathered from GOCDB and other sources), profiles (mapping between service types and tests), test results and availability and reliability of sites and services
  • visualization portal MyEGI which enables users to access current status, history and availability of monitored sites and services
  • programmatic interface which enables other tools (e.g. Operations Portal, VO dashboards) to access test results and availability and reliability of sites and services
  • probes used to test monitored services. These probes are provided by middleware developers and third parties (e.g. NGIs, Nagios community).


[Figure: SAM architecture]


Operational tools such as the GOCDB management system and the SAM monitoring system are key software components for the reliable and stable operation and monitoring of the infrastructure.

GOCDB - The Grid Configuration Database (GOCDB) contains general information about the sites participating in the production Grid. It is accessed by all the project actors (end users, site managers, NGI managers, support teams, VO managers), by other tools and by third-party middleware in order to obtain the Grid topology. The portal has a single central installation.

Services registered in GOCDB are described with the following information:

  • Service Type: a unique name that identifies the type of software component deployed on a Grid.
  • Service Endpoint: a deployed instance of a named service type.
  • Endpoint Location: a URL that locates the service; a Service Endpoint may optionally define one.


MS421 document (Reference)

Process

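Each step is carried out either by the monitoring requestor or by EGI Operations, as marked:

  1. Monitoring requestor: Register at GOCDB (http://goc.egi.eu/) and request a new custom service type, following the procedure in GOCDB/Input_System_User_Documentation#Adding_new_services_types.
  2. EGI Operations: OMB validates the request.
  3. Monitoring requestor: Develop the probes for your custom service; see the probe development guide at https://tomtools.cern.ch/confluence/display/SAMDOC/Probes+Development. Once the probes are developed, follow the PROC07 procedure to submit them for validation and integration into the next release of the SAM framework.
  4. EGI Operations: OTAG validates the request.
  5. Monitoring requestor: Check in the MyEGI portal (http://grid-monitoring.egi.eu/myegi/atp/se/) that your service is being tested by the probes.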

Examples
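
Since the SAM test execution framework is based on Nagios, a custom probe can be pictured as a Nagios-style check: it prints one status line and reports the result through the standard Nagios return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The following is a minimal sketch under that assumption, not an official SAM probe; the function names, the script name and the 5-second warning threshold are hypothetical, and the SAM probe development guidelines may impose further requirements.

```python
import time
import urllib.error
import urllib.request

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(status, elapsed, warn_after=5.0):
    """Map an HTTP status code and response time to (Nagios code, message)."""
    if status is None:
        return CRITICAL, "CRITICAL - no response from service"
    if status >= 400:
        return CRITICAL, f"CRITICAL - HTTP {status}"
    if elapsed > warn_after:
        return WARNING, f"WARNING - HTTP {status} in {elapsed:.1f}s"
    return OK, f"OK - HTTP {status} in {elapsed:.1f}s"


def check(url, timeout=10.0):
    """Fetch the URL once and classify the outcome."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code              # server answered with an error status
    except ValueError:
        return UNKNOWN, "UNKNOWN - invalid URL"
    except OSError:                    # includes URLError and socket timeouts
        status = None
    return classify(status, time.monotonic() - start)


def main(argv):
    """Probe entry point: print one status line, return the Nagios code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_custom_http.py URL")
        return UNKNOWN
    code, message = check(argv[1])
    print(message)
    return code
```

When saved as a script, the probe would end with `sys.exit(main(sys.argv))` (after `import sys`) so that Nagios receives the return code as the process exit status.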