The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can contact the EGI SDIS team at operations @ egi.eu.

Difference between revisions of "Adding Custom Service to Availability Monitoring"

 
(25 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{| style="border:1px solid black; background-color:lightgrey; color: black; padding:5px; font-size:140%; width: 90%; margin: auto;"
| style="padding-right: 15px; padding-left: 15px;" |
|[[File:Alert.png]] This article is '''Deprecated''' and should no longer be used, but is still available for reasons of reference.
|}
[[Category:Deprecated]]
{{TOC_right}}
{{TOC_right}}


= Introduction =
= Introduction =
The [[SAM | Service Availability Monitoring (SAM) ]] system is used to monitor the resources within the
production infrastructure. SAM monitoring data is used for calculation of availability and reliability of
grid sites. It includes the following components:
*test execution framework based on the open source monitoring framework Nagios and the [https://tomtools.cern.ch/confluence/display/SAM/NCG Nagios Configuration Generator (NCG)]
*database components which contain topology (gathered from GOCDB and other sources), profiles (mapping between service types and tests), test results and availability and reliability of sites and services
*visualization portal [http://grid-monitoring.egi.eu/myegi/atp/se/ MyEGI] which enables users to access current status, history and availability of monitored sites and services
*programmatic interface which enables other tools (e.g. Operations Portal, VO dashboards) to access test results and availability and reliability of sites and services
*probes used to test monitored services. These probes are provided by middleware developers and third parties (e.g. NGIs, Nagios community).
'''SAM architecture'''
[[File:Sam_architecture.png|500px]]


Operational tools such as the [[GOCDB |GOCDB management system]] and the [[SAM | SAM monitoring system]] are key
Operational tools such as the [[GOCDB |GOCDB management system]] and the [[SAM | SAM monitoring system]] are key
Line 12: Line 34:
*Service Endpoint: is a deployed instance of a named service type
*Service Endpoint: is a deployed instance of a named service type
*Endpoint Location: a Service Endpoint may optionally define an Endpoint Location which locates the service (URL).
*Endpoint Location: a Service Endpoint may optionally define an Endpoint Location which locates the service (URL).
SAM - The Service Availability Monitoring (SAM) system is used to monitor the resources within the production infrastructure. SAM monitoring data is used for calculation of availability and reliability of grid sites.
The Service Availability Monitoring (SAM) [SAM] system is used to monitor the resources within the
production infrastructure. SAM monitoring data is used for calculation of availability and reliability of
grid sites. It includes the following components:
*test execution framework based on the open source monitoring framework Nagios and the [https://tomtools.cern.ch/confluence/display/SAM/NCG Nagios Configuration Generator (NCG)]
*databases which contain topology (gathered from GOCDB and other sources), profiles (mapping between service types and tests), test results and availability and reliability of sites and services
*visualization portal [http://grid-monitoring.egi.eu/myegi/atp/se/ MyEGI] which enables users to access current status, history and availability of monitored sites and services
*programmatic interface which enables other tools (e.g. Operations Portal, VO dashboards) to access test results and availability and reliability of sites and services
*probes used to test monitored services which are provided by middleware developers and third parties (e.g. NGIs, Nagios community).




Line 30: Line 40:
= Process =
= Process =


1. Register your service instance (hostname / service endpoint) in the GOCDB.
{| class="wikitable" style="width:100%;
 
!style="width:50%" align=left|Monitoring requestor
Note: if your service does not much any of the [[GOCDB/Input_System_User_Documentation#Service_types | existing service types in GOCDB]], please [[ GOCDB/Input_System_User_Documentation#Adding_new_services_types | request a new service type ]]. <span style="color:red">(Service Types descriptions are meaningless to the people who did not had any relations with a middleware e.g. i want to monitor web service which is http, which service type should i choose ? Are there any generic service types available at all ? e.g. http-check). Where are Service Types and Nagios probes mapping ? </span>
!style="width:50%" align=left|EGI Operations
 
|-
2. Choose from the set of currently available probes for your service from the [[ SAM#Probes | list]]. If none are suitable, you will have to develop the CUSTOM probes for your service. Naming example: CUSTOM.<domain>.<subdomain>.<test_name>, for more examples take a look at Custom Service Types section [[GOCDB/Input_System_User_Documentation#Adding_new_services_types | here]]. How to develop new probes, please have a look [https://tomtools.cern.ch/confluence/display/SAMDOC/Probes+Development| here]. After your probes are developed, please follow [[PROC07 | this procedure]] to enable them at [http://grid-monitoring.egi.eu/myegi/atp/se/ MyEGI portal].<span style="color:red">(Probes descriptions are not available, how new people can choose the suitable probes ?) </span>
| 1. Register at [http://goc.egi.eu/ GOCDB] and request a new custom service type following this procedure [[GOCDB/Input_System_User_Documentation#Adding_new_services_types| here]].<br>
 
|
Note: if your service has one hostname / service endpoint and you would like to test multiple functionality e.g. http-check and db-check, please make sure that another custom service you will add in the future for the same created service type also provides same multiple functionality, because service types are associated with a set of probes.
|-
 
|
 
| 2. [[OMB |OMB]] validates the request.<br>
{| class="wikitable"
!Service provider
!Operations
|-
|-
| 1. Register your service endpoint instance in GOCDB.<br>
| 3. Develop the custom service probes for your service. How to develop new probes, please have a look [https://tomtools.cern.ch/confluence/display/SAMDOC/Probes+Development here]. After your probes are developed, please follow [[PROC07 | this procedure]] for submitting them for validation and integration into SAM framework next release.<br>
| 1. Validates the service endpoint registration.<br>
|
|-
|-
| 2. Find the suitable service type in GOCDB.<br>
|  
2.1 If your service does not mach any of the [[GOCDB/Input_System_User_Documentation#Service_types | existing service types at GOCDB]], please [[ GOCDB/Input_System_User_Documentation#Adding_new_services_types | request a new service type ]].
| 4. [[OTAG | OTAG]] validates the request.
| 2. If an existing service type was chosen, subsequent operations are automatic.<br>
2.1 If a new service type was requested, it will be reviewed by OMB and OTAG.<br>
|-
|-
| 3. If an existing GOCDB service type was suitable, please set option "Monitoring" to start receiving probes.
| 5. You can see your service being tested by the probes at [http://grid-monitoring.egi.eu/myegi/atp/se/ MyEGI portal].  
3.1 If a new service type was requested, choose from the set of currently available probes for your service from the [[ SAM#Probes | list]].<br>
|
3.2 If you can not find any suitable probes, you will have to develop the CUSTOM probes for your service. Naming example: CUSTOM.<domain>.<subdomain>.<test_name>, for more examples take a look at Custom Service Types section [[GOCDB/Input_System_User_Documentation#Adding_new_services_types | here]]. How to develop new probes, please have a look [https://tomtools.cern.ch/confluence/display/SAMDOC/Probes+Development| here]. After your probes are developed, please follow [[PROC07 | this procedure]] to enable them at [http://grid-monitoring.egi.eu/myegi/atp/se/ MyEGI portal].<br>
| 3. Subsequent operations are automatic.<br>
3.1 If a new service type was requested and existing probes in SAM framework were chosen, they have to be enabled in a profile of local NGI based on service location<br>
3.2 If new probes were submitted, operations will start [[PROC07 | this procedure]].<br>
|}
|}
= Examples =

Latest revision as of 11:34, 13 August 2021

This article is Deprecated and should no longer be used, but is still available for reasons of reference.


Introduction

The Service Availability Monitoring (SAM) system is used to monitor the resources within the production infrastructure. SAM monitoring data is used for calculation of availability and reliability of grid sites. It includes the following components:

  • test execution framework based on the open source monitoring framework Nagios and the Nagios Configuration Generator (NCG)
  • database components which contain topology (gathered from GOCDB and other sources), profiles (mapping between service types and tests), test results and availability and reliability of sites and services
  • visualization portal MyEGI which enables users to access current status, history and availability of monitored sites and services
  • programmatic interface which enables other tools (e.g. Operations Portal, VO dashboards) to access test results and availability and reliability of sites and services
  • probes used to test monitored services. These probes are provided by middleware developers and third parties (e.g. NGIs, Nagios community).


[Figure: SAM architecture]


Operational tools such as the GOCDB management system and the SAM monitoring system are key software components for the reliable and stable operation/monitoring of the infrastructure.

GOCDB - The Grid Configuration Database (GOCDB) contains general information about the sites participating in the production Grid. It is accessed by all project actors (end users, site managers, NGI managers, support teams, VO managers), by other tools, and by third-party middleware to obtain the Grid topology. The portal has a single central installation.

Services registered in GOCDB are described with the following information:

  • Service Type: a unique name that identifies the type of software component deployed on a Grid.
  • Service Endpoint: a deployed instance of a named service type.
  • Endpoint Location: a Service Endpoint may optionally define an Endpoint Location which locates the service (URL).
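
The three pieces of information above can be pictured as a small record. The following is only an illustrative sketch in Python, not GOCDB's actual schema or API: the class name, the field names, the `hostname` field, and all example values are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceEndpoint:
    """A deployed instance of a named service type (illustrative model only)."""
    service_type: str                        # unique name of the service type
    hostname: str                            # assumed field: host running the instance
    endpoint_location: Optional[str] = None  # optional URL that locates the service


def describe(endpoint: ServiceEndpoint) -> str:
    """Return a one-line summary, falling back to the hostname when no URL is set."""
    location = endpoint.endpoint_location or endpoint.hostname
    return f"{endpoint.service_type} @ {location}"


# Hypothetical values, not real GOCDB entries.
web = ServiceEndpoint("CUSTOM.example.web", "portal.example.org", "https://portal.example.org/")
db = ServiceEndpoint("CUSTOM.example.db", "db.example.org")
print(describe(web))
print(describe(db))
```

The Endpoint Location is modelled as optional because the description above says a Service Endpoint "may optionally" define one.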


MS421 document (Reference)

Process

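The process alternates between the monitoring requestor and EGI Operations:

1. Monitoring requestor: Register at GOCDB (http://goc.egi.eu/) and request a new custom service type, following the procedure described in GOCDB/Input_System_User_Documentation#Adding_new_services_types.
2. EGI Operations: OMB validates the request.
3. Monitoring requestor: Develop the custom probes for your service, following the probe development guide (https://tomtools.cern.ch/confluence/display/SAMDOC/Probes+Development). Once the probes are developed, follow procedure PROC07 to submit them for validation and integration into the next release of the SAM framework.
4. EGI Operations: OTAG validates the request.
5. Monitoring requestor: Check that your service is being tested by the probes in the MyEGI portal (http://grid-monitoring.egi.eu/myegi/atp/se/).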
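
As a rough idea of what a custom probe from step 3 might look like, here is a minimal sketch of an HTTP check written as a Nagios-style plugin in Python. It assumes the probe follows the usual Nagios plugin convention of reporting state through the exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). The function names and the policy of treating 4xx responses as WARNING are choices made for this illustration and are not taken from the SAM probe development guide.

```python
import urllib.error
import urllib.request
from typing import List, Tuple

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(status_code: int) -> Tuple[int, str]:
    """Map an HTTP status code to a Nagios state and a one-line message."""
    if 200 <= status_code < 400:
        return OK, f"OK - HTTP {status_code}"
    if 400 <= status_code < 500:
        # Illustrative policy: client errors are reported as WARNING.
        return WARNING, f"WARNING - HTTP {status_code}"
    return CRITICAL, f"CRITICAL - HTTP {status_code}"


def probe(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch the URL once and return (exit_code, message)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify(response.status)
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code of a 4xx/5xx response.
        return classify(err.code)
    except (urllib.error.URLError, OSError) as err:
        return CRITICAL, f"CRITICAL - {err}"
    except ValueError as err:
        # Raised for malformed URLs, e.g. a missing scheme.
        return UNKNOWN, f"UNKNOWN - {err}"


def main(argv: List[str]) -> int:
    """Entry point: expects the URL to test as the only argument."""
    if len(argv) != 2:
        print("UNKNOWN - usage: probe.py <url>")
        return UNKNOWN
    code, message = probe(argv[1])
    print(message)
    return code

# When saved as a script, finish with: sys.exit(main(sys.argv))
```

Keeping `classify` separate from the network call makes the state mapping easy to test without contacting a real service.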

Examples