The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations @ egi.eu.

Adding Custom Service to Availability Monitoring

Latest revision as of 11:34, 13 August 2021

This article is deprecated and should no longer be used; it remains available for reference.


Introduction

The Service Availability Monitoring (SAM) system is used to monitor the resources within the production infrastructure. SAM monitoring data is used to calculate the availability and reliability of grid sites. SAM includes the following components:

  • test execution framework based on the open source monitoring framework Nagios and the Nagios Configuration Generator (NCG)
  • database components which contain topology (gathered from GOCDB and other sources), profiles (mapping between service types and tests), test results and availability and reliability of sites and services
  • visualization portal MyEGI which enables users to access current status, history and availability of monitored sites and services
  • programmatic interface which enables other tools (e.g. Operations Portal, VO dashboards) to access test results and availability and reliability of sites and services
  • probes used to test monitored services. These probes are provided by middleware developers and third parties (e.g. NGIs, Nagios community).


Figure: SAM architecture


Operational tools such as the GOCDB management system and the SAM monitoring system are key software components for the reliable and stable operation and monitoring of the infrastructure.

GOCDB, the Grid Configuration Database, contains general information about the sites participating in the production Grid. It is accessed by all project actors (end users, site managers, NGI managers, support teams, VO managers), by other tools, and by third-party middleware to obtain the Grid topology. The portal has a single central installation.

Services registered in GOCDB are described with the following information:

  • Service Type: a unique name that identifies the type of software component deployed on a Grid.
  • Service Endpoint: a deployed instance of a named service type.
  • Endpoint Location: a Service Endpoint may optionally define an Endpoint Location which locates the service (URL).
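
The three pieces of information above can be sketched as a small record type. This is a minimal illustration, not GOCDB's schema: the class name, the field names, and the `hostname` field are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceEndpoint:
    """A deployed instance of a named service type (field names are assumptions)."""

    service_type: str  # unique name identifying the kind of software component
    hostname: str  # assumed identifier of the host running the instance
    endpoint_location: Optional[str] = None  # optional URL locating the service

    def target(self) -> str:
        """Prefer the Endpoint Location URL; fall back to the hostname."""
        return self.endpoint_location or self.hostname
```

For example, `ServiceEndpoint("eu.example.web-portal", "portal.example.org").target()` falls back to the hostname because no Endpoint Location was given.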


MS421 document (Reference): https://documents.egi.eu/public/ShowDocument?docid=1308

Process

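The steps alternate between the monitoring requestor and EGI Operations:

1. Monitoring requestor: Register at GOCDB (http://goc.egi.eu/) and request a new custom service type following the procedure in GOCDB/Input_System_User_Documentation#Adding_new_services_types.
2. EGI Operations: OMB validates the request.
3. Monitoring requestor: Develop the custom probes for your service. For guidance on developing new probes, see https://tomtools.cern.ch/confluence/display/SAMDOC/Probes+Development. Once the probes are developed, follow the PROC07 procedure to submit them for validation and integration into the next release of the SAM framework.
4. EGI Operations: OTAG validates the request.
5. Monitoring requestor: You can see your service being tested by the probes at the MyEGI portal (http://grid-monitoring.egi.eu/myegi/atp/se/).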
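
As a rough idea of what a custom probe might look like, here is a minimal sketch of an HTTP check that follows the standard Nagios plugin convention of one status line plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It is not taken from the SAM probe development guide: the function names, the 5-second warning threshold, and the command-line shape are assumptions.

```python
import time
import urllib.error
import urllib.request

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def classify(http_status: int, elapsed: float, warn_after: float = 5.0):
    """Map an HTTP status and response time to a (state, status line) pair."""
    if http_status >= 400:
        return CRITICAL, f"CRITICAL - HTTP {http_status}"
    if elapsed > warn_after:  # assumed threshold, tune per service
        return WARNING, f"WARNING - HTTP {http_status} in {elapsed:.2f}s"
    return OK, f"OK - HTTP {http_status} in {elapsed:.2f}s"


def check(url: str, timeout: float = 10.0):
    """Fetch the URL once and return (exit_code, status_line)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:  # 4xx/5xx responses are raised as errors
        status = err.code
    except (urllib.error.URLError, OSError) as err:  # DNS failure, refusal, timeout
        return CRITICAL, f"CRITICAL - {err}"
    return classify(status, time.monotonic() - start)


def main(argv) -> int:
    """Print one status line and return the Nagios exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_http_sketch.py <url>")
        return UNKNOWN
    code, message = check(argv[1])
    print(message)
    return code
```

To use the sketch as a plugin, the file would end with `import sys` followed by `sys.exit(main(sys.argv))`. Keeping `classify` separate from the network call makes the state logic easy to test without contacting a real service.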

Examples