The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team using operations @ egi.eu.

PROC06 Setting Nagios test status to operations

Procedure for setting global Nagios tests critical DRAFT

The purpose of this document is to describe clearly the actions and the relevant steps to be undertaken for setting Nagios tests critical. This procedure applies only to the OPS VO, and its scope is global.

Revision history

Version  Authors               Date        Comments
0.2      Małgorzata Krakowian  23.09.2010  Add comments from discussion in Amsterdam EGI TF.
0.1      Małgorzata Krakowian              First draft

Setting Nagios tests critical request

  • The request should be submitted to the Chief Operations Officer.
  • Anyone may submit a request.
  • COD has to agree, after checking that the test is safe for the infrastructure.
  • Once COD validates the request, the OMB is informed that a new Nagios test will become critical for the OPS VO.

How to start the process

  • The Chief Operations Officer opens a GGUS ticket to COD to start the process.
  • The Central Operator on Duty (COD) team, which is in charge of EGI oversight, is responsible for processing the request ticket.

Prerequisites

Before the GGUS ticket is opened, the test should be implemented and approved by the Nagios team.

Any fundamental change to a probe requires certification.

Setting Nagios tests critical steps

As a general rule, open tickets must be closed before the process can move on to the next step.
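The rule above can be sketched as a small gating check. This is a minimal illustration, not part of any GGUS or Nagios tooling: it assumes that each step is tracked by its own ticket, and the `Step` class, its `closed` flag, and the functions `can_start` and `next_open_step` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    number: int
    action_on: str
    action: str
    closed: bool = False  # assumption: one ticket per step, tracked as a flag


def can_start(steps: List[Step], number: int) -> bool:
    """A step may start only if the ticket of every earlier step is closed."""
    return all(s.closed for s in steps if s.number < number)


def next_open_step(steps: List[Step]) -> Optional[Step]:
    """Return the lowest-numbered step whose ticket is still open, if any."""
    for step in sorted(steps, key=lambda s: s.number):
        if not step.closed:
            return step
    return None


if __name__ == "__main__":
    steps = [
        Step(1, "Nagios", "Add test to official Nagios package", closed=True),
        Step(2, "NGIs", "Nagios update"),
        Step(3, "NGIs", "ROD verification"),
    ]
    print(can_start(steps, 2))  # step 1 is closed, so step 2 may start
    print(can_start(steps, 3))  # step 2 is still open, so step 3 must wait
```

Keeping the check separate from the step data makes it easy to apply the same rule whatever system actually stores the ticket states.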

Steps:

Step  Action on  Action
1     Nagios     Add the test to the official Nagios package.
2     NGIs       Update Nagios.
3     NGIs       Ask the ROD teams to verify that the test is acceptable, meaning that 75% of affected nodes should be OK.
4     COD        COD broadcasts the information.
                 (This broadcast should be sent to VO managers and NOC/ROC managers.)
                 See the template below for an indication of the message content.

                 Subject:

                 Dear All,

                 We would like to announce that test XXX will become critical XXX

                 Best regards,
5     COD        Add the test to the critical tests list.
6     COD        Final check. Close the parent ticket.
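The acceptance criterion in step 3 can be sketched as a simple ratio check. This is an illustration only: the status strings, the node names, the `is_test_acceptable` function, and the reading of the threshold as "at least 75%" are assumptions. In practice the results would come from the monitoring data rather than a hard-coded dictionary.

```python
from typing import Mapping

ACCEPTANCE_THRESHOLD = 0.75  # from step 3: 75% of affected nodes should be OK


def is_test_acceptable(results: Mapping[str, str],
                       threshold: float = ACCEPTANCE_THRESHOLD) -> bool:
    """Return True if the share of nodes reporting OK meets the threshold.

    `results` maps a node name to its test status (hypothetical format).
    """
    if not results:
        return False  # no affected nodes reported, so nothing can be verified
    ok_count = sum(1 for status in results.values() if status == "OK")
    return ok_count / len(results) >= threshold


if __name__ == "__main__":
    sample = {
        "node-a.example.org": "OK",
        "node-b.example.org": "OK",
        "node-c.example.org": "OK",
        "node-d.example.org": "CRITICAL",
    }
    # 3 of 4 nodes are OK, which is exactly 75%
    print(is_test_acceptable(sample))
```

Treating an empty result set as not acceptable is a deliberate choice in this sketch: a test that has not run anywhere gives no evidence that it is safe to make critical.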

Requirements to be implemented