The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations@egi.eu.

Difference between revisions of "PROC06 Setting Nagios test status to operations"

|[[File:Alert.png]] This page is '''Deprecated'''; the content has been moved to https://confluence.egi.eu/display/EGIPP/PROC06+Setting+Nagios+test+status+to+operations   
|}
<br> {{Ops_procedures
|Doc_title = Setting Nagios test status to operations
|Doc_link = [[PROC06 |https://wiki.egi.eu/wiki/PROC06]]
|Version = 2020-03-02
|Policy_acronym = OMB
|Policy_name = Operations Management Board
|Contact_group = operations@egi.eu
|Doc_status = Approved
|Approval_date = 2020-02-20
|Procedure_statement = This document describes the actions and related steps for including Nagios tests in the [https://poem.egi.eu/ui/metricprofiles/ARGO_MON_OPERATORS ARGO_MON_OPERATORS] profile, so that the operations dashboard displays an alarm when the test fails.
|Owner = Alessandro Paolini
}}
= Overview  =
This document describes the actions and related steps for including Nagios tests in the [https://poem.egi.eu/poem/admin/poem/profile/3/change/ ARGO_MON_OPERATORS] profile, so that the operations dashboard displays an alarm when the test fails.
'''This procedure applies only to tests run under the OPS VO. Its scope is global: it applies to all Operations Centres in the EGI project.'''
<br>
= Definitions  =
Please refer to the [[Glossary|EGI Glossary]] for the definitions of the terms used in this procedure.<br>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
<br>
= Steps  =
== Prerequisites  ==
The ARGO test needs to meet the following requirements.
#It satisfies quality criteria in agreement with the UMD operational capabilities quality criteria: https://documents.egi.eu/document/240.
#It is properly documented.
#It must be part of an official Nagios release.
#It must have been deployed in production for at least one month without problems.
#It must be available for validation by Operations.
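As an illustration only, the checklist above can be expressed as a small Python sketch. The <code>ProbeCandidate</code> class, its field names, and the reading of "one month" as 30 days are assumptions made for this example; they are not part of the procedure or of any ARGO tooling.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "at least one month" is approximated as 30 days.
MIN_PRODUCTION_PERIOD = timedelta(days=30)


@dataclass
class ProbeCandidate:
    """Hypothetical record describing a test proposed as an operations test."""
    name: str
    meets_quality_criteria: bool    # prerequisite 1
    documentation_url: str          # prerequisite 2 (empty string = undocumented)
    in_official_release: bool       # prerequisite 3
    production_since: date          # prerequisite 4
    problems_in_production: bool    # prerequisite 4
    available_for_validation: bool  # prerequisite 5


def unmet_prerequisites(probe: ProbeCandidate, today: date) -> list[str]:
    """Return a human-readable list of the prerequisites the probe fails."""
    unmet = []
    if not probe.meets_quality_criteria:
        unmet.append("quality criteria not satisfied")
    if not probe.documentation_url:
        unmet.append("documentation missing")
    if not probe.in_official_release:
        unmet.append("not part of an official Nagios release")
    too_recent = today - probe.production_since < MIN_PRODUCTION_PERIOD
    if too_recent or probe.problems_in_production:
        unmet.append("less than one month in production without problems")
    if not probe.available_for_validation:
        unmet.append("not available for validation by Operations")
    return unmet
```

An empty list means that every prerequisite in the sketch is met; otherwise the returned strings could be pasted into the request as open points.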
== Sending a request  ==
*Anybody can submit the request for making the test an operations test.
*The request should be submitted to Operations via a GGUS ticket.
== Validation  ==
{| class="wikitable"
|-
! Step
! Action on
! Action
|-
| 1
| Applicant
| Opens a [https://ggus.eu/ GGUS] ticket to Operations to start the process. <pre>Subject: Request for setting XXX test as an operations test
Dear Operations,
We would like to request that XXX test be set as an operations test
Prerequisite data:
* name of Nagios probe:
* name of service on which the test runs:
* link to documentation page:
* motivation (which part of the infrastructure will be improved by making XXX test
an operations test, or a description of users' problems which will be avoided in
future; provide a list of GGUS tickets if possible)
Best Regards
XXX
</pre>
|-
| 2
| Operations
| Checks the status of the Nagios probe to see if it meets the specified quality criteria.
|-
| 3
| Operations
| Contacts the OMB to request approval of the new operations test. A date is specified (at least 1 month in the future).
|-
| 4
| NGIs
| Ask the ROD teams to try to bring the test to OK status. A total of 75% OK across the entire EGI is the threshold for passing to the next step. If it is not possible to proceed, report the problems to the OMB.
|-
| 5
| Operations
| Reassigns the ticket to "Monitoring (ARGO)", agreeing on the date for the inclusion of the test in the operations profile.
|-
| 6
| Operations
| Announces the new operations test in the Monthly broadcast.
This broadcast should be sent to site managers, NGI managers and ROD teams. See the template below for an indication of the message content.
<pre>Subject: XXX has been added to the EGI Operations Profile on XXX
Dear All,
We would like to announce that test XXX will become operational on XXX
Short description of the test:
The documentation can be found:
Best regards,
</pre>
|-
| 7
| Operations
| Performs a final check and closes the parent ticket.
|}
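The 75% threshold in step 4 can be illustrated with a short calculation. This is a sketch under stated assumptions: the status labels, the function names, taking the share over all reported results, and treating exactly 75% as a pass are choices made for this example. The procedure itself only states that 75% OK across the entire EGI is the threshold.

```python
from collections import Counter

OK_THRESHOLD = 0.75  # from step 4: 75% OK in total (entire EGI)


def ok_fraction(statuses):
    """Fraction of results whose status is "OK"; 0.0 for an empty input."""
    statuses = list(statuses)
    if not statuses:
        return 0.0
    return Counter(statuses)["OK"] / len(statuses)


def passes_threshold(statuses, threshold=OK_THRESHOLD):
    """True when the OK share reaches the threshold (assumed inclusive)."""
    return ok_fraction(statuses) >= threshold


# Hypothetical sample: 8 results, 6 of them OK (6 / 8 = 0.75).
sample = ["OK"] * 6 + ["CRITICAL", "WARNING"]
print(passes_threshold(sample))
```

In practice the status list would come from the monitoring results for the candidate test; how those results are collected is outside the scope of this sketch.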
<br>
= Revision history  =
{| class="wikitable"
|-
! Version
! Authors
! Date
! Comments
|-
|
| M. Krakowian
| 19 August 2014
| Change contact group -&gt; Operations support
|-
|
| Alessandro Paolini
| 2016-06-08
| "EGI Operations Support" was decommissioned, changed all the references to "Operations"
|-
|
| Alessandro Paolini
| 2019-02-25
| Updated some old acronyms; removed step 7, in which the ARGO team had to modify the related POEM profile; added the link to the operations portal in step 6
|-
|
| Alessandro Paolini
| 2019-12-17
| Updated steps 5 and 6, since the operations tests are now managed in POEM (ARGO_MON_OPERATORS) by the ARGO team and no longer through the operations dashboard
|}
[[Category:Operations_Procedures]]

Latest revision as of 09:43, 15 April 2022