Revision as of 12:35, 25 October 2012
Operations
EGI Operational Procedures are prescriptive documents that describe step-by-step processes involving several partners. The purpose of a procedure is to define the related workflow. Procedures are approved by the OMB and are periodically reviewed.
Number | Title | Comment | Area | Relevant to | Status |
PROC 01 | Grid Oversight Escalation Procedure | Operations ticket escalation | Ticket Management | Resource Centre Administrators, Operations Centres, COD | approved, October 26 2010 |
PROC 02 | Operations Centre Creation | Step-by-step instructions on how to create a new Operations Centre | Operations Centre Management | Operations Centres, COD | approved, August 17 2010 |
PROC 03 | Operations Centre decommissioning | Step-by-step instructions on how to decommission an Operations Centre | Operations Centre Management | Operations Centres, COD | approved, October 26 2010 |
PROC 04 | Quality verification of monthly availability and reliability statistics | Instructions for RODs and Operations Centres on how to handle justification for poor monthly performance through GGUS | Availability and Monitoring | Resource Centre Administrators, Operations Centres, COD | approved, August 17 2010 |
PROC 05 | Validation of an Operations Centre Nagios | This procedure is part of the Operations Centre creation procedure. | Availability and Monitoring | Operations Centres, COD | approved, August 17 2010 |
PROC 06 | Setting a Nagios test status to OPERATIONS | A Nagios probe is set to OPERATIONS when its results are used to generate notifications for the Operations Dashboard. This procedure details the steps to set a Nagios test to OPERATIONS. | Availability and Monitoring | Operations Centres, COD | approved, Nov 23 2010 |
PROC 07 | Adding new probes to SAM | Addition of new OPS Nagios probes to the SAM release. | Availability and Monitoring | Resource Centre Administrators, Operations Centres, COD | approved, Mar 28 2011 |
PROC 08 | Management of the EGI OPS Availability and Reliability Profile | Request for a change to the OPS EGI Availability and Reliability profile. The profile must be changed whenever a Nagios test is added to or removed from it, so that the test's results are included in or excluded from the monthly Availability and Reliability statistics. | Availability and Monitoring | Resource Centre Administrators, Operations Centres, COD | approved, Mar 28 2011 |
PROC 09 | Resource Centre Registration and Certification Procedure | Registration of a new Resource Centre in the GOCDB | Resource Centre Management | Resource Centre Administrator, Operations Centres | approved, May 17 2011 |
PROC 10 | Recomputation of monitoring results and availability statistics | Notification of problems with the monitoring results gathered by SAM, and requests for a recomputation of those results and the related availability and reliability statistics | Availability and Monitoring | Resource Centre Administrators, Operations Centres | approved, Oct 17 2011 |
PROC 11 | Resource Centre Decommissioning Procedure | Decommissioning of a Resource Centre before it is turned into CLOSED in GOCDB | Resource Centre Management | Resource Centre Administrator, Operations Centres | approved, Feb 28 2012 |
PROC 12 | Production Service Decommissioning Procedure | Decommissioning of an EGI production service | Resource Centre Management | Resource Centre Administrator, Operations Centres | approved, Feb 28 2012 |
PROC 13 | VO Deregistration Procedure | Decommissioning of a Virtual Organization supported by the European Grid Infrastructure | VO Management | VO Managers, Operations Manager | approved, Jul 17 2012 |
Security
Number | Title | Comment | Status | Area | Relevant to |
SEC 01 | EGI Security Incident Handling | The "Security Incident Handling Procedure" defines site and incident coordinator responsibilities when handling Grid-related security incidents. All EGI sites are required to follow this procedure to report and handle Grid-related security incidents. | approved, July 2010 (MS405) | Security | Resource Centres, EGI CSIRT |
SEC 02 | EGI Vulnerability issue handling process | The process used to report and resolve Grid Software vulnerabilities in the EGI Inspire project. | approved, July 2010 (MS405) | Security | Resource Centres, Risk Assessment Team, Technology Providers, SVG |
SEC 03 | Critical Vulnerability Operational Procedure | Once a problem has been assessed as critical and a solution is available, sites are required to take action. This document primarily defines the procedure from that point: how sites are asked to take action, and what steps follow if they do not respond or do not act. Failure to take action may lead to site suspension. | approved, March 15 2011 | Security | Resource Centres, Operations Centres, EGI-CSIRT, SVG |
EGI Policies and Procedures
See all EGI policies and procedures
Contacts
To report problems with this page or to suggest additions and improvements, please contact:
operational-documentation-manuals[at]mailman.egi.eu