Difference between revisions of "Operations and Operations Support"
Revision as of 18:08, 18 August 2014
Introduction
This page collects the internal materials that the EGI.eu Operations and EGI Operations Support teams need to carry out EGI infrastructure operations oversight activities.
Contact
EGI.eu Operations:
- GGUS Support Unit: Operation
- operations @ egi.eu
EGI Operations Support:
- GGUS Support Unit: EGI Operations Support
- operations-support @ mailman.egi.eu
Duties
Shifters work instructions
This section collects all work instructions. Each one specifies the exact steps to follow to carry out an activity.
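Action | Responsible | Procedure | Instructions and related pages |
---|---|---|---|
ROD certification | OS | PROC02 Operations Centre Creation (https://wiki.egi.eu/wiki/PROC02) | WI01 - New ROD team certification work instructions |
Creation of a new NGI | OS | PROC02 Operations Centre Creation (https://wiki.egi.eu/wiki/PROC02) | WI02 - New Operations Centre creation work instruction |
Operations Centre decommission | O | PROC03 Operations Centre decommissioning (https://wiki.egi.eu/wiki/PROC03) | |
Setting a Nagios test to an operations test | O | PROC06 Setting a Nagios test status to OPERATIONS (https://wiki.egi.eu/wiki/PROC06) | |
Operational portal dashboard issues | O | PROC01 EGI Infrastructure Oversight Escalation (https://wiki.egi.eu/wiki/PROC01) | WI05 - Escalation procedure in case of unresponsive NGI; WI06 - Tickets > 30 days |
Availability/reliability followup procedure | O | PROC04 Quality verification of monthly availability and reliability statistics (https://wiki.egi.eu/wiki/PROC04) | PROC10 Recomputation of monitoring results and availability statistics (https://wiki.egi.eu/wiki/PROC10); WI03 - Availability and reliability report work instruction; Underperforming sites and suspensions |
Unknown followup procedure | O | PROC04 Quality verification of monthly availability and reliability statistics (https://wiki.egi.eu/wiki/PROC04) | WI05 - Escalation procedure in case of unresponsive NGI; UNKNOWN issue; WI08 - Unknown report work instruction |
Top-level BDII followup procedure | O | PROC04 Quality verification of monthly availability and reliability statistics (https://wiki.egi.eu/wiki/PROC04) | WI05 - Escalation procedure in case of unresponsive NGI; WI04 - Core services report work instruction |
ROD performance index followup procedure | O | | WI05 - Escalation procedure in case of unresponsive NGI; WI07 - ROD Performance Index report work instruction; ROD performance index |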
Work Instructions
- WI01 - New ROD team certification work instructions
- WI02 - New Operations Centre creation work instruction
- WI03 - Availability and reliability report work instruction
- WI04 - Core services report work instruction
- WI05 - Escalation procedure in case of unresponsive NGI
- WI06 - Tickets > 30 days
- WI07 - ROD Performance Index report work instruction
- WI08 - Unknown report work instruction
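As an illustration of the kind of check behind WI06 (Tickets > 30 days), the following is a minimal sketch that selects tickets opened more than 30 days ago. The `Ticket` record, its fields, and the `older_than` helper are hypothetical and are not taken from GGUS or from the work instruction itself; the actual steps are those described in WI06.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Ticket:
    # Hypothetical record for illustration; real tickets carry many more fields.
    ticket_id: int
    ngi: str
    opened: date


def older_than(tickets, today, days=30):
    """Return tickets opened more than `days` days before `today`, oldest first."""
    cutoff = today - timedelta(days=days)
    stale = [t for t in tickets if t.opened < cutoff]
    return sorted(stale, key=lambda t: t.opened)


# Made-up sample data, evaluated against the date of this page revision.
tickets = [
    Ticket(101, "NGI_A", date(2014, 6, 30)),
    Ticket(102, "NGI_B", date(2014, 8, 10)),
    Ticket(103, "NGI_A", date(2014, 7, 18)),
]
stale = older_than(tickets, today=date(2014, 8, 18))
print([t.ticket_id for t in stale])
```

The threshold is a parameter so the same helper could be reused if a different follow-up window were needed.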
Events
- EGI indico page with COD meeting agendas.
- All open actions are listed under COD actions
Resources
- Document server: ROD newsletter
- Document server: Operations Support Metrics
- Operations Procedures
- YouTube channel
- Mailing lists for each ROD
- Knowledge database
Oct 2011 to date
- Please provide a link here
Definition of Operations Support metrics
May 2010-Sep 2011
- Operations Support metrics
Until April 2010
- EGEE-III Operations Support metrics
Nagios tests
- Operations tests list: list of Nagios probes that generate alarms displayed in the Operations Dashboard
- Availability and reliability tests list: list of Nagios probes whose results are used for Availability and Reliability computation
OTAG topics
Operational Portal: Dashboard
Pages in draft state