Difference between revisions of "EGI Infrastructure operations oversight"
Line 55:
| '''GGUS tickets assigned to COD'''
|
The COD shifter is obliged to check the current status of all '''GGUS tickets assigned to COD'''
* see [http://tinyurl.com/2ws735h Link to all GGUS tickets assigned to COD]
* If a ticket is waiting for COD action, the shifter should perform that action
In case of a request for:
* '''ROD certification'''
** see [[Procedure_to_handle_new_ROD_certification_GGUS_tickets | New ROD team certification work instructions]]
* '''Creation of a new NGI'''
** see [[Operations:NewNGIs_creation | Creation of a new NGI process coordination]]
** Where COD is also the Integration Process Coordinator, COD is responsible for the whole procedure.
* '''Operations Centre decommission'''
** see [[Operations:Operations_Centre_decommission|Operations Centre decommission process coordination]]
** COD validates the request and removes the ROD information from the all-operators mailing list
* '''Setting a Nagios test to an operations test'''
** see [[Operations:Procedure_for_setting_Nagios_test_an_operations_test| Procedure for setting a Nagios test to an operations test]]
** COD is responsible for coordinating the whole process.
If the shifter does not know what kind of action should be taken, he/she should contact the COD managers
|
|-
Line 81: | Line 85:
|-
| '''Operational portal dashboard issues'''
|
*[https://operations-portal.in2p3.fr/dashboard/ccodView COD dashboard link]
*[[Operations:Work_instruction_for_escalating_operational_problems_with_ROD | Escalation for operational problems with ROD - work instruction]]
|
Revision as of 15:05, 7 January 2011
EGI.eu Operations Oversight Pages
EGI Grid Operations oversight of the e-Infrastructure is a co-ordination task that ensures Grid monitoring across EGI runs smoothly. The team liaises among three groups: Operations and e-Infrastructure Oversight (OE); Operational Documentation (OD); and "Coordination of interoperations between NGIs and with other Grids".
The Operations oversight team works with the tool developers (particularly the OTAG group), the NGIs and their Operations Teams (ROD). Regular phone meetings are held for the co-ordinators and others working on these tasks. The OE co-ordinators also organise face-to-face meetings for the ROD teams 3 to 4 times a year.
- Co-ordinators:
  - Ron Trompert (Chair), Marcin Radecki, Luuk Uljee
- Deputy:
  - Malgorzata Krakowian
- Contact:
  - There are 3 mailing lists, each used for a different purpose:
    - manager-central-operator-on-duty AT mailman.egi.eu - for COD managerial issues, such as suggesting changes to procedures or tools. COD managers are the recipients of this list.
    - central-operator-on-duty AT mailman.egi.eu - for reporting COD day-to-day issues, such as problems with tools or Nagios tests. COD shifters are the recipients of this list.
    - all-central-operator-on-duty AT mailman.egi.eu - for contacting all ROD teams in the NGIs. Each ROD team is a recipient of this list.
COD official web pages
Internal area
Procedures used in COD activity
This section collects all procedures in force for COD.
- COD Operational Procedures - old EGEE wiki page with the most recent version of the procedure
- New NGI creation process coordination
- Availability and reliability monthly statistics procedure
- Operations Centre decommission process coordination
- Procedure for setting a Nagios test to an operations test
- COD escalation procedure
COD shifters daily work instructions
This section collects the work instructions, which specify exactly which steps to follow to carry out each activity.
Action | Description | |
---|---|---|
GGUS tickets assigned to COD | COD shifter is obliged to check the current status of all GGUS tickets assigned to COD | |
Availability/reliability reports | | |
Operational portal dashboard issues | | |
Handover | | |
NOTE: all procedures should contain the following template: https://wiki.egi.eu/wiki/PDT:Procedure_Template