
Difference between revisions of "Operations and Operations Support"

Line 130:
  | '''ROD performance index followup procedure'''
- *[[Grid operations oversight/WI06|WI07 - Top-BDII report work instruction]]
+ *[[Grid operations oversight/WI06|WI07 - ROD Performance Index report work instruction]]
  *[[Grid operations oversight/ROD performance index|ROD performance index]]

Revision as of 17:54, 27 November 2012



The COD (Central Operator on Duty) team is a small team that coordinates the RODs (Regional Operators on Duty) at the global level. COD represents the whole ROD structure, both on technical requirements for operations tools and at the political level.

This page collects all the materials the COD team needs to perform Grid operations oversight activities.

People and contact

The COD team is formed from the Dutch and Polish teams and includes COD managers (responsible for managerial issues) and COD shifters (who perform the day-to-day COD work).

COD managers: 
Ron Trompert (Chair), Marcin Radecki, Luuk Uljee, Tadeusz Szymocha, Magda Szopa
COD shifters: 
Tadeusz Szymocha, Magda Szopa, Ron Trompert, Luuk Uljee, Maarten van Ingen, Ernst Pijper, Alexander Verkooijen

People behind the names

Two mailing lists are used, each for a different purpose:

  • manager-central-operator-on-duty AT - for COD managerial issues, such as suggesting changes to procedures or tools. COD managers are the recipients of this list.
  • central-operator-on-duty AT - for reporting day-to-day COD issues, such as problems with tools or Nagios tests. COD shifters are the recipients of this list.

COD Duties

  • COD managers
    • representing RODs/COD in OTAG, OMB and Operations meetings - collecting requirements and improvement proposals from RODs concerning operations tools and procedures
    • suspending Resource Centres in case of operational issues
    • taking part in the OLA task force
    • writing new procedures - when needed, COD takes part in the procedure creation process
    • preparing ROD newsletters - informing RODs about recent and upcoming developments related to Grid Oversight
    • preparing ROD metrics reports - providing an overview of the operations support process in the grid infrastructure
  • COD shifters
    • escalation of operational problems with RODs
    • dealing with GGUS tickets assigned to COD
    • process coordination of:
      • creation and decommissioning of an Operations Centre
      • setting a Nagios test as an operations test
      • getting explanations for low availability and reliability metrics

COD shifters work instructions

This section collects all work instructions. Each one specifies exactly which steps to follow to carry out an activity.

Action Description Related procedures
GGUS tickets assigned to COD

The COD shifter must check the current status of all GGUS tickets assigned to COD.

In case of a request for:

If the shifter does not know which action to take, they should contact the COD managers.

Operational portal dashboard issues
  • COD dashboard link
  • At the end of the shift, a handover should be submitted (sent to COD) via the Handover tool in the Operational Portal
    • Problems on the dashboard that carry over to the next week: the GGUS ID of the ticket and when the next escalation step should be taken
    • GGUS tickets assigned to COD: for each ticket its last status and the action taken by the shifter should be provided
    • Other issues: problems with tools etc.

Availability/reliability followup procedure
Unknown followup procedure
Top-level BDII followup procedure
ROD performance index followup procedure

Work Instructions



Oct 2011 to date

  • Please provide a link here

Definition of Operations Support metrics

May 2010-Sep 2011

Until April 2010

  • EGEE-III Operations Support metrics

Nagios tests

OTAG topics

Operational Portal: Dashboard


Pages in draft state