ROD Duties

From EGIWiki
Revision as of 15:43, 15 June 2011 by Pslizik (talk | contribs) (Created by moving material from Operations/ROD/Draft)

All duties listed in this section are mandatory for the ROD team. If the NGI has no dedicated 1st Line Support team, the ROD must absorb the duties of that team.

A new ROD member must follow the procedure for joining ROD teams.

Handling tickets

The main responsibility of ROD is to deal with tickets for sites in the region. This includes making sure that the tickets are opened and handled properly. The procedure for handling tickets is described in section Handling tickets.

Putting a site in downtime for urgent matters

In general, ROD can place a site or a service endpoint (i.e., a host) in downtime in the GOCDB either when the site requests it or when ROD sees an urgent need to do so.

Under exceptional circumstances, ROD may also suspend a site without going through all the steps of the escalation procedure. For example, if a security hazard occurs, ROD must suspend the site on the spot. Note that COD can also suspend a site in an emergency, e.g. security incidents or lack of response.

In both scenarios, it is important that communication channels between all parties involved are active.

Notify COD and EGI CSIRT about urgent matters

ROD should create tickets to COD in the case of urgent matters. For security-related issues, 1st Line Support and/or ROD should also notify the CSIRT duty contact.
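The notification rule above can be sketched as a small routing function. This is only an illustration: the function name, the flags, and the party labels are hypothetical, and it assumes that a security-related issue always counts as an urgent matter (suggested by the word "also", but not stated outright).

```python
from typing import List


def parties_to_notify(urgent: bool, security_related: bool) -> List[str]:
    """Return who should be contacted for an issue (illustrative names only)."""
    parties = []
    # Urgent matters get a ticket to COD. Assumption: a security-related
    # issue is always treated as urgent.
    if urgent or security_related:
        parties.append("COD")
    # Security-related issues additionally go to the CSIRT duty contact.
    if security_related:
        parties.append("CSIRT duty contact")
    return parties
```

For instance, `parties_to_notify(urgent=True, security_related=False)` yields only COD, while a security-related issue yields both COD and the CSIRT duty contact.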

Summary of ROD duties

Duties of ROD | Requirements
Receive incident notifications from sites in the scope | Mandatory (if not handled by 1st Line Support)
Handle incidents less than 24h old | Mandatory (if not handled by 1st Line Support)
Create tickets for alarms that are older than 24h and not in an OK state | Mandatory
Escalate tickets to COD if necessary (assignment to COD can be made directly through the dashboard) | Mandatory
Propagate actions from COD down to sites | Mandatory
Monitor and update any GGUS tickets up to the “solved” status (via the Dashboard) | Mandatory
Close alarms for “solved problems” | Mandatory
Handle the final state of GGUS tickets not opened from the operations portal by marking them as verified | Mandatory
Put the site in downtime for urgent matters | Optional
Create tickets to COD for urgent matters | Mandatory

(Definitions in the “Requirements” column: Mandatory – must be covered by either 1st Line Support or the ROD team, Optional – the federation decides how to implement this.)
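The 24-hour rule in the summary table can be illustrated with a short triage sketch. It is not part of any EGI tool: the names `Action` and `triage_alarm` and the status string `"OK"` are hypothetical, and treating an alarm of exactly 24h as an incident is an assumption, since the table does not define the boundary.

```python
from datetime import datetime, timedelta
from enum import Enum

# Threshold taken from the summary table: alarms older than 24h need a ticket.
TICKET_THRESHOLD = timedelta(hours=24)


class Action(Enum):
    NO_TICKET = "no ticket needed"
    HANDLE_INCIDENT = "handle as incident (1st Line Support or ROD)"
    CREATE_TICKET = "create ticket (ROD)"


def triage_alarm(raised_at: datetime, status: str, now: datetime) -> Action:
    """Decide what an alarm requires, based on its age and state."""
    # Alarms in an OK state do not trigger a ticket under the table's rule.
    if status == "OK":
        return Action.NO_TICKET
    age = now - raised_at
    # Strictly older than 24h and not OK: a ticket must be created.
    if age > TICKET_THRESHOLD:
        return Action.CREATE_TICKET
    # Otherwise it is still an incident (assumption: exactly 24h lands here).
    return Action.HANDLE_INCIDENT
```

Under these assumptions, a non-OK alarm raised 30 hours ago maps to `Action.CREATE_TICKET`, while one raised 3 hours ago maps to `Action.HANDLE_INCIDENT`.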