WI03 RC and RP OLA violation report followup
Availability and reliability work instruction for COD
This page describes the steps a COD shifter should take to follow up on availability/reliability issues.
- Receiver: Site
- Subject: Availability under target for last 3 months
- Threshold: 70%
- Goal: We expect to see improvement
- Deadline for answers: 10 days
- No response: site suspension
- Ticket is submitted by Georgios Kaklamanos or George Fergadis.
- Add ticket URL to Monthly actions
- Add ticket URL to Underperforming sites and suspensions
Submit child tickets to NGIs
- Go to Dropbox - COD - TicketCreator - AvaRel report
- Prepare the input file EGI_sus.csv based on the records marked in red in the source PDF. Input file syntax:
NGI;Site;Availability;Reliability;
Make sure NGIs are named according to the table below.
- Run ticket creator:
perl start-suspend.pl ticket_number 'date, e.g. Sep 2012' "EGI_sus.csv"
More info about Ticket generator
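The input file described above could be written by a small helper instead of by hand. The following is only a sketch: the function name `format_suspension_rows` and the `ggus_names` mapping are hypothetical, and it assumes that the file has no header row and that every record ends with a trailing semicolon, as in the syntax line above. Check both assumptions against the ticket generator before relying on it.

```python
def format_suspension_rows(records, ggus_names=None):
    """Build the text of EGI_sus.csv from (NGI, site, availability, reliability) tuples.

    ggus_names is an optional, hypothetical mapping from the NGI name shown in
    the report to the name used in GGUS; unknown names are passed through.
    """
    ggus_names = ggus_names or {}
    lines = []
    for ngi, site, availability, reliability in records:
        ngi = ggus_names.get(ngi, ngi)
        # Assumed layout: NGI;Site;Availability;Reliability; (trailing semicolon kept)
        lines.append(f"{ngi};{site};{availability};{reliability};")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Placeholder values for illustration only, not real report data.
    rows = [("NGI_X", "SITE-A", 45, 50)]
    with open("EGI_sus.csv", "w", encoding="utf-8") as handle:
        handle.write(format_suspension_rows(rows))
```

Keeping the formatting in one function makes it easy to apply the GGUS naming in a single place before the ticket creator is run.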
Handling the child tickets
- NGIs that replied within 10 days: check the explanation. If it is unclear whether the site should be suspended, discuss the case with the COO by submitting a ticket to them.
- If, 3 days after receiving the explanation from the NGI, availability shows no improvement (it is still <70%), COD should suspend the site. Inform the NGI and the site about the suspension.
- If COD agrees that the site should not be suspended (for example, availability has risen above 70%, or there is another important reason such as an NGI SAM problem), the site can be left certified.
- NGIs that did not reply: suspend the site after 10 days. Inform the NGI and the site about the suspension.
- Prepare summary report and place it in the parent ticket.
- Update Underperforming_sites_and_suspensions and List of sites for which the availability followup procedures were not applicable
The whole process should be completed by the end of the month.
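The handling rules for child tickets can be summarised as a small decision function. This is a sketch, not part of the official procedure: the function and parameter names are hypothetical, days are counted as calendar days, and an availability of exactly 70% is treated as meeting the threshold, which the text above does not specify. Cases where the shifter is uncertain still go to the COO and are not modelled here.

```python
from datetime import date

THRESHOLD = 70.0          # availability threshold in percent
REPLY_DEADLINE_DAYS = 10  # deadline for the NGI to answer
GRACE_DAYS = 3            # time allowed for improvement after the explanation


def next_action(ticket_opened, today, reply_date=None,
                availability=None, accepted_reason=False):
    """Return 'suspend', 'keep certified' or 'wait' for one child ticket."""
    if reply_date is None:
        # No reply: suspend once the 10-day deadline has passed.
        if (today - ticket_opened).days > REPLY_DEADLINE_DAYS:
            return "suspend"
        return "wait"
    if accepted_reason:
        # COD accepted the justification (e.g. an NGI SAM problem).
        return "keep certified"
    if availability is not None and availability >= THRESHOLD:
        return "keep certified"
    # Still below threshold: suspend 3 days after the explanation arrived.
    if availability is not None and (today - reply_date).days >= GRACE_DAYS:
        return "suspend"
    return "wait"


if __name__ == "__main__":
    opened = date(2012, 9, 1)
    print(next_action(opened, date(2012, 9, 12)))
    print(next_action(opened, date(2012, 9, 10),
                      reply_date=date(2012, 9, 6), availability=55.0))
```

Whatever the function returns, the NGI and the site still have to be informed of a suspension, and the outcome recorded in the parent ticket.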
Naming the NGIs
In the grid view, NGIs/ROCs are named differently than in GGUS. Change each NGI/ROC name to the one used in GGUS.
Subject: $SU/$siteName site suspension
Dear $SU,
According to the recent availability/reliability report, $siteName has performed below the target (Ava. 50% or Rel. 50%) in three consecutive months.
More details: [[Availability_and_reliability_monthly_statistics]].
According to the procedures approved at OMB 17.08, the site will be suspended within 10 working days unless the NGI intervenes.
If you think that the site should not be suspended, please provide justification within 10 working days.
Best Regards,
EGI Central Operator on Duty
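The `$SU` and `$siteName` placeholders in the template above can be filled with Python's standard `string.Template`, which uses the same `$name` syntax. This is only an illustration of the substitution step; it is not how start-suspend.pl works internally, and the function name `render_ticket` is hypothetical. The template text is based on the ticket text above.

```python
from string import Template

SUBJECT = Template("$SU/$siteName site suspension")
BODY = Template(
    "Dear $SU,\n"
    "According to the recent availability/reliability report, $siteName has "
    "performed below the target (Ava. 50% or Rel. 50%) in three consecutive months.\n"
    "More details: [[Availability_and_reliability_monthly_statistics]].\n"
    "According to the procedures approved at OMB 17.08, the site will be "
    "suspended within 10 working days unless the NGI intervenes.\n"
    "If you think that the site should not be suspended, please provide "
    "justification within 10 working days.\n"
    "Best Regards,\n"
    "EGI Central Operator on Duty\n"
)


def render_ticket(support_unit, site_name):
    """Return (subject, body) with both placeholders substituted."""
    values = {"SU": support_unit, "siteName": site_name}
    # substitute() raises KeyError if a placeholder has no value,
    # so a half-filled ticket is never produced silently.
    return SUBJECT.substitute(values), BODY.substitute(values)


if __name__ == "__main__":
    # Placeholder names for illustration only.
    subject, body = render_ticket("NGI_X", "SITE-A")
    print(subject)
    print(body)
```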
More info: Ticket generator for A/R (Grid_operations_oversight/WI03/TG-AR)