The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other supports; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team using operations @ egi.eu.

Difference between revisions of "WI03 RC and RP OLA violation report followup"

Revision as of 12:32, 23 July 2010

Internal procedure for COD

This page describes the steps a COD shifter should take to follow up on availability/reliability issues.


When a GGUS ticket about availability/reliability metrics is assigned to COD:

  1. add the ticket URL to the "Availability and reliability internal procedure for COD tickets" page
  2. review the Ava/Rel report and prepare the following lists:
    1. CASE 1: sites for suspension (look at the two previous months in the A/R report and the current one; if all three are below 50%, the site qualifies for suspension)
    2. CASE 2: sites to be asked for an explanation (below 75% for reliability and 70% for availability)
  3. generate child tickets for both lists
    1. for CASE 1: when an explanation has been provided and is found satisfactory, set the child ticket to 'verified' status
  4. when the deadline (7 working days) has expired:
    1. suspend in GOC DB the sites that qualified for suspension
    2. prepare a summary report of the explanations (it should be placed in the parent ticket):
      1. sites which are not responsive
      2. sites which provided an unsatisfactory explanation
      3. ROCs/NGIs which are not responsive
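
The review in step 2 can be sketched in Python as follows. This is only an illustration, not an official COD tool: the names `MonthlyFigures` and `classify_site` are hypothetical. The sketch assumes that a month counts as "below 50%" when either availability or reliability is below 50%, and that a site falls into CASE 2 when either metric of the current month is below its target.

```python
from dataclasses import dataclass

# Thresholds taken from the procedure (percent).
SUSPENSION_THRESHOLD = 50.0
AVAILABILITY_TARGET = 70.0
RELIABILITY_TARGET = 75.0


@dataclass
class MonthlyFigures:
    """Availability and reliability of one site for one month (hypothetical type)."""
    availability: float
    reliability: float


def below_suspension(month):
    # Assumption: either metric below 50% counts as a "below 50%" month.
    return (month.availability < SUSPENSION_THRESHOLD
            or month.reliability < SUSPENSION_THRESHOLD)


def classify_site(history):
    """Classify a site from its monthly figures, oldest first, current month last.

    Returns "CASE 1" (suspension), "CASE 2" (explanation) or None.
    """
    last_three = history[-3:]
    current = history[-1]
    # CASE 1 needs three consecutive months below the suspension threshold.
    if len(last_three) == 3 and all(below_suspension(m) for m in last_three):
        return "CASE 1"
    # Assumption: either metric below its target triggers CASE 2.
    if (current.availability < AVAILABILITY_TARGET
            or current.reliability < RELIABILITY_TARGET):
        return "CASE 2"
    return None
```

A site that matches CASE 1 is not also listed under CASE 2 in this sketch, so each site gets at most one child ticket.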


Tickets content

Request for explanation

Subject: $SU/$siteName - availability/reliability statistics for $date

Dear $SU,

According to the recent availability/reliability report, $siteName has shown
poor performance: Ava. $availability, Rel. $reliability.
More details: https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics.

Could you please provide an explanation for the poor performance of the $siteName site?

Your explanation must be returned within 7 working days from when the ticket is created.
If the explanation is not given in due time, or the explanation is found inadequate,
the EGI Chief Operations Officer can decide within 3 working days after the deadline
to suspend the site.

Best Regards,
EGI Central Operator on Duty
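
The deadlines named in this message (7 working days for the explanation, then 3 working days for the decision) can be computed with a small helper. This is a minimal sketch: `add_working_days` is a hypothetical name, and it assumes that working days are Monday to Friday and ignores public holidays.

```python
from datetime import date, timedelta


def add_working_days(start, days):
    """Return the date that lies `days` working days after `start`.

    Assumption: working days are Monday to Friday; holidays are not considered.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


# Example: ticket created on Friday 23 July 2010.
explanation_deadline = add_working_days(date(2010, 7, 23), 7)
decision_deadline = add_working_days(explanation_deadline, 3)
```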

Site for suspension

Subject: $SU/$siteName site suspension

Dear $SU,

According to the recent availability/reliability report, $siteName has performed
below the target of Ava. 50% or Rel. 50% in three consecutive months.
More details: https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics.

According to the procedures, the site will be suspended within 7 working days unless the NGI
intervenes. The site will not be suspended only if both the COD and the COO agree with the
reasoning provided by the NGI.


Best Regards,
EGI Central Operator on Duty
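
The `$SU`, `$siteName` and `$date` placeholders in the subject lines above happen to match the syntax of Python's `string.Template`, so filling them in can be sketched as below. The function name `render_subject` and the example values are hypothetical, and the source does not say which tool was actually used to generate the tickets.

```python
from string import Template

# Subject lines of the two ticket types, with the placeholders used above.
SUBJECTS = {
    "explanation": Template(
        "$SU/$siteName - availability/reliability statistics for $date"
    ),
    "suspension": Template("$SU/$siteName site suspension"),
}


def render_subject(kind, su, site_name, date=""):
    """Fill in the subject line for a child ticket of the given kind."""
    # substitute() ignores unused keyword arguments, so `date` is harmless
    # for the suspension subject, which has no $date placeholder.
    return SUBJECTS[kind].substitute(SU=su, siteName=site_name, date=date)


# Hypothetical support unit and site name, for illustration only.
subject = render_subject("explanation", "NGI_XX", "EXAMPLE-SITE", "June 2010")
```

The message bodies could be handled the same way, with one `Template` per ticket type.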