The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations @ egi.eu.

Difference between revisions of "WI03 RC and RP OLA violation report followup"

From EGIWiki

Revision as of 11:59, 31 August 2010

Internal procedure for COD

This page describes the steps a COD shifter should take to follow up on availability/reliability issues.


When a GGUS ticket about availability/reliability metrics is assigned to COD:


Step 1. Add the ticket URL to the Availability and reliability internal procedure for COD tickets page.
Step 2. Ava/Rel report review:
  2.1. Prepare the 'sites for suspension' list: look at the current month and the two previous months in the AR report. If all three are below 50%, the site qualifies for suspension.
  2.2. Prepare the 'sites to be asked for explanation' list: look at the current month in the AR report. If Ava. is below 70% or Rel. is below 75%, the site qualifies to be asked for an explanation. Prepare this list according to the input file requirements of the ticket generator.
Step 3. Create a ticket for each case as a child of the ticket assigned to COD:
  3.1. For the 'sites for suspension' list, use the ticket generator.
  3.2. For the 'sites to be asked for explanation' list, use the ticket generator.
Step 4 (within 7 working days from when the tickets are created). When an explanation is provided and found satisfactory, set the child ticket to 'verified' status.
Step 5 (after 7 working days from when the tickets are created). Final action:
  5.1. Close all open tickets.
  5.2. Suspend in GOC DB the sites from the 'sites for suspension' list that qualify for suspension.
  5.3. Prepare a summary report of explanations (to be placed in the parent ticket), listing:
    1. sites which are not responsive
    2. sites which provided an unsatisfactory explanation
    3. ROCs/NGIs which are not responsive
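
The two selection rules from the Ava/Rel report review can be sketched in code. This is only an illustrative Python sketch, not part of the ticket generator: the function names are hypothetical, and treating a month as "below 50%" when either availability or reliability is under 50% is an assumption based on the wording of the suspension ticket ("below target Ava. 50% or Rel. 50%").

```python
from typing import List, Tuple

# Thresholds in percent, taken from the procedure text.
SUSPENSION_THRESHOLD = 50.0
AVAILABILITY_TARGET = 70.0
RELIABILITY_TARGET = 75.0


def qualifies_for_explanation(availability: float, reliability: float) -> bool:
    """Current month only: Ava. below 70% or Rel. below 75%."""
    return availability < AVAILABILITY_TARGET or reliability < RELIABILITY_TARGET


def qualifies_for_suspension(months: List[Tuple[float, float]]) -> bool:
    """Expects (availability, reliability) for the current month and the
    two previous ones. ASSUMPTION: a month counts as failing when either
    metric is below 50%."""
    if len(months) != 3:
        raise ValueError("exactly three months of data are required")
    return all(
        ava < SUSPENSION_THRESHOLD or rel < SUSPENSION_THRESHOLD
        for ava, rel in months
    )
```

A site that fails only the current-month targets would go on the explanation list, while a site that fails in all three months would go on the suspension list.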

Questions/issues

MR: what do we do with sites marked with "n/a"?

Tickets content

Request for explanation

Subject:$SU/$siteName - availability/reliability statistics for $date

Dear $SU,

According to recent availability/reliability report $siteName has achieved
poor performance Ava. $availability Rel. $reliability.
More details: https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics.

Could you please provide explanations for poor performance of the $siteName site?

Your explanation must be returned within 7 working days from when the ticket is created.
If the explanation is not given in due time, or the explanation is found inadequate,
the EGI Chief Operations Officer can decide within 3 working days after the deadline
to suspend the site.

If the site was certified during the last month, please close this ticket and
put this information in the ticket solution field. There is a known bug in the
report generation tool that is being worked on.


Best Regards,
EGI Central Operator on Duty

Site for suspension

Subject:$SU/$siteName site suspension

Dear $SU,

According to recent availability/reliability report $siteName has achieved
poor performance below target Ava. 50% or Rel. 50% in three consecutive months.
More details: https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics.

According to procedures approved on OMB 17.08, the site will be suspended within 10 working days unless the NGI intervenes.
If you think that the site should not be suspended, please provide justification within 7 working days.

Best Regards,
EGI Central Operator on Duty
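
The $SU, $siteName and $date placeholders in the templates above are filled in per site. How the Perl scripts do this is not shown on this page; the following is a minimal Python sketch of the same idea, using the standard library's string.Template, which happens to use the same $name placeholder syntax. The function name render_subject is hypothetical.

```python
from string import Template

# Subject lines copied from the two ticket templates.
EXPLANATION_SUBJECT = Template(
    "$SU/$siteName - availability/reliability statistics for $date"
)
SUSPENSION_SUBJECT = Template("$SU/$siteName site suspension")


def render_subject(template: Template, **fields: str) -> str:
    """Fill in the placeholders; substitute() raises KeyError when a
    placeholder has no value, so incomplete tickets are caught early."""
    return template.substitute(**fields)


subject = render_subject(
    EXPLANATION_SUBJECT, SU="NGI_PL", siteName="CYFRONET_LCG2", date="May 2010"
)
print(subject)
```

Using substitute() rather than safe_substitute() is a deliberate choice in this sketch: a ticket with an unfilled placeholder should fail loudly instead of being sent.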

How to use ticket generator

Current version of the script: 2.0

  • Configure the script.

At the beginning of the start-explanations.pl or start-suspend.pl file, fill in the following variables:

# PRODUCTION
my $endpoint = "https://gusiwr.fzk.de/arsys/services/ARService?server=gusiwr&webService=Grid_HelpDesk";
my $user = ""; # login to GGUS web-services
my $pass = ""; # password to GGUS web-services

# Submitter data: used as the submitter's details when creating tickets
my $Mail = ""; # your email address
my $DN = "";   # your DN
my $Name = ""; # Name and Surname


  • Prepare input file.

The input is a plain-text file. For both scripts, each line has the following format:

ROC/NGI support unit in GGUS; Site name; Availability; Reliability;

Each line must describe exactly one site and must always contain 4 semicolons. For the start-suspend.pl script the Availability and Reliability values are omitted, but the semicolons are still required.

example:

NGI_PL; CYFRONET_LCG2; 50%; 10%;
NGI_PL; IFJ-PAN; 15%; 3%;
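
Before running the scripts, the input file can be checked against the format rules above. The following Python sketch is a hypothetical pre-check, not part of the ticket generator; the names SiteRecord and parse_line are illustrative.

```python
from typing import NamedTuple


class SiteRecord(NamedTuple):
    support_unit: str
    site_name: str
    availability: str  # may be empty for start-suspend.pl input
    reliability: str   # may be empty for start-suspend.pl input


def parse_line(line: str) -> SiteRecord:
    """Parse one 'SU; Site; Ava; Rel;' line, enforcing exactly 4 semicolons."""
    line = line.strip()
    if line.count(";") != 4:
        raise ValueError(f"expected 4 semicolons: {line!r}")
    # Four semicolons give five parts; the part after the last one must be empty.
    su, site, ava, rel, trailing = (part.strip() for part in line.split(";"))
    if trailing:
        raise ValueError(f"unexpected text after the last semicolon: {line!r}")
    if not su or not site:
        raise ValueError(f"support unit and site name are required: {line!r}")
    return SiteRecord(su, site, ava, rel)


print(parse_line("NGI_PL; CYFRONET_LCG2; 50%; 10%;"))
print(parse_line("NGI_PL; IFJ-PAN; ; ;"))  # start-suspend.pl style line
```

The second call shows a line with the Availability and Reliability values omitted but all four semicolons kept.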

  • Execute the tool.

Log in to a machine with Perl installed and execute the script as follows:

perl start-explanations.pl PARENT_TICKET_ID "DATE" FILE_NAME
perl start-suspend.pl PARENT_TICKET_ID "DATE" FILE_NAME

PARENT_TICKET_ID - number of the "Availability/reliability statistics for *" ticket

DATE - date of the report. Format: "month year"

FILE_NAME - file with input availability/reliability data

example:

  perl start-explanations.pl 4121 "May 2010" dane.txt
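
The three arguments can be sanity-checked before the script is run. The following Python sketch is a hypothetical helper, not part of the ticket generator. It assumes that the ticket ID is purely numeric and that the month is written as a full English month name, as in the example above; %B is locale-dependent, so the check assumes an English or C locale.

```python
from datetime import datetime
from typing import List


def check_arguments(parent_ticket_id: str, report_date: str, file_name: str) -> List[str]:
    """Return a list of problems found; an empty list means the arguments look fine."""
    problems = []
    if not parent_ticket_id.isdigit():
        problems.append("PARENT_TICKET_ID should be a ticket number")
    try:
        # "month year", e.g. "May 2010"
        datetime.strptime(report_date, "%B %Y")
    except ValueError:
        problems.append('DATE should have the format "month year"')
    if not file_name:
        problems.append("FILE_NAME must not be empty")
    return problems


print(check_arguments("4121", "May 2010", "dane.txt"))
```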