https://wiki.egi.eu/w/api.php?action=feedcontributions&user=Fergadis&feedformat=atomEGIWiki - User contributions [en]2024-03-29T16:01:38ZUser contributionsMediaWiki 1.37.1https://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=58701Resource Centres OLA and Resource infrastructure Provider OLA reports2013-08-05T08:31:57Z<p>Fergadis: /* Top-BDII&nbsp;Availability and Reliability */</p>
<hr />
<div>{{Template:Op menubar}} {{TOC_right}} <br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] portal. <br> <br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs). <br />
<br />
[[SAM Tests|SAM metric]] results are used for the calculation of Availability/Reliability. <br />
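As a rough illustration of how such figures could be derived from monitoring results, the sketch below computes both parameters from the hours a service spent in each state. The formulas are an assumption based on the commonly used convention (availability excludes time in an unknown state; reliability additionally excludes scheduled downtime); the linked definition document is authoritative. The <code>StatusHours</code> structure and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StatusHours:
    """Hours a service spent in each state during one reporting period.

    Hypothetical structure for illustration; not part of SAM or MyEGI.
    """
    up: float
    down: float
    scheduled_downtime: float
    unknown: float

    @property
    def total(self) -> float:
        return self.up + self.down + self.scheduled_downtime + self.unknown


def availability(h: StatusHours) -> float:
    """Assumed definition: up time divided by time in a known state."""
    known = h.total - h.unknown
    return h.up / known if known > 0 else 0.0


def reliability(h: StatusHours) -> float:
    """Assumed definition: as availability, but scheduled downtime is also excluded."""
    known_unscheduled = h.total - h.unknown - h.scheduled_downtime
    return h.up / known_unscheduled if known_unscheduled > 0 else 0.0


# Invented sample: a 30-day month (720 hours) with 12 hours of scheduled downtime.
month = StatusHours(up=684.0, down=24.0, scheduled_downtime=12.0, unknown=0.0)
print(f"availability = {availability(month):.1%}")
print(f"reliability  = {reliability(month):.1%}")
```

Under these assumed definitions reliability can never be lower than availability, because its denominator is the same or smaller.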
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics. <br />
<br />
<br> <br />
<br />
'''Availability and Reliability reports are overseen according to procedure [[PROC04|''PROC04 Quality verification of monthly availability and reliability statistics'']].''' <br />
<br />
= Performance reports =<br />
<br />
== Resource Centres ==<br />
<br />
=== RC&nbsp;Availability and Reliability ===<br />
<br />
[https://documents.egi.eu/document/1622 January 2008 - April 2010 ] (EGEE) <br />
<br />
{| cellspacing="0" cellpadding="5" border="1" class="wikitable"<br />
|-<br />
! Availability/Reliability <br />
! Jan <br />
! Feb <br />
! Mar <br />
! Apr <br />
! May <br />
! Jun <br />
! Jul <br />
! Aug <br />
! Sep <br />
! Oct <br />
! Nov <br />
! Dec<br />
|-<br />
! 2010 <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/document/42 05/10] <br />
| [https://documents.egi.eu/document/96 06/10] <br />
| [https://documents.egi.eu/document/130 07/10] <br />
| [https://documents.egi.eu/document/157 08/10] <br />
| [https://documents.egi.eu/document/219 09/10] <br />
| [https://documents.egi.eu/document/238 10/10] <br />
| [https://documents.egi.eu/document/266 11/10] <br />
| [https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011 <br />
| [https://documents.egi.eu/document/332 01/11] <br />
| [https://documents.egi.eu/document/402 02/11] <br />
| [https://documents.egi.eu/document/465 03/11] <br />
| [https://documents.egi.eu/document/508 04/11] <br />
| [https://documents.egi.eu/document/593 05/11] <br />
| [https://documents.egi.eu/document/648 06/11] <br />
| [https://documents.egi.eu/document/716 07/11] <br />
| [https://documents.egi.eu/document/783 08/11] <br />
| [https://documents.egi.eu/document/820 09/11] <br />
| [https://documents.egi.eu/document/879 10/11] <br />
| [https://documents.egi.eu/document/905 11/11] <br />
| [https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012 <br />
| [https://documents.egi.eu/document/1000 01/12] <br />
| [https://documents.egi.eu/document/1033 02/12] <br />
| [https://documents.egi.eu/document/1091 03/12] <br />
| [https://documents.egi.eu/document/1117 04/12] <br />
| [https://documents.egi.eu/document/1174 05/12] <br />
| [https://documents.egi.eu/document/1251 06/12] <br />
| [https://documents.egi.eu/document/1307 07/12] <br />
| [https://documents.egi.eu/document/1332 08/12] <br />
| [https://documents.egi.eu/document/1370 09/12] <br />
| [https://documents.egi.eu/document/1429 10/12] <br />
| [https://documents.egi.eu/document/1487 11/12] <br />
| [https://documents.egi.eu/document/1516 12/12]<br />
|-<br />
! 2013<br />
| [https://documents.egi.eu/document/1567 01/13]<br />
| [https://documents.egi.eu/document/1615 02/13]<br />
| [https://documents.egi.eu/document/1683 03/13]<br />
| [https://documents.egi.eu/document/1734 04/13]<br />
| [https://documents.egi.eu/document/1788 05/13]<br />
| [https://documents.egi.eu/document/1857 06/13]<br />
| [https://documents.egi.eu/document/1880 07/13]<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
==== Underperforming/Suspended RCs ====<br />
<br />
*List of [[Underperforming sites and suspensions|underperforming/suspended Resource Centres ]] <br />
*List of [[List of sites for which the availability followup procedures were not applicable|Resource Centres]] to which the Availability followup procedure was not applicable<br />
<br />
== Resource infrastructure Providers ==<br />
<br />
=== Top-BDII&nbsp;Availability and Reliability ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1" class="wikitable"<br />
|-<br />
| '''Service Level: ''' <br />
'''top-BDII Availability/Reliability''' <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=3&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1307&version=1&filename=EGI-core_services_availabilities-per_NGI-July2012.pdf 07/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1332&version=2&filename=EGI-core_services_availabilities-per_NGI-August2012.pdf 08/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1370&version=1&filename=EGI-core_services_availabilities-per_NGI-September2012.pdf 09/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1429&version=1&filename=EGI-core_services_availabilities-per_NGI-October2012.pdf 10/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1487&version=1&filename=EGI-core_services_availabilities-per_NGI-November2012-v3.pdf 11/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1516&version=1&filename=EGI-core_services_availabilities-per_NGI-December2012.pdf 12/12]<br />
|-<br />
| '''2013'''<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1567&version=1&filename=EGI-core_services_availabilities-per_NGI-January2013.pdf 01/13] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1615&version=1&filename=EGI-core_services_availabilities-per_NGI-February2013.pdf 02/13]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1683&version=1&filename=EGI-core_services_availabilities-per_NGI-March2013.pdf 03/13]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1734&version=1&filename=EGI-core_services_availabilities_per_NGI-April2013.pdf 04/13]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1788&version=1&filename=EGI-core_services_availabilities_per_NGI-May2013.pdf 05/13]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1857&version=1&filename=EGI-core_services_availabilities_per_NGI-Jun2013.pdf 06/13]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1880&version=1&filename=EGI-core_services_availabilities_per_NGI-Jul2013.pdf 07/13]<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
=== ROD Performance Index ===<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" class="wikitable"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' ticket/[https://documents.egi.eu/document/1089 Report] <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
<br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=82926 82926]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-05.pdf 05/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=84168 84168]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-06.pdf 06/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=85127 85127]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-07.pdf 07/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=86007 86007]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-08.pdf 08/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=87015 87015]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-09.pdf 09/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=88157 88157]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-10.pdf 10/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=89486 89486]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-11.pdf 11/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=90414 90414]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-12.pdf 12/12] <br />
<br />
|-<br />
| '''2013''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=91488 91488]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-01.pdf 01/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=92270 92270]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-02.pdf 02/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=93380 93380]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-03.pdf 03/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=93919 93919]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-04.pdf 04/13] <br />
<br />
| [http://ggus.eu/ws/ticket_info.php?ticket=94671 94671]/ <br />
[http://ggus.eu/ws/ticket_info.php?ticket=95631 05/13] <br />
<br />
| [http://ggus.eu/ws/ticket_info.php?ticket=95631 95631]<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
== EGI overall quarterly Availability and Reliability ==<br />
<br />
{| class="wikitable"<br />
|-<br />
| '''PQ''' <br />
| '''Period''' <br />
| '''Availability''' <br />
| '''Reliability'''<br />
|-<br />
| '''PQ01''' <br />
| May-Jul 2010 <br />
| 93.3% <br />
| 94.3%<br />
|-<br />
| '''PQ02''' <br />
| Aug-Oct 2010 <br />
| 90.7% <br />
| 91.7%<br />
|-<br />
| '''PQ03''' <br />
| Nov 2010-Jan 2011 <br />
| 92.3% <br />
| 93.3%<br />
|-<br />
| '''PQ04''' <br />
| Feb-Apr 2011 <br />
| 94.5% <br />
| 95.8%<br />
|-<br />
| '''PQ05''' <br />
| May-Jul 2011 <br />
| 95.4% <br />
| 96.1%<br />
|-<br />
| '''PQ06''' <br />
| Aug-Oct 2011 <br />
| 93.3% <br />
| 94.5%<br />
|-<br />
| '''PQ07''' <br />
| Nov 2011-Jan 2012 <br />
| 95.1% <br />
| 95.9%<br />
|-<br />
| '''PQ08''' <br />
| Feb-Apr 2012 <br />
| 93.5% <br />
| 94.4%<br />
|-<br />
| '''PQ09''' <br />
| May-Jul 2012 <br />
| 93.9% <br />
| 94.8%<br />
|-<br />
| '''PQ10''' <br />
| Aug-Oct 2012 <br />
| 92.5% <br />
| 93.5%<br />
|-<br />
| '''PQ11''' <br />
| Nov 2012-Jan 2013 <br />
| 94% <br />
| 95.7%<br />
|-<br />
| '''PQ12''' <br />
| Feb-Apr 2013 <br />
| 96.43% <br />
| 96.94%<br />
|}<br />
<br />
The data behind this table is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (XLS file, data from 1 May 2010 onward). <br />
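The quarterly figures in the table above can be summarised and sanity-checked with a short script. This is a minimal sketch: the values are copied from the table, while the helper names are hypothetical. A plain mean of quarterly percentages is only an approximation of the overall figure, since quarters may differ in length and in the number of Resource Centres measured.

```python
from statistics import mean

# (project quarter, availability %, reliability %) copied from the table above
QUARTERS = [
    ("PQ01", 93.3, 94.3),
    ("PQ02", 90.7, 91.7),
    ("PQ03", 92.3, 93.3),
    ("PQ04", 94.5, 95.8),
    ("PQ05", 95.4, 96.1),
    ("PQ06", 93.3, 94.5),
    ("PQ07", 95.1, 95.9),
    ("PQ08", 93.5, 94.4),
    ("PQ09", 93.9, 94.8),
    ("PQ10", 92.5, 93.5),
    ("PQ11", 94.0, 95.7),
    ("PQ12", 96.43, 96.94),
]


def inconsistent_rows(rows):
    """Return quarters whose reliability is lower than their availability.

    Assumes reliability is availability with scheduled downtime excluded,
    so such a row would suggest a transcription error.
    """
    return [name for name, avail, rel in rows if rel < avail]


def summarise(rows):
    """Unweighted mean plus the lowest and highest quarter by availability."""
    lowest = min(rows, key=lambda r: r[1])
    highest = max(rows, key=lambda r: r[1])
    return {
        "mean_availability": mean(r[1] for r in rows),
        "mean_reliability": mean(r[2] for r in rows),
        "lowest_quarter": lowest[0],
        "highest_quarter": highest[0],
    }


summary = summarise(QUARTERS)
print(summary)
print("inconsistent quarters:", inconsistent_rows(QUARTERS))
```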
<br />
[[Category:Service_Level_Management]]</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=58700Resource Centres OLA and Resource infrastructure Provider OLA reports2013-08-05T08:29:30Z<p>Fergadis: /* RC&nbsp;Availability and Reliability */</p>
<hr />
<div>{{Template:Op menubar}} {{TOC_right}} <br />
[[Category:Service_Level_Management]]</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI_CSIRT:Alerts&diff=56686EGI CSIRT:Alerts2013-06-19T12:52:28Z<p>Fergadis: /* EGI Alerts / Advisories */</p>
<hr />
<div><!--{{Egi-csirt-header}}--><br />
{{New-Egi-csirt-header}} <br />
<br />
Security alerts and/or security advisories will be sent to all EGI site security contacts or NGI security officers by EGI CSIRT using either an EGI broadcasting tool or a pre-established mailing list. They will also be listed on this page. They may cover a wide range of software, including — but not limited to — the EGI middleware.<br />
<br />
{| {{egi-table}}<br />
!Date !! Title !! Contents !! Rating<br />
|-<br />
|2010-XX-XX || A brief description || Link to the alert/advisory ||Critical/High/Moderate/Low Risk<br />
|}<br />
<br />
The risk rating is in line with [https://wiki.egi.eu/wiki/SVG:Issue_Handling_Summary EGI SVG]'s practice. <br />
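For anyone processing these ratings programmatically, the four levels can be modelled as an ordered enumeration. The sketch below is illustrative only: the numeric values and the <code>parse_rating</code> helper are assumptions, and labels outside the four risk levels (for example "Advisory") are deliberately left unranked.

```python
from enum import IntEnum
from typing import Optional


class Risk(IntEnum):
    """Risk levels from the rating column; the numeric order is an assumption."""
    LOW = 1
    MODERATE = 2
    HIGH = 3
    CRITICAL = 4


def parse_rating(text: str) -> Optional[Risk]:
    """Map a label such as 'High' or 'Moderate Risk' to a Risk level.

    Returns None for labels that are not one of the four risk levels.
    """
    words = text.strip().split()
    if not words:
        return None
    return Risk.__members__.get(words[0].upper())


# Three rows taken from the alerts table on this page: (date, subject, rating)
advisories = [
    ("2012-02-06", "libxml", "Moderate"),
    ("2013-06-19", "puppet", "Critical"),
    ("2013-03-18", "ptrace kernel", "High"),
]

# Most severe first; unranked labels would sort last.
by_severity = sorted(
    advisories,
    key=lambda row: parse_rating(row[2]) or 0,
    reverse=True,
)
for date, subject, rating in by_severity:
    print(date, subject, rating)
```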
<br />
== EGI Alerts / Advisories ==<br />
The following alert bulletins describe security vulnerabilities or immediate threats against one or more sites or the EGI infrastructure and include recommendations and mitigation techniques.<br />
<br />
[[EGI_CSIRT:Alerts/AdvisoryTemplate|This template]] should be used when drafting an advisory.<br />
<br />
{| {{egi-table}}<br />
!Date !! Title !! Contents !! Rating<br />
<br />
|-<br />
<br />
|2013-06-19|| Advisory concerning puppet vulnerability<br />
|| [[EGI_CSIRT:Alerts/puppet-2013-06-19|Alerts/puppet-2013-06-19 ]] || Critical<br />
|-<br />
|2013-05-14|| Advisory concerning perf_event kernel vulnerability<br />
|| [[EGI_CSIRT:Alerts/kernel-2013-05-14|Alerts/kernel-2013-05-14 ]] || Critical<br />
|-<br />
<br />
|2013-03-18|| Advisory concerning ptrace kernel vulnerability<br />
|| [[EGI_CSIRT:Alerts/kernel-2013-03-18|Alerts/kernel-2013-03-18 ]] || High<br />
|-<br />
<br />
|2012-08-01|| Advisory concerning gLite 3.2 middleware components no longer supported on 01 August 2012.<br />
|| [[EGI_CSIRT:Advisory/EGI-ADV-20120801/ |Advisory-EGI-ADV-20120801 ]] || Advisory<br />
|-<br />
|2012-07-17|| Critical - Wrong permissions on directory containing user proxies|| [[EGI_CSIRT:Alerts/EMI-1-WMS-file-permissions |Alerts/EMI-1-WMS-file-permissions-2012-07-16]] || Critical<br />
|-<br />
|2012-07-16|| Advisory - EGI CSIRT:Advisory; Upgrade gLite-3*, RHel4* and derivatives || [[EGI_CSIRT:Advisory |Advisory;Upgrade gLite-3*, RHel4* and derivatives]] || Advisory<br />
|-<br />
|2012-02-06|| MODERATE RISK - Multiple vulnerabilities in libxml2 (CVE-2011-3919 etc.)|| [[EGI_CSIRT:Alerts/libxml2-2012-02-06 |Alerts/libxml2-2012-02-06]] || Moderate<br />
|-<br />
|2012-01-23 || High risk vulnerability in Linux kernel: Insufficient /proc/pid/mem access control (CVE-2012-0056) || [[EGI_CSIRT:Alerts/kernel-2012-01-23|Alerts/kernel-2012-01-23]] || High<br />
|-<br />
|2011-12-28 || Critical telnetd vulnerability - Remote root vulnerability in telnet daemons (CVE-2011-4862) || [[EGI_CSIRT:Alerts/telnetd-2011-12-28|Alerts/telnetd-2011-12-28]] || Critical<br />
|-<br />
|2011-06-15 || High Risk - Torque Authentication Bypass Vulnerability (CVE-2011-2907) || [[EGI_CSIRT:Alerts/Torque-2011-06-15|Alerts/Torque-2011-06-15]] || High<br />
|-<br />
|2011-04-12 || HIGH Risk glibc Vulnerability - privilege escalation (CVE-2011-0536) || [[EGI_CSIRT:Alerts/glibc-2011-04-12|Alerts/glibc-2011-04-12]] || High<br />
|-<br />
|2011-03-30 || Critical Vulnerability detected in dCache Admin Web Interface || [[EGI_CSIRT:Alerts/dCache-2011-03-30|Alerts/dCache-2011-03-30]] || Critical<br />
|-<br />
|2011-01-07 || High Risk Kernel Vulnerability:heap overflow in tipc_msg_build() (CVE-2010-3859)|| [[EGI_CSIRT:Alerts/tipc-2011-01-07|Alerts/tipc-2011-01-07]] || High<br />
|-<br />
|2010-12-16 || HIGH root vulnerabilities in Tivoli Storage Manager (TSM) client software || [[EGI_CSIRT:Alerts/tsm-2010-12-16|Alerts/tsm-2010-12-16]] || High<br />
|-<br />
|2010-11-18 || CRITICAL Local root vulnerability in systemtap (CVE-2010-4170) || [[EGI_CSIRT:Alerts/systemtap-2010-11-18|Alerts/systemtap-2010-11-18]] || Critical<br />
|-<br />
|2010-11-02 || HIGH iovec integer overflow in net/rds/rdma.c (CVE-2010-3865) || [[EGI_CSIRT:Alerts/rds-rdma-2010-11-02|Alerts/rds-rdma-2010-11-02]] || High<br />
|-<br />
|2010-10-23 || HIGH Vulnerability in C library dynamic linker (CVE-2010-3856) || [[EGI_CSIRT:Alerts/liblinker-2010-10-23|Alerts/liblinker-2010-10-23]] || High<br />
|-<br />
|2010-10-20 || HIGH Local root vulnerability in RDS (CVE-2010-3904) || [[EGI_CSIRT:Alerts/rds-2010-10-20|Alerts/rds-2010-10-20]] || High<br />
|-<br />
|2010-10-18 || HIGH Vulnerability in C library dynamic linker (CVE-2010-3847) || [[EGI_CSIRT:Alerts/liblinker-2010-10-18|Alerts/liblinker-2010-10-18]] || High<br />
|-<br />
|2010-09-30 || RHEL4 patch for CVE-2010-3081 kernel vulnerability (CVE-2010-3081) || [[EGI_CSIRT:Alerts/kernel-2010-09-30|Alerts/kernel-2010-09-30]] || Moderate<br />
|-<br />
|2010-09-16 || Critical Kernel Vulnerability: 64-bit Compatibility Mode Stack Pointer Corruption (CVE-2010-3081)|| [[EGI_CSIRT:Alerts/kernel-2010-09-16|Alerts/kernel-2010-09-16]] || Critical<br />
|-<br />
|2010-08-18 || Moderate Impact Vulnerabilities in Elog Web Application || [[EGI_CSIRT:Alerts/elog-2010-08-18|Alerts/elog-2010-08-18]] || Moderate<br />
|-<br />
|2010-06-28 || Moderate Impact Vulnerability In Intel Compiler Suite || [[EGI_CSIRT:Alerts/intel-28-06-2010|Alerts/intel-28-06-2010]] || Moderate<br />
|}<br />
<br />
== EGEE Alerts ==<br />
List of alerts published during the EGEE project.<br />
<br />
{| {{egi-table}}<br />
!Date !! Title !! Contents !! Rating<br />
|-<br />
|2009-11-24 || Critical-risk vulnerabilities CVE-2009-3547 || [https://wiki.egi.eu/csirt/index.php/Internal_Notes_on_CVEs Alerts/cve-3547] ||Critical risk<br />
|-<br />
|2009-10-20 || High-risk vulnerabilities in CREAM CE software || [[EGI_CSIRT:Alerts/cream-20-10-2009|Alerts/cream-20-10-2009]] ||High risk<br />
|-<br />
|2009-07-09 || Remote command execution in Nagios WAP/WML interface || [[EGI_CSIRT:Alerts/nagios-09-07-2009|Alerts/nagios-09-07-2009]] ||Medium risk<br />
|-<br />
|2008-07-29 || DNS cache poisoning/spoofing || [[EGI_CSIRT:Alerts/dns-29-07-2008|Alerts/dns-29-07-2008]] ||Medium risk<br />
|-<br />
|2006-10-23 || Critical Vulnerability: OpenPBS/Torque || [[EGI_CSIRT:Alerts/openpbs-23-10-2006|Alerts/openpbs-23-10-2006]] ||Extremely critical<br />
|}<br />
{{From OSCT wiki|http://osct.web.cern.ch/osct/alerts.html}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=56148Resource Centres OLA and Resource infrastructure Provider OLA reports2013-06-04T16:05:26Z<p>Fergadis: /* Top-BDII&nbsp;Availability and Reliability */</p>
<hr />
<div>{{Template:Op menubar}} {{TOC_right}} <br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] portal. <br> <br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs). <br />
<br />
[[SAM Tests|SAM metric]] results are used for the calculation of Availability/Reliability. <br />
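
As a rough illustration of how the two parameters differ, the sketch below assumes the commonly used definitions: Availability is the time a service was up divided by the time its status was known, and Reliability additionally excludes scheduled downtime from the denominator. The linked definition document remains authoritative; the <code>status_hours</code> structure and both function names are hypothetical and are not part of MyEGI or SAM.

```c
/* Hypothetical record of how many hours a service spent in each state
 * during a reporting period. Not an actual MyEGI or SAM data structure. */
struct status_hours {
    double up;                 /* monitoring tests passed */
    double down;               /* unscheduled failures */
    double scheduled_downtime; /* declared maintenance */
    double unknown;            /* no monitoring result available */
};

/* Assumed definition: up time divided by the time with a known status.
 * Unknown time is left out of the denominator.
 * Returns -1.0 when no status is known at all. */
double availability(const struct status_hours *s)
{
    double known = s->up + s->down + s->scheduled_downtime;
    return known > 0.0 ? s->up / known : -1.0;
}

/* Assumed definition: as availability, but scheduled downtime is also
 * removed from the denominator, so declared maintenance is not penalised.
 * Returns -1.0 when the denominator would be zero. */
double reliability(const struct status_hours *s)
{
    double expected = s->up + s->down;
    return expected > 0.0 ? s->up / expected : -1.0;
}
```

Under these assumed definitions, a service that was up for 684 hours of a 720-hour month, with 24 hours of scheduled downtime and 12 hours of unscheduled failure, has an Availability of 95% and a slightly higher Reliability, because the maintenance window is excluded.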
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics. <br />
<br />
<br> <br />
<br />
'''Availability and Reliability reports are overseen according to procedure [[PROC04|''PROC04 Quality verification of monthly availability and reliability statistics'']]''' <br />
<br />
= Performance reports =<br />
<br />
== Resource Centres ==<br />
<br />
=== RC&nbsp;Availability and Reliability ===<br />
<br />
[https://documents.egi.eu/document/1622 January 2008 - April 2010 ] (EGEE) <br />
<br />
{| cellspacing="0" cellpadding="5" border="1" class="wikitable"<br />
|-<br />
! Availability/Reliability <br />
! Jan <br />
! Feb <br />
! Mar <br />
! Apr <br />
! May <br />
! Jun <br />
! Jul <br />
! Aug <br />
! Sep <br />
! Oct <br />
! Nov <br />
! Dec<br />
|-<br />
! 2010 <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/document/42 05/10] <br />
| [https://documents.egi.eu/document/96 06/10] <br />
| [https://documents.egi.eu/document/130 07/10] <br />
| [https://documents.egi.eu/document/157 08/10] <br />
| [https://documents.egi.eu/document/219 09/10] <br />
| [https://documents.egi.eu/document/238 10/10] <br />
| [https://documents.egi.eu/document/266 11/10] <br />
| [https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011 <br />
| [https://documents.egi.eu/document/332 01/11] <br />
| [https://documents.egi.eu/document/402 02/11] <br />
| [https://documents.egi.eu/document/465 03/11] <br />
| [https://documents.egi.eu/document/508 04/11] <br />
| [https://documents.egi.eu/document/593 05/11] <br />
| [https://documents.egi.eu/document/648 06/11] <br />
| [https://documents.egi.eu/document/716 07/11] <br />
| [https://documents.egi.eu/document/783 08/11] <br />
| [https://documents.egi.eu/document/820 09/11] <br />
| [https://documents.egi.eu/document/879 10/11] <br />
| [https://documents.egi.eu/document/905 11/11] <br />
| [https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012 <br />
| [https://documents.egi.eu/document/1000 01/12] <br />
| [https://documents.egi.eu/document/1033 02/12] <br />
| [https://documents.egi.eu/document/1091 03/12] <br />
| [https://documents.egi.eu/document/1117 04/12] <br />
| [https://documents.egi.eu/document/1174 05/12] <br />
| [https://documents.egi.eu/document/1251 06/12] <br />
| [https://documents.egi.eu/document/1307 07/12] <br />
| [https://documents.egi.eu/document/1332 08/12] <br />
| [https://documents.egi.eu/document/1370 09/12] <br />
| [https://documents.egi.eu/document/1429 10/12] <br />
| [https://documents.egi.eu/document/1487 11/12] <br />
| [https://documents.egi.eu/document/1516 12/12]<br />
|-<br />
! 2013<br />
| [https://documents.egi.eu/document/1567 01/13]<br />
| [https://documents.egi.eu/document/1615 02/13]<br />
| [https://documents.egi.eu/document/1683 03/13]<br />
| [https://documents.egi.eu/document/1734 04/13]<br />
| [https://documents.egi.eu/document/1788 05/13]<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
==== Underperforming/Suspended RCs ====<br />
<br />
*List of [[Underperforming sites and suspensions|underperforming/suspended Resource Centres ]] <br />
*List of [[List of sites for which the availability followup procedures were not applicable|Resource Centres]] to which the Availability followup procedure was not applicable<br />
<br />
== Resource infrastructure Providers ==<br />
<br />
=== Top-BDII&nbsp;Availability and Reliability ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1" class="wikitable"<br />
|-<br />
| '''Service Level: ''' <br />
'''top-BDII Availability/Reliability''' <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=3&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1307&version=1&filename=EGI-core_services_availabilities-per_NGI-July2012.pdf 07/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1332&version=2&filename=EGI-core_services_availabilities-per_NGI-August2012.pdf 08/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1370&version=1&filename=EGI-core_services_availabilities-per_NGI-September2012.pdf 09/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1429&version=1&filename=EGI-core_services_availabilities-per_NGI-October2012.pdf 10/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1487&version=1&filename=EGI-core_services_availabilities-per_NGI-November2012-v3.pdf 11/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1516&version=1&filename=EGI-core_services_availabilities-per_NGI-December2012.pdf 12/12]<br />
|-<br />
| '''2013'''<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1567&version=1&filename=EGI-core_services_availabilities-per_NGI-January2013.pdf 01/13] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1615&version=1&filename=EGI-core_services_availabilities-per_NGI-February2013.pdf 02/13]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1683&version=1&filename=EGI-core_services_availabilities-per_NGI-March2013.pdf 03/13]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1734&version=1&filename=EGI-core_services_availabilities_per_NGI-April2013.pdf 04/13]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1788&version=1&filename=EGI-core_services_availabilities_per_NGI-May2013.pdf 05/13]<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
=== ROD Performance Index ===<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" class="wikitable"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' ticket/[https://documents.egi.eu/document/1089 Report] <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
<br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=82926 82926]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-05.pdf 05/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=84168 84168]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-06.pdf 06/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=85127 85127]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-07.pdf 07/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=86007 86007]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-08.pdf 08/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=87015 87015]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-09.pdf 09/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=88157 88157]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-10.pdf 10/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=89486 89486]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-11.pdf 11/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=90414 90414]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-12.pdf 12/12] <br />
<br />
|-<br />
| '''2013''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=91488 91488]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-01.pdf 01/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=92270 92270]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-02.pdf 02/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=93380 93380]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-03.pdf 03/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=93919 93919]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-04.pdf 04/13] <br />
<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
== EGI overall quarterly Availability and Reliability ==<br />
<br />
{| class="wikitable"<br />
|-<br />
| '''PQ''' <br />
| '''Period''' <br />
| '''Availability''' <br />
| '''Reliability'''<br />
|-<br />
| '''PQ01''' <br />
| May-Jul 2010 <br />
| 93.3% <br />
| 94.3%<br />
|-<br />
| '''PQ02''' <br />
| Aug-Oct 2010 <br />
| 90.7% <br />
| 91.7%<br />
|-<br />
| '''PQ03''' <br />
| Nov-Jan 2011 <br />
| 92.3% <br />
| 93.3%<br />
|-<br />
| '''PQ04''' <br />
| Feb-Apr 2011 <br />
| 94.5% <br />
| 95.8%<br />
|-<br />
| '''PQ05''' <br />
| May-Jul 2011 <br />
| 95.4% <br />
| 96.1%<br />
|-<br />
| '''PQ06''' <br />
| Aug-Oct 2011 <br />
| 93.3% <br />
| 94.5%<br />
|-<br />
| '''PQ07''' <br />
| Nov-Jan 2012 <br />
| 95.1% <br />
| 95.9%<br />
|-<br />
| '''PQ08''' <br />
| Feb-Apr 2012 <br />
| 93.5% <br />
| 94.4%<br />
|-<br />
| '''PQ09''' <br />
| May-Jul 2012 <br />
| 93.9% <br />
| 94.8%<br />
|-<br />
| '''PQ10''' <br />
| Aug-Oct 2012 <br />
| 92.5% <br />
| 93.5%<br />
|-<br />
| '''PQ11''' <br />
| Nov-Jan 2013 <br />
| 94% <br />
| 95.7%<br />
|-<br />
| '''PQ12''' <br />
| Feb-Apr 2013 <br />
| 96.43% <br />
| 96.94%<br />
|}<br />
<br />
These figures are also available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (XLS file, data from 1 May 2010 onwards). <br />
<br />
[[Category:Service_Level_Management]]</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=56146Resource Centres OLA and Resource infrastructure Provider OLA reports2013-06-04T16:04:15Z<p>Fergadis: /* RC&nbsp;Availability and Reliability */</p>
<hr />
<div>{{Template:Op menubar}} {{TOC_right}} <br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] portal. <br> <br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs). <br />
<br />
[[SAM Tests|SAM metric]] results are used for the calculation of Availability/Reliability. <br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics. <br />
<br />
<br> <br />
<br />
'''Availability and Reliability reports are overseen according to procedure [[PROC04|''PROC04 Quality verification of monthly availability and reliability statistics'']]''' <br />
<br />
= Performance reports =<br />
<br />
== Resource Centres ==<br />
<br />
=== RC&nbsp;Availability and Reliability ===<br />
<br />
[https://documents.egi.eu/document/1622 January 2008 - April 2010 ] (EGEE) <br />
<br />
{| cellspacing="0" cellpadding="5" border="1" class="wikitable"<br />
|-<br />
! Availability/Reliability <br />
! Jan <br />
! Feb <br />
! Mar <br />
! Apr <br />
! May <br />
! Jun <br />
! Jul <br />
! Aug <br />
! Sep <br />
! Oct <br />
! Nov <br />
! Dec<br />
|-<br />
! 2010 <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/document/42 05/10] <br />
| [https://documents.egi.eu/document/96 06/10] <br />
| [https://documents.egi.eu/document/130 07/10] <br />
| [https://documents.egi.eu/document/157 08/10] <br />
| [https://documents.egi.eu/document/219 09/10] <br />
| [https://documents.egi.eu/document/238 10/10] <br />
| [https://documents.egi.eu/document/266 11/10] <br />
| [https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011 <br />
| [https://documents.egi.eu/document/332 01/11] <br />
| [https://documents.egi.eu/document/402 02/11] <br />
| [https://documents.egi.eu/document/465 03/11] <br />
| [https://documents.egi.eu/document/508 04/11] <br />
| [https://documents.egi.eu/document/593 05/11] <br />
| [https://documents.egi.eu/document/648 06/11] <br />
| [https://documents.egi.eu/document/716 07/11] <br />
| [https://documents.egi.eu/document/783 08/11] <br />
| [https://documents.egi.eu/document/820 09/11] <br />
| [https://documents.egi.eu/document/879 10/11] <br />
| [https://documents.egi.eu/document/905 11/11] <br />
| [https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012 <br />
| [https://documents.egi.eu/document/1000 01/12] <br />
| [https://documents.egi.eu/document/1033 02/12] <br />
| [https://documents.egi.eu/document/1091 03/12] <br />
| [https://documents.egi.eu/document/1117 04/12] <br />
| [https://documents.egi.eu/document/1174 05/12] <br />
| [https://documents.egi.eu/document/1251 06/12] <br />
| [https://documents.egi.eu/document/1307 07/12] <br />
| [https://documents.egi.eu/document/1332 08/12] <br />
| [https://documents.egi.eu/document/1370 09/12] <br />
| [https://documents.egi.eu/document/1429 10/12] <br />
| [https://documents.egi.eu/document/1487 11/12] <br />
| [https://documents.egi.eu/document/1516 12/12]<br />
|-<br />
! 2013<br />
| [https://documents.egi.eu/document/1567 01/13]<br />
| [https://documents.egi.eu/document/1615 02/13]<br />
| [https://documents.egi.eu/document/1683 03/13]<br />
| [https://documents.egi.eu/document/1734 04/13]<br />
| [https://documents.egi.eu/document/1788 05/13]<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
==== Underperforming/Suspended RCs ====<br />
<br />
*List of [[Underperforming sites and suspensions|underperforming/suspended Resource Centres ]] <br />
*List of [[List of sites for which the availability followup procedures were not applicable|Resource Centres]] to which the Availability followup procedure was not applicable<br />
<br />
== Resource infrastructure Providers ==<br />
<br />
=== Top-BDII&nbsp;Availability and Reliability ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1" class="wikitable"<br />
|-<br />
| '''Service Level: ''' <br />
'''top-BDII Availability/Reliability''' <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=3&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1307&version=1&filename=EGI-core_services_availabilities-per_NGI-July2012.pdf 07/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1332&version=2&filename=EGI-core_services_availabilities-per_NGI-August2012.pdf 08/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1370&version=1&filename=EGI-core_services_availabilities-per_NGI-September2012.pdf 09/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1429&version=1&filename=EGI-core_services_availabilities-per_NGI-October2012.pdf 10/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1487&version=1&filename=EGI-core_services_availabilities-per_NGI-November2012-v3.pdf 11/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1516&version=1&filename=EGI-core_services_availabilities-per_NGI-December2012.pdf 12/12]<br />
|-<br />
| '''2013'''<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1567&version=1&filename=EGI-core_services_availabilities-per_NGI-January2013.pdf 01/13] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1615&version=1&filename=EGI-core_services_availabilities-per_NGI-February2013.pdf 02/13]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1683&version=1&filename=EGI-core_services_availabilities-per_NGI-March2013.pdf 03/13]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1734&version=1&filename=EGI-core_services_availabilities_per_NGI-April2013.pdf 04/13]<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
=== ROD Performance Index ===<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" class="wikitable"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' ticket/[https://documents.egi.eu/document/1089 Report] <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
<br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=82926 82926]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-05.pdf 05/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=84168 84168]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-06.pdf 06/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=85127 85127]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-07.pdf 07/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=86007 86007]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-08.pdf 08/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=87015 87015]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-09.pdf 09/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=88157 88157]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-10.pdf 10/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=89486 89486]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-11.pdf 11/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=90414 90414]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-12.pdf 12/12] <br />
<br />
|-<br />
| '''2013''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=91488 91488]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-01.pdf 01/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=92270 92270]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-02.pdf 02/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=93380 93380]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-03.pdf 03/13] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=93919 93919]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2013-04.pdf 04/13] <br />
<br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
| <br />
|}<br />
<br />
== EGI overall quarterly Availability and Reliability ==<br />
<br />
{| class="wikitable"<br />
|-<br />
| '''PQ''' <br />
| '''Period''' <br />
| '''Availability''' <br />
| '''Reliability'''<br />
|-<br />
| '''PQ01''' <br />
| May-Jul 2010 <br />
| 93.3% <br />
| 94.3%<br />
|-<br />
| '''PQ02''' <br />
| Aug-Oct 2010 <br />
| 90.7% <br />
| 91.7%<br />
|-<br />
| '''PQ03''' <br />
| Nov-Jan 2011 <br />
| 92.3% <br />
| 93.3%<br />
|-<br />
| '''PQ04''' <br />
| Feb-Apr 2011 <br />
| 94.5% <br />
| 95.8%<br />
|-<br />
| '''PQ05''' <br />
| May-Jul 2011 <br />
| 95.4% <br />
| 96.1%<br />
|-<br />
| '''PQ06''' <br />
| Aug-Oct 2011 <br />
| 93.3% <br />
| 94.5%<br />
|-<br />
| '''PQ07''' <br />
| Nov-Jan 2012 <br />
| 95.1% <br />
| 95.9%<br />
|-<br />
| '''PQ08''' <br />
| Feb-Apr 2012 <br />
| 93.5% <br />
| 94.4%<br />
|-<br />
| '''PQ09''' <br />
| May-Jul 2012 <br />
| 93.9% <br />
| 94.8%<br />
|-<br />
| '''PQ10''' <br />
| Aug-Oct 2012 <br />
| 92.5% <br />
| 93.5%<br />
|-<br />
| '''PQ11''' <br />
| Nov-Jan 2013 <br />
| 94% <br />
| 95.7%<br />
|-<br />
| '''PQ12''' <br />
| Feb-Apr 2013 <br />
| 96.43% <br />
| 96.94%<br />
|}<br />
<br />
These figures are also available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (XLS file, data from 1 May 2010 onwards). <br />
<br />
[[Category:Service_Level_Management]]</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI_CSIRT:Alerts/kernel-2013-05-14&diff=55609EGI CSIRT:Alerts/kernel-2013-05-142013-05-20T13:43:54Z<p>Fergadis: </p>
<hr />
<div><pre><br />
** WHITE information - Unlimited distribution allowed **<br />
** see https://wiki.egi.eu/wiki/EGI_CSIRT:TLP for distribution restrictions **<br />
<br />
EGI CSIRT ADVISORY [EGI-ADV-20130514]<br />
<br />
Title: Linux kernel perf_event vulnerability (CVE-2013-2094) [EGI-ADV-20130514]<br />
Date: 2013-05-14<br />
Updated: 2013-05-17<br />
<br />
URL: https://wiki.egi.eu/wiki/EGI_CSIRT:Alerts/kernel-2013-05-14<br />
<br />
<br />
Update Summary<br />
==============<br />
<br />
+ 2013-05-14: Initial revision.<br />
+ 2013-05-15: Made mitigation drawbacks more explicit.<br />
+ 2013-05-15: Revised systemtap mitigation to support v1.7<br />
+ 2013-05-15: Added a more robust systemtap mitigation, updated recommendation<br />
+ 2013-05-15: Removed sysctl mitigation<br />
+ 2013-05-16: Fixed typo in kernel version in section "Affected Software"<br />
+ 2013-05-16: Added pointer to CERN's RPM repo for mitigation package<br />
+ 2013-05-16: Added pointer to Red Hat's Security Advisory<br />
+ 2013-05-16: Added pointer to Scientific Linux Advisory<br />
+ 2013-05-17: Added pointer to Scientific Linux/CERN Advisory<br />
+ 2013-05-17: Added pointer to CentOS Advisory<br />
+ 2013-05-20: Added pointer to SUSE Advisory, replaced recommendations with required actions<br />
<br />
<br />
Introduction<br />
============<br />
<br />
A recently-discovered vulnerability in the Linux kernel allows a local user<br />
to escalate their privilege level and gain root access. Working exploit code<br />
is publicly available.<br />
All relevant Linux distributions have already published an updated kernel<br />
which fixes this vulnerability.<br />
<br />
<br />
Required Actions<br />
================<br />
<br />
All running EGI resources MUST be either patched or otherwise have a<br />
work-around in place by 2013-05-27 T21:00+01:00. EGI sites failing to act and/or<br />
failing to respond to requests from the EGI CSIRT team risk site suspension.<br />
<br />
<br />
Details<br />
=======<br />
<br />
The performance measurement subsystem in the Linux kernel incorrectly casts a<br />
64-bit integer into a 32-bit integer which is subsequently used for array<br />
dereferencing. Providing carefully chosen integers as input allows arbitrary<br />
code to be executed.<br />
<br />
The erroneous code was introduced in kernel version 2.6.37 (commit<br />
b0a873ebbf87bf38bf70b5e39a7cadc96099fa13 on 2010-09-09) and was fixed in kernel<br />
version 3.8.9 (commit 8176cced706b5e5d15887584150764894e94e02f on 2013-04-15).<br />
Additionally, the vulnerable code was backported to 2.6.32 kernels by Red Hat.<br />
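
The version bounds in the paragraph above can be turned into a quick check.
The following is a minimal Python sketch, not part of the original advisory:
the function names are hypothetical, and it only compares the upstream
version number, so any 2.6.32 kernel is merely flagged for a manual check
against the vendor advisory rather than classified.

```python
import re

# Bounds taken from the advisory text: introduced in 2.6.37, fixed in 3.8.9.
FIRST_AFFECTED = (2, 6, 37)
FIRST_FIXED = (3, 8, 9)


def parse_kernel_version(release):
    """Extract (major, minor, patch) from a string such as '2.6.32-358.6.1.el6'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = match.groups()
    # A missing patch level (e.g. '3.9') is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def upstream_affected(release):
    """True if the upstream version lies in the range [2.6.37, 3.8.9)."""
    return FIRST_AFFECTED <= parse_kernel_version(release) < FIRST_FIXED


def needs_vendor_check(release):
    """True for 2.6.32 kernels, which may carry the Red Hat backport."""
    return parse_kernel_version(release) == (2, 6, 32)
```

On a running host the release string could be obtained with
platform.release() from the Python standard library. Distribution kernels
often carry backported fixes under an unchanged upstream version, so the
result is only a first hint and the distribution's advisory remains the
authority.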
<br />
Working exploit code is publicly available. This code will not work on all<br />
vulnerable distributions; however, it appears to work on RHEL 6 and derived<br />
systems.<br />
<br />
The issue has been addressed by CentOS, Debian, Red Hat, Scientific Linux,<br />
Scientific Linux/CERN and Ubuntu. Because patches are widely available and<br />
EGI CSIRT has assessed this issue as CRITICAL, standard EGI procedures require<br />
sites to roll out the proper patches within 7 days. As widespread attacks<br />
exploiting this vulnerability are currently being carried out, sites should<br />
take action urgently.<br />
<br />
<br />
Risk Category<br />
=============<br />
<br />
This issue has been assessed as CRITICAL risk by the EGI CSIRT as a working<br />
exploit is publicly available.<br />
<br />
<br />
Affected Software<br />
=================<br />
<br />
+ Linux kernels 2.6.37 through 3.8.8 (inclusive).<br />
+ Linux kernels 2.6.32 with Red Hat backports.<br />
<br />
<br />
Mitigation<br />
==========<br />
<br />
The issue can be mitigated with the use of systemtap. CERN has<br />
provided an RPM package that implements systemtap mitigation as<br />
described below for kernel versions<br />
+ 2.6.32-358.0.1.el6<br />
+ 2.6.32-358.2.1.el6<br />
+ 2.6.32-358.6.1.el6.<br />
The RPM packages for 32-bit and 64-bit kernels are available at:<br />
+ 32-bit: http://linuxsoft.cern.ch/cern/updates/slc6X/i386/RPMS/cve_2013_2094-0.2-1.el6.i686.rpm<br />
+ 64-bit: http://linuxsoft.cern.ch/cern/updates/slc6X/x86_64/RPMS/cve_2013_2094-0.2-1.el6.x86_64.rpm<br />
To put the mitigation in place, download and install the proper<br />
package; this implementation is stable across reboots.<br />
<br />
To implement the mitigation manually, perform the following steps.<br />
<br />
1. Install the systemtap package (and its dependencies). In<br />
particular, the kernel-devel package matching the running kernel<br />
is required. If the matching version is not installed, systemtap<br />
will give an error message asking for the correct package to be<br />
installed. Furthermore, the debuginfo package is necessary.<br />
<br />
2. There are at least two ways systemtap can be used to address the<br />
issue. One of them, published by Red Hat, tries to maintain as<br />
many performance monitoring capabilities as possible, at the<br />
expense of more intricate compilation and deployment dependencies,<br />
as far as systemtap versions used are concerned. The other fix<br />
has been provided by Linköping University and disables performance<br />
monitoring altogether, but is more resilient.<br />
<br />
a) To use Red Hat's mitigation, create a file mitigation.stp<br />
containing the following (without the BEGIN/END marker lines):<br />
---BEGIN FILE---<br />
%{<br />
#include <linux/perf_event.h><br />
%}<br />
<br />
function sanitize_config:long (event:long) %{<br />
struct perf_event *event;<br />
<br />
#if STAP_COMPAT_VERSION >= STAP_VERSION(1,8)<br />
event = (struct perf_event *) STAP_ARG_event;<br />
#else<br />
event = (struct perf_event *) THIS->event;<br />
#endif<br />
event->attr.config &= INT_MAX;<br />
%}<br />
<br />
probe kernel.function("perf_swevent_init@kernel/events/core.c").call {<br />
sanitize_config($event);<br />
}<br />
---END FILE---<br />
<br />
b) To use LIU's mitigation, put the following into mitigation.stp:<br />
---BEGIN FILE---<br />
#!/usr/bin/stap<br />
<br />
# quick and ugly hack by cap@nsc.liu.se to block CVE-2013-2094<br />
# must run in guru mode (-g)<br />
# compile to .ko file: "stap -g -m perf_event_blocker perf_event_blocker.stp"<br />
# run on non build host using "staprun [-L] ./perf_event_blocker.ko"<br />
# requires build host and staprun to have identical kernel<br />
<br />
# screw up call by setting the attr_uptr pointer to null<br />
probe kernel.function("sys_perf_event_open")<br />
{<br />
printf("hit sys_perf_event_open, DENIED! %s\n", $$vars);<br />
$attr_uptr = 0<br />
}<br />
<br />
# print out return value to verify that the syscall was screwed up<br />
probe kernel.function("sys_perf_event_open").return<br />
{<br />
printf("returning from sys_perf_event_open with: %i\n", $return)<br />
}<br />
<br />
probe begin<br />
{<br />
printf("Guru mode sys_perf_event_open blocker active\n");<br />
}<br />
---END FILE---<br />
<br />
3. Compile into a .ko file with this command:<br />
stap -g -p4 -m mitigation mitigation.stp<br />
<br />
4. Load the systemtap module with this command:<br />
staprun -L ./mitigation.ko<br />
<br />
The .ko file may be distributed and used on all machines that run<br />
a kernel that is identical to the one on the host used to compile<br />
the .ko file.<br />
<br />
This fix is not persistent across reboots.<br />
<br />
In previous releases of this advisory, a sysctl-based mitigation was<br />
also suggested. This is no longer considered sufficient, as it only<br />
protects against a particular piece of exploit code, and this exploit<br />
can trivially be changed so that the mitigation no longer provides<br />
protection.<br />
<br />
<br />
Component Installation information<br />
==================================<br />
<br />
For many distributions, patched kernel packages are available. Refer to your<br />
distribution's information channels.<br />
<br />
<br />
References<br />
==========<br />
<br />
+ Mitre: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2094<br />
+ NIST NVD: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2013-2094<br />
+ OSS-Sec: http://marc.info/?s=CVE-2013-2094&l=oss-security<br />
+ CentOS: http://lists.centos.org/pipermail/centos-announce/2013-May/019733.html<br />
+ Debian: https://security-tracker.debian.org/tracker/CVE-2013-2094<br />
+ Red Hat: https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2013-2094<br />
+ RHSA: https://rhn.redhat.com/errata/RHSA-2013-0830.html<br />
+ SL: http://listserv.fnal.gov/scripts/wa.exe?A2=ind1305&L=scientific-linux-errata&T=0&P=1321<br />
+ SLC: http://linux.web.cern.ch/linux/updates/updates-slc6.shtml<br />
+ Ubuntu: http://people.canonical.com/~ubuntu-security/cve/CVE-2013-2094<br />
+ CERN 32-bit RPM: http://linuxsoft.cern.ch/cern/updates/slc6X/i386/RPMS/cve_2013_2094-0.2-1.el6.i686.rpm<br />
+ CERN 64-bit RPM: http://linuxsoft.cern.ch/cern/updates/slc6X/x86_64/RPMS/cve_2013_2094-0.2-1.el6.x86_64.rpm<br />
+ CERN SRPM: http://linuxsoft.cern.ch/cern/updates/slc6X/SRPMS/cve_2013_2094-0.2-1.el6.src.rpm<br />
+ LIU SystemTap mitigation: http://www.nsc.liu.se/~cap/perf_event_blocker.stp<br />
+ SUSE: http://support.novell.com/security/cve/CVE-2013-2094.html<br />
</pre></div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI_CSIRT:Alerts/kernel-2013-05-14&diff=55608EGI CSIRT:Alerts/kernel-2013-05-142013-05-20T12:42:29Z<p>Fergadis: </p>
<hr />
<div><pre><br />
** WHITE information - Unlimited distribution allowed **<br />
** see https://wiki.egi.eu/wiki/EGI_CSIRT:TLP for distribution restrictions **<br />
<br />
EGI CSIRT ADVISORY [EGI-ADV-20130514]<br />
<br />
Title: Linux kernel perf_event vulnerability (CVE-2013-2094) [EGI-ADV-20130514]<br />
Date: 2013-05-14<br />
Updated: 2013-05-20<br />
<br />
URL: https://wiki.egi.eu/wiki/EGI_CSIRT:Alerts/kernel-2013-05-14<br />
<br />
<br />
Update Summary<br />
==============<br />
<br />
+ 2013-05-14: Initial revision.<br />
+ 2013-05-15: Made mitigation drawbacks more explicit.<br />
+ 2013-05-15: Revised systemtap mitigation to support v1.7<br />
+ 2013-05-15: Added a more robust systemtap mitigation, updated recommendation<br />
+ 2013-05-15: Removed sysctl mitigation<br />
+ 2013-05-16: Fixed typo in kernel version in section "Affected Software"<br />
+ 2013-05-16: Added pointer to CERN's RPM repo for mitigation package<br />
+ 2013-05-16: Added pointer to Red Hat's Security Advisory<br />
+ 2013-05-16: Added pointer to Scientific Linux Advisory<br />
+ 2013-05-17: Added pointer to Scientific Linux/CERN Advisory<br />
+ 2013-05-17: Added pointer to CentOS Advisory<br />
+ 2013-05-20: Added pointer to SUSE Advisory, replaced recommendations with required actions<br />
<br />
<br />
Introduction<br />
============<br />
<br />
A recently-discovered vulnerability in the Linux kernel allows a local user<br />
to escalate their privilege level and gain root access. Working exploit code<br />
is publicly available.<br />
All relevant Linux distributions have already published an updated kernel<br />
which fixes this vulnerability.<br />
<br />
Details<br />
=======<br />
<br />
The performance measurement subsystem in the Linux kernel incorrectly casts a<br />
64-bit integer into a 32-bit integer which is subsequently used for array<br />
dereferencing. Providing carefully chosen integers as input allows arbitrary<br />
code to be executed.<br />
<br />
The erroneous code was introduced in kernel version 2.6.37 (commit<br />
b0a873ebbf87bf38bf70b5e39a7cadc96099fa13 on 2010-09-09) and was fixed in kernel<br />
version 3.8.9 (commit 8176cced706b5e5d15887584150764894e94e02f on 2013-04-15).<br />
Additionally, the vulnerable code was backported to 2.6.32 kernels by Red Hat.<br />
<br />
Working exploit code is publicly available. This code will not work on all<br />
vulnerable distributions; however, it appears to work on RHEL 6 and derived<br />
systems.<br />
<br />
The issue has been addressed by CentOS, Debian, Red Hat, Scientific Linux,<br />
Scientific Linux/CERN and Ubuntu. Because patches are widely available and<br />
EGI CSIRT has assessed this issue as CRITICAL, standard EGI procedures require<br />
sites to roll out the proper patches within 7 days. As widespread attacks<br />
exploiting this vulnerability are currently being carried out, sites should<br />
take action urgently.<br />
<br />
<br />
Risk Category<br />
=============<br />
<br />
This issue has been assessed as CRITICAL risk by the EGI CSIRT as a working<br />
exploit is publicly available.<br />
<br />
<br />
Affected Software<br />
=================<br />
<br />
+ Linux kernels 2.6.37 through 3.8.8 (inclusive).<br />
+ Linux kernels 2.6.32 with Red Hat backports.<br />
<br />
<br />
Mitigation<br />
==========<br />
<br />
The issue can be mitigated with the use of systemtap. CERN has<br />
provided an RPM package that implements systemtap mitigation as<br />
described below for kernel versions<br />
+ 2.6.32-358.0.1.el6<br />
+ 2.6.32-358.2.1.el6<br />
+ 2.6.32-358.6.1.el6.<br />
The RPM packages for 32-bit and 64-bit kernels are available at:<br />
+ 32-bit: http://linuxsoft.cern.ch/cern/updates/slc6X/i386/RPMS/cve_2013_2094-0.2-1.el6.i686.rpm<br />
+ 64-bit: http://linuxsoft.cern.ch/cern/updates/slc6X/x86_64/RPMS/cve_2013_2094-0.2-1.el6.x86_64.rpm<br />
To put the mitigation in place, download and install the proper<br />
package; this implementation is stable across reboots.<br />
<br />
To implement the mitigation manually, perform the following steps.<br />
<br />
1. Install the systemtap package (and its dependencies). In<br />
particular, the kernel-devel package matching the running kernel<br />
is required. If the matching version is not installed, systemtap<br />
will give an error message asking for the correct package to be<br />
installed. Furthermore, the debuginfo package is necessary.<br />
<br />
2. There are at least two ways systemtap can be used to address the<br />
issue. One of them, published by Red Hat, tries to maintain as<br />
many performance monitoring capabilities as possible, at the<br />
expense of more intricate compilation and deployment dependencies,<br />
as far as systemtap versions used are concerned. The other fix<br />
has been provided by Linköping University and disables performance<br />
monitoring altogether, but is more resilient.<br />
<br />
a) To use Red Hat's mitigation, create a file mitigation.stp<br />
containing the following (without the BEGIN/END marker lines):<br />
---BEGIN FILE---<br />
%{<br />
#include <linux/perf_event.h><br />
%}<br />
<br />
function sanitize_config:long (event:long) %{<br />
struct perf_event *event;<br />
<br />
#if STAP_COMPAT_VERSION >= STAP_VERSION(1,8)<br />
event = (struct perf_event *) STAP_ARG_event;<br />
#else<br />
event = (struct perf_event *) THIS->event;<br />
#endif<br />
event->attr.config &= INT_MAX;<br />
%}<br />
<br />
probe kernel.function("perf_swevent_init@kernel/events/core.c").call {<br />
sanitize_config($event);<br />
}<br />
---END FILE---<br />
<br />
b) To use LIU's mitigation, put the following into mitigation.stp:<br />
---BEGIN FILE---<br />
#!/usr/bin/stap<br />
<br />
# quick and ugly hack by cap@nsc.liu.se to block CVE-2013-2094<br />
# must run in guru mode (-g)<br />
# compile to .ko file: "stap -g -m perf_event_blocker perf_event_blocker.stp"<br />
# run on non build host using "staprun [-L] ./perf_event_blocker.ko"<br />
# requires build host and staprun to have identical kernel<br />
<br />
# screw up call by setting the attr_uptr pointer to null<br />
probe kernel.function("sys_perf_event_open")<br />
{<br />
printf("hit sys_perf_event_open, DENIED! %s\n", $$vars);<br />
$attr_uptr = 0<br />
}<br />
<br />
# print out return value to verify that the syscall was screwed up<br />
probe kernel.function("sys_perf_event_open").return<br />
{<br />
printf("returning from sys_perf_event_open with: %i\n", $return)<br />
}<br />
<br />
probe begin<br />
{<br />
printf("Guru mode sys_perf_event_open blocker active\n");<br />
}<br />
---END FILE---<br />
<br />
3. Compile into a .ko file with this command:<br />
stap -g -p4 -m mitigation mitigation.stp<br />
<br />
4. Load the systemtap module with this command:<br />
staprun -L ./mitigation.ko<br />
<br />
The .ko file may be distributed and used on all machines that run<br />
a kernel that is identical to the one on the host used to compile<br />
the .ko file.<br />
<br />
This fix is not persistent across reboots.<br />
<br />
In previous releases of this advisory, a sysctl-based mitigation was<br />
also suggested. This is no longer considered sufficient, as it only<br />
protects against a particular piece of exploit code, and this exploit<br />
can trivially be changed so that the mitigation no longer provides<br />
protection.<br />
<br />
<br />
Component Installation information<br />
==================================<br />
<br />
For many distributions, patched kernel packages are available. Refer to your<br />
distribution's information channels.<br />
<br />
<br />
Required Actions<br />
================<br />
<br />
All running EGI resources MUST be either patched or otherwise have a<br />
work-around in place by 2013-05-27 T21:00+01:00. EGI sites failing to act and/or <br />
failing to respond to requests from the EGI CSIRT team risk site suspension.<br />
<br />
References<br />
==========<br />
<br />
+ Mitre: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2094<br />
+ NIST NVD: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2013-2094<br />
+ OSS-Sec: http://marc.info/?s=CVE-2013-2094&l=oss-security<br />
+ CentOS: http://lists.centos.org/pipermail/centos-announce/2013-May/019733.html<br />
+ Debian: https://security-tracker.debian.org/tracker/CVE-2013-2094<br />
+ Red Hat: https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2013-2094<br />
+ RHSA: https://rhn.redhat.com/errata/RHSA-2013-0830.html<br />
+ SL: http://listserv.fnal.gov/scripts/wa.exe?A2=ind1305&L=scientific-linux-errata&T=0&P=1321<br />
+ SLC: http://linux.web.cern.ch/linux/updates/updates-slc6.shtml<br />
+ Ubuntu: http://people.canonical.com/~ubuntu-security/cve/CVE-2013-2094<br />
+ CERN 32-bit RPM: http://linuxsoft.cern.ch/cern/updates/slc6X/i386/RPMS/cve_2013_2094-0.2-1.el6.i686.rpm<br />
+ CERN 64-bit RPM: http://linuxsoft.cern.ch/cern/updates/slc6X/x86_64/RPMS/cve_2013_2094-0.2-1.el6.x86_64.rpm<br />
+ CERN SRPM: http://linuxsoft.cern.ch/cern/updates/slc6X/SRPMS/cve_2013_2094-0.2-1.el6.src.rpm<br />
+ LIU SystemTap mitigation: http://www.nsc.liu.se/~cap/perf_event_blocker.stp<br />
+ SUSE: http://support.novell.com/security/cve/CVE-2013-2094.html<br />
</pre></div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=41110Resource Centres OLA and Resource infrastructure Provider OLA reports2012-10-03T10:39:17Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}} {{TOC_right}} <br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete. <br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs). <br />
<br />
[[SAM Tests|SAM metric]] results are used for the calculation of Availability/Reliability. <br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics. <br />
<br />
= Performance reports =<br />
<br />
== Resource Centres ==<br />
<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE) <br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
! Availability/Reliability <br />
! Jan <br />
! Feb <br />
! Mar <br />
! Apr <br />
! May <br />
! Jun <br />
! Jul <br />
! Aug <br />
! Sep <br />
! Oct <br />
! Nov <br />
! Dec<br />
|-<br />
! 2010 <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/document/42 05/10] <br />
| [https://documents.egi.eu/document/96 06/10] <br />
| [https://documents.egi.eu/document/130 07/10] <br />
| [https://documents.egi.eu/document/157 08/10] <br />
| [https://documents.egi.eu/document/219 09/10] <br />
| [https://documents.egi.eu/document/238 10/10] <br />
| [https://documents.egi.eu/document/266 11/10] <br />
| [https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011 <br />
| [https://documents.egi.eu/document/332 01/11] <br />
| [https://documents.egi.eu/document/402 02/11] <br />
| [https://documents.egi.eu/document/465 03/11] <br />
| [https://documents.egi.eu/document/508 04/11] <br />
| [https://documents.egi.eu/document/593 05/11] <br />
| [https://documents.egi.eu/document/648 06/11] <br />
| [https://documents.egi.eu/document/716 07/11] <br />
| [https://documents.egi.eu/document/783 08/11] <br />
| [https://documents.egi.eu/document/820 09/11] <br />
| [https://documents.egi.eu/document/879 10/11] <br />
| [https://documents.egi.eu/document/905 11/11] <br />
| [https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012 <br />
| [https://documents.egi.eu/document/1000 01/12] <br />
| [https://documents.egi.eu/document/1033 02/12] <br />
| [https://documents.egi.eu/document/1091 03/12] <br />
| [https://documents.egi.eu/document/1117 04/12] <br />
| [https://documents.egi.eu/document/1174 05/12] <br />
| [https://documents.egi.eu/document/1251 06/12] <br />
| [https://documents.egi.eu/document/1307 07/12] <br />
| [https://documents.egi.eu/document/1332 08/12] <br />
| [https://documents.egi.eu/document/1370 09/12] <br />
| [ 10/12] <br />
| [ 11/12] <br />
| [ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''top-BDII Availability/Reliability''' <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=3&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1307&version=1&filename=EGI-core_services_availabilities-per_NGI-July2012.pdf 07/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1332&version=2&filename=EGI-core_services_availabilities-per_NGI-August2012.pdf 08/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1370&version=1&filename=EGI-core_services_availabilities-per_NGI-September2012.pdf 09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br> <br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' ticket/[https://documents.egi.eu/document/1089 Report] <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
<br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=82926 82926]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-05.pdf 05/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=84168 84168]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-06.pdf 06/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=85127 85127]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-07.pdf 07/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=86007 86007]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-08.pdf 08/12] <br />
<br />
| [GGUS]/ <br />
[09/12] <br />
<br />
| [GGUS]/ <br />
[10/12] <br />
<br />
| [GGUS]/ <br />
[11/12] <br />
<br />
| [GGUS]/ <br />
[12/12] <br />
<br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010) <br />
<br />
== Underperforming/Suspended RCs ==<br />
<br />
*List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
*List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable <!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
= Process for quality verification =<br />
<br />
*'''Generation of statistics'''<br />
<br />
Availability and reliability statistics are generated automatically in the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011). They are produced in PDF format using the profile and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. <br />
<br />
*'''Preliminary processing'''<br />
<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, the statistics are uploaded to the EGI document server. Links to the monthly statistics are provided on a regular basis on this wiki page. <br />
<br />
*'''Publication'''<br />
<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The whole comment-gathering process is handled through tickets. <br />
<br />
*'''Handling of sites below targets'''<br />
<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
#a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
#the explanation must be produced within 10 working days of the ticket being received by the site (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedure [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure]. <br />
#if the explanation is found satisfactory, the ticket is closed <br />
#conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure is followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
#the child ticket can then be closed <br />
#the parent ticket will be closed when all child tickets have been closed.<br />
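
The 10-working-day limit in the list above can be computed as follows. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, working days are taken to be Monday to Friday, and public holidays are only skipped if the caller passes them in explicitly, because the procedure itself does not define a holiday calendar.

```python
from datetime import date, timedelta


def add_working_days(start, working_days, holidays=frozenset()):
    """Return the date that lies `working_days` working days after `start`.

    Saturdays, Sundays and any date listed in `holidays` are not counted.
    """
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() is 0 for Monday and 6 for Sunday.
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


def explanation_deadline(ticket_received, holidays=frozenset()):
    """Deadline for the explanation: 10 working days after ticket receipt."""
    return add_working_days(ticket_received, 10, holidays)
```

For a ticket received on Monday 2012-10-01, <code>explanation_deadline(date(2012, 10, 1))</code> gives Monday 2012-10-15, since the two intervening weekends are not counted.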
<br />
*'''Handling of sites that are eligible for suspension'''<br />
<br />
For a site that is eligible for suspension: <br />
<br />
#a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]) <br />
#once the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
#in the case of NGI intervention, the site is not suspended if both COD and the COO accept the reasoning provided by the NGI <br />
#the child ticket closes either when the site is suspended or when suspension is canceled <br />
#the parent ticket will be closed when all child tickets have been closed<br />
<br />
*'''Wiki follow up page'''<br />
<br />
Sites that fail to explain why OLA targets were missed, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics] <br />
<br />
*'''Recomputation procedure'''<br />
<br />
Should there be doubts about the validity of Availability/Reliability reports, a RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10] <br />
<br />
= Known issues and recommendations to NGIs =<br />
<br />
#ACE, like Gridview in the past, calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore appears in the report, or uncertified, and therefore is excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
#The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a consequence, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose figures are lower than they should be. <br />
#Weighted availability is calculated by multiplying the number of logical CPUs a site published with the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
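The effect of a wrong HEPSPEC value can be illustrated with a minimal sketch. It assumes that the NGI figure is the weight-averaged availability of its sites, with each weight being logical CPUs multiplied by HEPSPEC as described above; the <tt>Site</tt> class, the function names and the numbers are hypothetical and are not taken from ACE.

```python
from dataclasses import dataclass


@dataclass
class Site:
    # Hypothetical record; field names are illustrative, not an ACE schema.
    name: str
    logical_cpus: int
    hepspec: float        # published HEPSPEC value
    availability: float   # monthly availability as a fraction in [0, 1]


def site_weight(site: Site) -> float:
    """Weight of a site: logical CPUs multiplied by the published HEPSPEC."""
    return site.logical_cpus * site.hepspec


def weighted_availability(sites: list[Site]) -> float:
    """Weight-averaged availability over all sites (assumed aggregation)."""
    total_weight = sum(site_weight(s) for s in sites)
    if total_weight == 0:
        raise ValueError("no site publishes a non-zero weight")
    return sum(site_weight(s) * s.availability for s in sites) / total_weight


correct = [
    Site("SITE-A", 100, 10.0, 0.90),
    Site("SITE-B", 100, 10.0, 0.50),
]
# Same sites, but SITE-B publishes a HEPSPEC value ten times too high.
mistaken = [
    Site("SITE-A", 100, 10.0, 0.90),
    Site("SITE-B", 100, 100.0, 0.50),
]

print(round(weighted_availability(correct), 4))   # 0.7
print(round(weighted_availability(mistaken), 4))  # 0.5364
```

In this made-up case a single inflated HEPSPEC value pulls the NGI figure from 0.70 down to about 0.54, because the poorly performing site dominates the average.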
<br />
= [[Documentation#OLAs|Operational Level Agreements]] =<br />
<br />
= Links =<br />
<br />
*Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
*NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation <br />
*[https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance] <br />
*Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--> <br />
<br />
{{Template:Creative_commons}} <br />
<br />
[[Category:Procedures]] [[Category:Service_Level_Management]]</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=41109Resource Centres OLA and Resource infrastructure Provider OLA reports2012-10-03T10:38:39Z<p>Fergadis: /* Resource Centres */</p>
<hr />
<div>{{Template:Op menubar}} {{TOC_right}} <br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete. <br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs). <br />
<br />
[[SAM Tests|SAM metric]] results are used for the calculation of Availability/Reliability. <br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics. <br />
<br />
= Performance reports =<br />
<br />
== Resource Centres ==<br />
<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE) <br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
! Availability/Reliability <br />
! Jan <br />
! Feb <br />
! Mar <br />
! Apr <br />
! May <br />
! Jun <br />
! Jul <br />
! Aug <br />
! Sep <br />
! Oct <br />
! Nov <br />
! Dec<br />
|-<br />
! 2010 <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/document/42 05/10] <br />
| [https://documents.egi.eu/document/96 06/10] <br />
| [https://documents.egi.eu/document/130 07/10] <br />
| [https://documents.egi.eu/document/157 08/10] <br />
| [https://documents.egi.eu/document/219 09/10] <br />
| [https://documents.egi.eu/document/238 10/10] <br />
| [https://documents.egi.eu/document/266 11/10] <br />
| [https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011 <br />
| [https://documents.egi.eu/document/332 01/11] <br />
| [https://documents.egi.eu/document/402 02/11] <br />
| [https://documents.egi.eu/document/465 03/11] <br />
| [https://documents.egi.eu/document/508 04/11] <br />
| [https://documents.egi.eu/document/593 05/11] <br />
| [https://documents.egi.eu/document/648 06/11] <br />
| [https://documents.egi.eu/document/716 07/11] <br />
| [https://documents.egi.eu/document/783 08/11] <br />
| [https://documents.egi.eu/document/820 09/11] <br />
| [https://documents.egi.eu/document/879 10/11] <br />
| [https://documents.egi.eu/document/905 11/11] <br />
| [https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012 <br />
| [https://documents.egi.eu/document/1000 01/12] <br />
| [https://documents.egi.eu/document/1033 02/12] <br />
| [https://documents.egi.eu/document/1091 03/12] <br />
| [https://documents.egi.eu/document/1117 04/12] <br />
| [https://documents.egi.eu/document/1174 05/12] <br />
| [https://documents.egi.eu/document/1251 06/12] <br />
| [https://documents.egi.eu/document/1307 07/12] <br />
| [https://documents.egi.eu/document/1332 08/12] <br />
| [https://documents.egi.eu/document/1370 09/12] <br />
| [ 10/12] <br />
| [ 11/12] <br />
| [ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''top-BDII Availability/Reliability''' <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=3&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1307&version=1&filename=EGI-core_services_availabilities-per_NGI-July2012.pdf 07/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1332&version=2&filename=EGI-core_services_availabilities-per_NGI-August2012.pdf 08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br> <br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' ticket/[https://documents.egi.eu/document/1089 Report] <br />
<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011''' <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| - <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
<br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=82926 82926]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-05.pdf 05/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=84168 84168]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-06.pdf 06/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=85127 85127]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-07.pdf 07/12] <br />
<br />
| [https://ggus.eu/ws/ticket_info.php?ticket=86007 86007]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-08.pdf 08/12] <br />
<br />
| [GGUS]/ <br />
[09/12] <br />
<br />
| [GGUS]/ <br />
[10/12] <br />
<br />
| [GGUS]/ <br />
[11/12] <br />
<br />
| [GGUS]/ <br />
[12/12] <br />
<br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010) <br />
<br />
== Underperforming/Suspended RCs ==<br />
<br />
*List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
*List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability followup procedure was not applicable <!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
= Process for quality verification =<br />
<br />
*'''Generation of statistics'''<br />
<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. <br />
<br />
*'''Preliminary processing'''<br />
<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page. <br />
<br />
*'''Publication'''<br />
<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for overseeing the statistics: it asks NGIs to follow up with sites that must provide comments because thresholds were not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The overall comments gathering process is handled through tickets. <br />
<br />
*'''Handling of sites below targets'''<br />
<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
#a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
#the explanation must be produced within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure]. <br />
#if the explanation is found satisfactory the ticket is closed <br />
#conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
#the child ticket can then be closed <br />
#the parent ticket will be closed when all child tickets have been closed.<br />
<br />
*'''Handling of sites that are eligible for suspension'''<br />
<br />
For a site that is eligible for suspension: <br />
<br />
#a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]) <br />
#after the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
#in the case of NGI intervention, the site is not suspended if both COD and the COO agree with the reasoning provided by the NGI <br />
#the child ticket closes either when the site is suspended or when suspension is canceled <br />
#the parent ticket will be closed when all child tickets have been closed<br />
<br />
*'''Wiki follow up page'''<br />
<br />
Sites that fail to provide explanations justifying the failure to meet OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics] <br />
<br />
*'''Recomputation procedure'''<br />
<br />
Should there be doubts about the validity of Availability/Reliability reports, a RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10] <br />
<br />
= Known issues and recommendations to NGIs =<br />
<br />
#ACE, like Gridview in the past, calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore appears in the report, or uncertified, and therefore is excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
#The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a consequence, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose figures are lower than they should be. <br />
#Weighted availability is calculated by multiplying the number of logical CPUs a site published with the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
<br />
= [[Documentation#OLAs|Operational Level Agreements]] =<br />
<br />
= Links =<br />
<br />
*Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
*NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation <br />
*[https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance] <br />
*Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--> <br />
<br />
{{Template:Creative_commons}} <br />
<br />
[[Category:Procedures]] [[Category:Service_Level_Management]]</div>Fergadishttps://wiki.egi.eu/w/index.php?title=PROC10_Recomputation_of_SAM_results_or_availability_reliability_statistics&diff=40926PROC10 Recomputation of SAM results or availability reliability statistics2012-09-27T09:44:55Z<p>Fergadis: /* Tips */</p>
<hr />
<div>{{Template:Op menubar}}<br />
{{Template:Doc_menubar}}<br />
[[Category:Procedures]]<br />
__TOC__<br />
<br />
= Procedure for the recomputation of SAM results and/or availability/reliability statistics=<br />
<br />
*'''Title''': Recomputation of SAM results and/or availability/reliability statistics<br />
*'''Document link''': https://wiki.egi.eu/wiki/PROC10<br />
*'''Last modified''': 03 May 2012 <br />
*'''Version''': 1.2<br />
*'''Policy Group Acronym''': OMB<br />
*'''Policy Group Name''': Operations Management Board<br />
*'''Contact Person''': George Fergadis/AUTH<br />
*'''Document Status''': APPROVED<br />
*'''Approved Date''': 26 March 2012<br />
*'''Procedure Statement''': This procedure documents the steps for requesting a correction in the SAM test results and in the related availability/reliability statistics.<br />
<br />
= Overview =<br />
This procedure documents the steps for requesting a correction in the OPS VO<br />
[[SAM_Instances|SAM test results]] and in the related [[Availability_and_reliability_monthly_statistics|availability/reliability statistics]] if applicable. <br />
<!--A recomputation of these statistics for the affected month is not needed if test results are notified and corrected before the statistics of that month are computed and distributed. Problems with the SAM results should be notified as soon as possible once detected, in order to allow sufficient time for fixing of these and thus to avoid that monthly availability/reliability statistics for the affected month have to be re-computed.--><br />
<br />
DISCLAIMER: This procedure is only applicable to EGI OPS test results. Procedures for the computation of VO-specific availability reports are VO-specific and are out of the scope of this document.<br />
<br />
= Who can submit a request? =<br />
Re-computations can be requested by:<br />
* site administrators<br />
* regional operations staff.<br />
<br />
= Re-computation policy =<br />
'''Starting from 01 May 2012, monitoring results can be recomputed only in the case of problems with the monitoring infrastructure itself. No re-computations will be performed in case of issues with the deployed middleware (e.g. documented bugs affecting the availability of a production service end-point); such issues will consequently be reflected in lower availability/reliability.'''<br />
<br />
Some examples of possible issues justifying a re-computation request:<br />
* invalid proxy certificate used for submitting the monitoring probes in a Nagios instance;<br />
* problems with the Storage Element used for replica management tests resulting in errors on CE's metrics.<br />
<br />
'''The deadline for requesting re-computations is 10 calendar days after the publication and announcement of the monthly Availability/Reliability reports for a given month X (typically the announcement will be distributed on the 1st day of month X+1).'''<br />
<br />
'''According to the re-computation requests received, A/R reports will be regenerated only once for each month, after the 10th of month X+1.'''<br />
<br />
= How to request a re-computation of OPS monitoring results =<br />
<br />
== The request is originated by a site ==<br />
STEP 1 (RC). As soon as the problem is detected, notify your NGI operations centre by opening a [http://helpdesk.egi.eu/ GGUS ticket]. Please address the ticket to your Operations Centre support unit, which is responsible for validating the request. <br />
In the GGUS ticket you must mention:<br />
# the starting and ending time of the problem (including day and hour in UTC)<br />
# the Site, NGI/federation of NGIs affected by the problem<br />
# the VO affected by the problem (must be the OPS VO)<br />
# a description of the problem<br />
<br />
STEP 2 (OC). The NGI operations centre validates the request.<br />
<br />
STEP 3 (OC). If the request is deemed valid, a GGUS ticket is sent to the [[GGUS:SLM-FAQ|Service Level Management]] (SLM) Support Unit. The SLM support team will take care of discussing all requests received with the SAM team.<br />
<br />
STEP 4 (SLM SU). The SLM SU is responsible for: <br />
# validating the reported problems <br />
# discussing the reported problems with the SAM Support Unit if needed<br />
# notifying the SAM SU about the requests received, through a new parent ticket submitted to SAM with the validated requests as child tickets<br />
<br />
STEP 5 (SAM SU). <br />
# The SAM Support Unit is responsible for checking the requests and for regenerating the results. For accepted requests, all Nagios metric results for any site and service are set to ''unknown'' status from the beginning of the hour reported in the starting time to one hour after the ending time, so that results that arrived late are also covered. The availability and reliability of other sites are not affected, as unknown periods are not considered in the computation.<br />
# New monthly availability statistics will be recomputed for that particular period, Site, NGI/federation of NGIs.<br />
# A new report will be made available 10 days after the first publication of the report.<br />
# After publication of the new report, all child GGUS tickets will be closed.<br />
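The masking described in STEP 5 can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the formula used here (uptime divided by the time whose status is known) is an assumed simplification of how unknown periods are left out, not the ACE implementation.

```python
from datetime import datetime, timedelta, timezone


def unknown_window(start: datetime, end: datetime) -> tuple[datetime, datetime]:
    """Period set to 'unknown': from the beginning of the hour of the
    reported start time to one hour after the reported end time (UTC)."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)
    if end_utc < start_utc:
        raise ValueError("end time precedes start time")
    window_start = start_utc.replace(minute=0, second=0, microsecond=0)
    return window_start, end_utc + timedelta(hours=1)


def availability(up_hours: float, total_hours: float, unknown_hours: float) -> float:
    """Assumed formula: uptime over the hours whose status is known."""
    known_hours = total_hours - unknown_hours
    if known_hours <= 0:
        raise ValueError("no known period to compute availability over")
    return up_hours / known_hours


# A reported problem from 16:25 to 18:10 UTC is masked from 16:00 to 19:10.
begin, finish = unknown_window(
    datetime(2012, 2, 12, 16, 25, tzinfo=timezone.utc),
    datetime(2012, 2, 12, 18, 10, tzinfo=timezone.utc),
)
print(begin.isoformat(), finish.isoformat())

# Made-up 720-hour month: 36 failed hours caused by the monitoring problem.
print(round(availability(684, 720, 0), 4))   # 0.95, failures counted
print(round(availability(684, 720, 36), 4))  # 1.0, failures masked as unknown
```

Under this assumption, hours marked ''unknown'' are removed from the denominator, which is why masking a monitoring problem raises the affected figure without touching periods that were measured correctly.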
<br />
STEP 6 (SLM SU).<br />
# The parent ticket is closed.<br />
<br />
== The request is originated by an NGI/EIRO operations centre ==<br />
Follow the procedure defined in the section above, starting from STEP 3.<br />
<br />
<!-- # '''STEP 3''': if the request for recomputation of the test results is accepted, the SAM Support Unit will be reponsible of fixing the results and of triggering a recomputation of the monthly availability statistics if necessary. The following these steps are followed:<br />
## All Nagios metric results for any site and service are set to ''unknown'' status from the beginning of the hour reported in the starting time to one hour after the ending time. This is to cover late results that could have arrived later.<br />
## Availability/reliability are then recomputed for that particular period, Site, NGI/federation of NGIs if necessary. As a consequence, the availability and reliability of other sites won't be affected, as unknown periods are not considered in the computation.<br />
# '''STEP 4''': in case new availability/reliability statistics are computed, when these are ready for distribution, the SAM/Nagios SU reassignes the ticket to the SLM Support Unit, in order to notify that a new set of reports can be re-distributed to EGI.--><br />
<br />
= External links =<br />
* [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy WLCG Availability re-computation policy]<br />
<br />
= Tips =<br />
* Date formats<br />
You can use the Unix <tt>date</tt> command to convert the start and end time from your time zone to <tt>UTC</tt> using the [http://en.wikipedia.org/wiki/ISO_8601 ISO 8601] format.<br />
<br />
''the start time must be rounded to the lower hour and the end time rounded to the higher hour''<br />
<br />
Example:<br />
# date --date="12 Feb 2012 17:00 CET" --utc --iso-8601=minutes
will give:<br />
2012-02-12T16:00+0000<br />
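Where the GNU <tt>date</tt> command is not available, the same conversion and rounding can be sketched with the Python standard library. The helper names below are hypothetical, and the fixed UTC+1 offset for CET is an assumption made for this example only.

```python
from datetime import datetime, timedelta, timezone


def floor_hour_utc(ts: datetime) -> datetime:
    """Convert to UTC and round down to the start of the hour."""
    return ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)


def ceil_hour_utc(ts: datetime) -> datetime:
    """Convert to UTC and round up to the next full hour (if not already on one)."""
    utc = ts.astimezone(timezone.utc)
    floored = floor_hour_utc(ts)
    return floored if utc == floored else floored + timedelta(hours=1)


CET = timezone(timedelta(hours=1))  # fixed offset, assumed for the example

start = datetime(2012, 2, 12, 17, 20, tzinfo=CET)
end = datetime(2012, 2, 12, 19, 5, tzinfo=CET)

print(floor_hour_utc(start).isoformat(timespec="minutes"))  # 2012-02-12T16:00+00:00
print(ceil_hour_utc(end).isoformat(timespec="minutes"))     # 2012-02-12T19:00+00:00
```

The start time is rounded down and the end time is rounded up, so the reported interval always covers the whole problem period.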
<br />
= Revision history =<br />
03/05/2012: updated policy and procedure to reflect the OMB decision of the March 2012 meeting<br />
17/01/2012: the text of the procedure is fixed to clarify that both RC administrators and regional operations staff can request a re-computation.<br />
<br />
16/01/2012: the text of the procedure is fixed to clarify that the recomputation of test results can be requested before the end of the affected month, in which case if sufficient time is allowed for fixing of the test results, no re-computation of availability/reliability statistics will be needed.<br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=40073Resource Centres OLA and Resource infrastructure Provider OLA reports2012-09-05T06:50:18Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[https://documents.egi.eu/document/1251 06/12]<br />
|[https://documents.egi.eu/document/1307 07/12]<br />
|[https://documents.egi.eu/document/1332 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=3&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1307&version=1&filename=EGI-core_services_availabilities-per_NGI-July2012.pdf 07/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1332&version=2&filename=EGI-core_services_availabilities-per_NGI-August2012.pdf 08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=82926 82926]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-05.pdf 05/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=84168 84168]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-06.pdf 06/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=85127 85127]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-07.pdf 07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability followup procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/].<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory, the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
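The 10-working-day deadline used in the steps above can be sketched as follows. This is only an illustration: the function name <code>add_working_days</code> is hypothetical, working days are assumed to be Monday to Friday, and public holidays are ignored, which the procedure itself does not specify.

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Return the date that lies `days` working days after `start`.

    Assumption: working days are Monday-Friday; public holidays are
    not taken into account.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            remaining -= 1
    return current


# Example: a ticket received on Monday 3 September 2012
deadline = add_working_days(date(2012, 9, 3), 10)
print(deadline.isoformat())
```

A ticket received on a Friday gets the following Monday as its first working day, so weekends never count towards the deadline.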
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# once the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], has passed, the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to provide an explanation for missing OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore shows up in the report, or uncertified, and therefore has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a result, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
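The effect of a wrong HEPSPEC value on the NGI figure can be illustrated with a small sketch. It assumes that the per-site weight is the product of logical CPUs and HEPSPEC, as described above, and that the NGI figure is the weight-averaged site availability. The averaging step, the <code>Site</code> class, and all function names are assumptions made for illustration, not the actual ACE implementation.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    logical_cpus: int    # number of logical CPUs published by the site
    hepspec: float       # published HEPSPEC value
    availability: float  # monthly availability as a fraction in [0, 1]


def site_weight(site: Site) -> float:
    # Weight as described in the text: logical CPUs times HEPSPEC.
    return site.logical_cpus * site.hepspec


def ngi_weighted_availability(sites: list[Site]) -> float:
    # Assumption: the NGI figure is the weighted mean of site availabilities.
    total_weight = sum(site_weight(s) for s in sites)
    if total_weight == 0:
        raise ValueError("no site publishes a non-zero weight")
    return sum(site_weight(s) * s.availability for s in sites) / total_weight


correct = [Site("A", 100, 10.0, 0.9), Site("B", 100, 10.0, 0.5)]
# Same sites, but B publishes a HEPSPEC value ten times too high by mistake.
mistaken = [Site("A", 100, 10.0, 0.9), Site("B", 100, 100.0, 0.5)]

print(round(ngi_weighted_availability(correct), 3))
print(round(ngi_weighted_availability(mistaken), 3))
```

With the mistaken value, the poorly performing site dominates the average and pulls the NGI figure down, which is why NGIs are asked to check the published numbers.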
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=40059Resource Centres OLA and Resource infrastructure Provider OLA reports2012-09-04T15:06:36Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[https://documents.egi.eu/document/1251 06/12]<br />
|[https://documents.egi.eu/document/1307 07/12]<br />
|[https://documents.egi.eu/document/1332 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=3&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1307&version=1&filename=EGI-core_services_availabilities-per_NGI-July2012.pdf 07/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1332&version=1&filename=EGI-core_services_availabilities-per_NGI-August2012-2.pdf 08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=82926 82926]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-05.pdf 05/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=84168 84168]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-06.pdf 06/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=85127 85127]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-07.pdf 07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability followup procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/].<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory, the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# once the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], has passed, the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
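The suspension decision described in the steps above can be summarised as a small decision function. This is one possible reading of the procedure, not an official rule set: the function <code>should_suspend</code> and its parameter names are hypothetical, and it assumes that an objection from the EGI Chief Operations Officer (COO) always blocks suspension.

```python
def should_suspend(
    ngi_intervened: bool,
    cod_accepts_reasoning: bool,
    coo_accepts_reasoning: bool,
    coo_objects: bool = False,
) -> bool:
    """Decide whether a site is suspended once the 10-day period has passed.

    Assumed reading of the procedure:
    - a COO objection prevents suspension;
    - if the NGI intervened, suspension is cancelled only when both
      the COD and the COO accept the NGI's reasoning;
    - otherwise the site is suspended.
    """
    if coo_objects:
        return False
    if ngi_intervened:
        return not (cod_accepts_reasoning and coo_accepts_reasoning)
    return True


# No reaction from the NGI: the site is suspended.
print(should_suspend(False, False, False))
# NGI intervened and both COD and COO agree: no suspension.
print(should_suspend(True, True, True))
```

In either outcome the child ticket is then closed, as stated in the list above.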
<br />
* '''Wiki follow up page'''<br />
Sites that fail to provide an explanation for missing OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore shows up in the report, or uncertified, and therefore has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a result, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=40057Resource Centres OLA and Resource infrastructure Provider OLA reports2012-09-04T15:05:18Z<p>Fergadis: /* Resource Centres */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[https://documents.egi.eu/document/1251 06/12]<br />
|[https://documents.egi.eu/document/1307 07/12]<br />
|[https://documents.egi.eu/document/1332 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=3&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1307&version=1&filename=EGI-core_services_availabilities-per_NGI-July2012.pdf 07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=82926 82926]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-05.pdf 05/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=84168 84168]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-06.pdf 06/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=85127 85127]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-07.pdf 07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/].<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, the statistics are uploaded to the EGI document server. Links to the monthly statistics are added to this wiki page on a regular basis.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to obtain comments from sites that did not meet the thresholds, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedure [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory, the ticket is closed <br />
# conversely, if the explanation is not given in due time, or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
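<br />
Both procedures above depend on a deadline of 10 working days counted from the ticket date. The following is a minimal sketch of how such a deadline could be computed. It assumes that "working days" means Monday to Friday and ignores public holidays; the function name <tt>add_working_days</tt> is hypothetical and is not part of any EGI or GGUS tool.<br />

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date that lies `working_days` working days after `start`.

    Assumption: working days are Monday to Friday; public holidays are
    not taken into account.
    """
    if working_days < 0:
        raise ValueError("working_days must not be negative")
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() returns 0 for Monday ... 6 for Sunday
        if current.weekday() < 5:
            remaining -= 1
    return current


# Example: a child ticket received on Monday 2 July 2012
deadline = add_working_days(date(2012, 7, 2), 10)
print(deadline.isoformat())
```

An NGI that observes national holidays would need to extend the weekday check accordingly.<br />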
<br />
* '''Wiki follow up page'''<br />
Sites that fail to provide explanations justifying the failure to meet OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. While processing the data to generate the availability/reliability report, ACE takes into account the certification status of the site at that moment to decide whether the site is certified, and therefore shows up in the report, or is uncertified and has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as an explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview had included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. The implication is that any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites that have lower numbers than they should.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, by mistake), the overall NGI weighted availability will be affected.<br />
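<br />
The effect of a wrong HEPSPEC value can be illustrated with a small sketch. It assumes that the NGI figure is a weighted mean of the site availabilities, with each site weighted by its number of logical CPUs multiplied by its published HEPSPEC value; the exact aggregation used by ACE is not described on this page. The names <tt>Site</tt> and <tt>weighted_availability</tt> and all numbers are hypothetical.<br />

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str
    logical_cpus: int    # number of logical CPUs published by the site
    hepspec: float       # published HEPSPEC value (assumed to be per logical CPU)
    availability: float  # monthly availability as a fraction between 0 and 1


def weighted_availability(sites: List[Site]) -> float:
    """Weighted mean of site availabilities (assumed aggregation).

    Each site is weighted by logical_cpus * hepspec.
    """
    total_weight = sum(s.logical_cpus * s.hepspec for s in sites)
    if total_weight == 0:
        raise ValueError("total weight is zero; check the published values")
    weighted_sum = sum(s.logical_cpus * s.hepspec * s.availability for s in sites)
    return weighted_sum / total_weight


# Two sites of equal size: the NGI figure is the plain average.
correct = [Site("SITE-A", 100, 10.0, 0.9), Site("SITE-B", 100, 10.0, 0.5)]
print(round(weighted_availability(correct), 3))

# SITE-B publishes a HEPSPEC value ten times too high by mistake:
# its low availability now dominates the NGI figure.
mistaken = [Site("SITE-A", 100, 10.0, 0.9), Site("SITE-B", 100, 100.0, 0.5)]
print(round(weighted_availability(mistaken), 3))
```

Under this assumption a single mis-published value shifts the figure for the whole NGI, which is why the published numbers should be checked.<br />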
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR9&diff=38268EGI-InSPIRE:SA1.8-QR92012-07-12T11:41:13Z<p>Fergadis: /* 4. Plans for the next period */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 25%" | Date (dd/mm/yyyy)<br />
! style="width: 25%" | Url Indico Agenda<br />
! style="width: 10%" | Title<br />
! style="width: 10%" | Outcome<br />
|-<br />
|dd/mm/yyyy<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
* Continue publishing of A/R league tables - OPS<br />
<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII) <br />
<br />
* The [https://documents.egi.eu/document/1057 MS418] "Operational Level Agreements (OLAs) within the EGI production infrastructure" has been finalized.<br />
<br />
* A new [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy re-computation policy] (approved at the [https://indico.egi.eu/indico/conferenceDisplay.py?confId=800 OMB 2012-03-26]) has been in effect since 01 May 2012.<br />
<br />
* The [https://documents.egi.eu/document/415 EGI overall availability] file has been enhanced to include the number of sites that are above the [[SLM/RC_Service_Levels|A/R thresholds]].<br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
* Operation of WMS, LB and Top-BDII for site certification<br />
<br />
* Operation of certification Catch-All portal<br />
<br />
=== dteam VO ===<br />
<br />
* Operation of VOMS/VORMS for dteam VO.<br />
<br />
* The <b><tt>/dteam/uki</tt></b> group has been [https://ggus.eu/ws/ticket_info.php?ticket=81841 removed].<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|Issue description || Issue mitigation<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
<br />
* Finalize the EGI.eu OLA.<br />
<br />
* Continue handling the validation and distribution of the monthly A/R reports and the maintenance of the relevant wiki [[Availability_and_reliability_monthly_statistics|page]].</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR9&diff=38267EGI-InSPIRE:SA1.8-QR92012-07-12T11:39:13Z<p>Fergadis: </p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 25%" | Date (dd/mm/yyyy)<br />
! style="width: 25%" | Url Indico Agenda<br />
! style="width: 10%" | Title<br />
! style="width: 10%" | Outcome<br />
|-<br />
|dd/mm/yyyy<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
* Continue publishing of A/R league tables - OPS<br />
<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII) <br />
<br />
* The [https://documents.egi.eu/document/1057 MS418] "Operational Level Agreements (OLAs) within the EGI production infrastructure" has been finalized.<br />
<br />
* A new [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy re-computation policy] (approved at the [https://indico.egi.eu/indico/conferenceDisplay.py?confId=800 OMB 2012-03-26]) has been in effect since 01 May 2012.<br />
<br />
* The [https://documents.egi.eu/document/415 EGI overall availability] file has been enhanced to include the number of sites that are above the [[SLM/RC_Service_Levels|A/R thresholds]].<br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
* Operation of WMS, LB and Top-BDII for site certification<br />
<br />
* Operation of certification Catch-All portal<br />
<br />
=== dteam VO ===<br />
<br />
* Operation of VOMS/VORMS for dteam VO.<br />
<br />
* The <b><tt>/dteam/uki</tt></b> group has been [https://ggus.eu/ws/ticket_info.php?ticket=81841 removed].<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|Issue description || Issue mitigation<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --></div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR9&diff=38266EGI-InSPIRE:SA1.8-QR92012-07-12T11:38:29Z<p>Fergadis: /* dteam VO */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 25%" | Date (dd/mm/yyyy)<br />
! style="width: 25%" | Url Indico Agenda<br />
! style="width: 10%" | Title<br />
! style="width: 10%" | Outcome<br />
|-<br />
|dd/mm/yyyy<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
* Continue publishing of A/R league tables - OPS<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII) <br />
* The [https://documents.egi.eu/document/1057 MS418] "Operational Level Agreements (OLAs) within the EGI production infrastructure" has been finalized. <br />
* A new [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy re-computation policy] (approved at the [https://indico.egi.eu/indico/conferenceDisplay.py?confId=800 OMB 2012-03-26]) has been in effect since 01 May 2012.<br />
* The [https://documents.egi.eu/document/415 EGI overall availability] file has been enhanced to include the number of sites that are above the [[SLM/RC_Service_Levels|A/R thresholds]].<br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
* Operation of WMS, LB and Top-BDII for site certification<br />
<br />
* Operation of certification Catch-All portal<br />
<br />
=== dteam VO ===<br />
<br />
* Operation of VOMS/VORMS for dteam VO.<br />
<br />
* The <b><tt>/dteam/uki</tt></b> group has been [https://ggus.eu/ws/ticket_info.php?ticket=81841 removed].<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|Issue description || Issue mitigation<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --></div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR9&diff=38265EGI-InSPIRE:SA1.8-QR92012-07-12T11:31:09Z<p>Fergadis: /* 2. Main Achievements */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 25%" | Date (dd/mm/yyyy)<br />
! style="width: 25%" | Url Indico Agenda<br />
! style="width: 10%" | Title<br />
! style="width: 10%" | Outcome<br />
|-<br />
|dd/mm/yyyy<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
* Continue publishing of A/R league tables - OPS<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII) <br />
* The [https://documents.egi.eu/document/1057 MS418] "Operational Level Agreements (OLAs) within the EGI production infrastructure" has been finalized. <br />
* A new [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy re-computation policy] (approved at the [https://indico.egi.eu/indico/conferenceDisplay.py?confId=800 OMB 2012-03-26]) has been in effect since 01 May 2012.<br />
* The [https://documents.egi.eu/document/415 EGI overall availability] file has been enhanced to include the number of sites that are above the [[SLM/RC_Service_Levels|A/R thresholds]].<br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
* Operation of WMS, LB and Top-BDII for site certification<br />
<br />
* Operation of certification Catch-All portal<br />
<br />
=== dteam VO ===<br />
<br />
* Operation of VOMS/VORMS for dteam VO.<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|Issue description || Issue mitigation<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --></div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR9&diff=38264EGI-InSPIRE:SA1.8-QR92012-07-12T11:06:37Z<p>Fergadis: </p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 25%" | Date (dd/mm/yyyy)<br />
! style="width: 25%" | Url Indico Agenda<br />
! style="width: 10%" | Title<br />
! style="width: 10%" | Outcome<br />
|-<br />
|dd/mm/yyyy<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
* Continue publishing of A/R league tables - OPS<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII) <br />
* The [https://documents.egi.eu/document/1057 MS418] "Operational Level Agreements (OLAs) within the EGI production infrastructure" has been finalized. <br />
* A new re-computation policy (approved at the [https://indico.egi.eu/indico/conferenceDisplay.py?confId=800 OMB 2012-03-26]) has been in effect since 01 May 2012.<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|Issue description || Issue mitigation<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --></div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=38030Resource Centres OLA and Resource infrastructure Provider OLA reports2012-07-06T07:01:44Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[https://documents.egi.eu/document/1251 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1251&version=2&filename=EGI-core_services_availabilities-per_NGI-June2012-1.pdf 06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are generated automatically in the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile, produced in PDF format, and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/].<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, the statistics are uploaded to the EGI document server. Links to the monthly statistics are added regularly to this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for explanation to be given <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
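<br />
The 10-working-day windows mentioned in the two lists above can be estimated with a short script. The following Python sketch is illustrative only and is not part of any EGI or COD tool: the function name <code>add_working_days</code> is hypothetical, and it assumes that working days are Monday to Friday, with no public holidays taken into account.<br />

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Return the date reached after `days` working days following `start`.

    Assumption (not from the procedure text): working days are Monday to
    Friday and public holidays are ignored.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() returns 0..4 for Monday..Friday and 5..6 for the weekend
        if current.weekday() < 5:
            remaining -= 1
    return current


# Hypothetical example: a child ticket received on Thursday 5 July 2012
deadline = add_working_days(date(2012, 7, 5), 10)
print(deadline.isoformat())
```

A real deadline may differ where local holidays apply, so the result should be treated as an estimate.<br />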
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were not met, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore shown in the report, or uncertified, and therefore excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. The implication is that any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site published with the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, because of a mistake), the overall NGI weighted availability will be affected.<br />
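<br />
To illustrate item 3 above, the following Python sketch shows how a wrongly published HEPSPEC value can shift an NGI-level figure. It is a minimal example under stated assumptions, not the ACE implementation: the site weight is taken as logical CPUs multiplied by HEPSPEC, as described above, while the aggregation into an NGI figure as a weight-normalised average of site availabilities is an assumption made here for illustration. The names <code>Site</code>, <code>site_weight</code> and <code>weighted_availability</code>, and all numbers, are hypothetical.<br />

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    logical_cpus: int    # number of logical CPUs published by the site
    hepspec: float       # published HEPSPEC value
    availability: float  # monthly availability as a fraction between 0 and 1


def site_weight(site: Site) -> float:
    """Weight of a site: logical CPUs multiplied by the published HEPSPEC."""
    return site.logical_cpus * site.hepspec


def weighted_availability(sites: list[Site]) -> float:
    """Weight-normalised average availability (assumed aggregation rule)."""
    total_weight = sum(site_weight(s) for s in sites)
    if total_weight == 0:
        raise ValueError("total weight is zero; check the published values")
    return sum(site_weight(s) * s.availability for s in sites) / total_weight


# Hypothetical NGI with two sites publishing correct values
correct = [
    Site("SITE-A", logical_cpus=100, hepspec=10.0, availability=0.9),
    Site("SITE-B", logical_cpus=300, hepspec=10.0, availability=0.5),
]

# Same NGI, but SITE-B publishes a HEPSPEC value ten times too high by mistake
mistaken = [
    Site("SITE-A", logical_cpus=100, hepspec=10.0, availability=0.9),
    Site("SITE-B", logical_cpus=300, hepspec=100.0, availability=0.5),
]

print(round(weighted_availability(correct), 4))
print(round(weighted_availability(mistaken), 4))
```

In this made-up case the inflated HEPSPEC value gives the poorly performing site more weight and pulls the NGI figure down, which is why NGIs are asked to check the published numbers.<br />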
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=38023Resource Centres OLA and Resource infrastructure Provider OLA reports2012-07-05T15:19:36Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[https://documents.egi.eu/document/1251 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1251&version=1&filename=EGI-core_services_availabilities-per_NGI-June2012.pdf 06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are generated automatically in the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile, produced in PDF format, and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/].<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, the statistics are uploaded to the EGI document server. Links to the monthly statistics are added regularly to this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for explanation to be given <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were not met, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore shown in the report, or uncertified, and therefore excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. The implication is that any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site published with the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, because of a mistake), the overall NGI weighted availability will be affected.<br />
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=38020Resource Centres OLA and Resource infrastructure Provider OLA reports2012-07-05T15:18:42Z<p>Fergadis: /* Resource Centres */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[https://documents.egi.eu/document/1251 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts with a ticket filed to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for explanation to be given <br />
# the explanation must be produced within 10 working days of the site receiving the ticket (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedure [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
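Both procedures above give the site or NGI 10 working days to react. As a minimal sketch of how such a deadline could be computed, the hypothetical helper <code>add_working_days</code> below counts Monday to Friday only. It assumes that counting starts on the day after the ticket is received and that public holidays are ignored; neither assumption is stated by the procedure itself.<br />

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date that lies `working_days` working days after `start`.

    Assumptions (not taken from the procedure): only Monday-Friday count,
    public holidays are ignored, and the start day itself is not counted.
    """
    if working_days < 0:
        raise ValueError("working_days must be non-negative")
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


if __name__ == "__main__":
    # Ticket received on Friday 1 June 2012
    print(add_working_days(date(2012, 6, 1), 10))
```

A real deployment would also need a holiday calendar for the NGI concerned, which this sketch deliberately leaves out.<br />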
<br />
* '''Wiki follow up page'''<br />
Sites that fail to provide an explanation for missing the OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore appears in the report, or uncertified, and therefore has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as an explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''Since December 2010, Gridview has included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. The implication is that any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
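The effect of a wrong HEPSPEC value can be illustrated with a small sketch. It assumes that the NGI figure is a weighted mean of the site availabilities, with each site's weight equal to its number of logical CPUs multiplied by its published HEPSPEC value. The weighted-mean formula, the <code>Site</code> class and the function name are illustrative assumptions, not the ACE implementation.<br />

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Site:
    name: str
    availability: float  # fraction between 0 and 1
    logical_cpus: int    # logical CPUs published by the site
    hepspec: float       # published HEPSPEC value


def weighted_availability(sites: Iterable[Site]) -> float:
    """Weighted mean of site availabilities (assumed formula).

    Each site's weight is logical_cpus * hepspec, as described in the text.
    """
    sites = list(sites)
    weights = [s.logical_cpus * s.hepspec for s in sites]
    total = sum(weights)
    if total == 0:
        raise ValueError("total weight is zero; check published values")
    return sum(w * s.availability for w, s in zip(weights, sites)) / total


if __name__ == "__main__":
    good = Site("site-a", availability=0.9, logical_cpus=100, hepspec=10.0)
    poor = Site("site-b", availability=0.5, logical_cpus=100, hepspec=10.0)
    print(weighted_availability([good, poor]))

    # Same sites, but site-b publishes a HEPSPEC ten times too high by mistake
    poor_wrong = Site("site-b", availability=0.5, logical_cpus=100, hepspec=100.0)
    print(weighted_availability([good, poor_wrong]))
```

In this sketch a single mistyped HEPSPEC value lets the poorly performing site dominate the NGI figure, which is why the published numbers should be checked.<br />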
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=37375Resource Centres OLA and Resource infrastructure Provider OLA reports2012-06-05T12:05:43Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=3&filename=EGI-core_services_availabilities-per_NGI-May2012-1.pdf 05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts with a ticket filed to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for explanation to be given <br />
# the explanation must be produced within 10 working days of the site receiving the ticket (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedure [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to provide an explanation for missing the OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore appears in the report, or uncertified, and therefore has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as an explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''Since December 2010, Gridview has included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. The implication is that any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Top-BDII_list_for_NGI&diff=37276Top-BDII list for NGI2012-06-01T15:01:44Z<p>Fergadis: /* Top-BDIIs operated by the NGIs */</p>
<hr />
<div>{{Template:Op menubar}} <br />
__TOC__<br />
= top-BDII Availability and Reliability =<br />
This page contains the list of the Top-BDII instances that are operated by the NGIs. Starting from October 2011, these instances will be considered for the generation of the '''monthly NGI Availability and Reliability report'''. <br />
<br />
Availability and Reliability figures are extracted from [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] (ROC profile, OPS VO).<br />
<br />
'''Notes''': <br />
<br />
* if you think that a wrong top-BDII entry is mentioned in the table below, please send a GGUS ticket to the Operations support unit indicating the entry to replace and the new service end-point.<br />
<br />
* the same top-BDII instance can be reported for more than one NGI. This happens when two or more NGIs share the service, typically because they are operated by the same Operations Centre.<br />
<br />
* several top-BDII instances can be reported for one NGI. This is the case when an NGI deploys client-side failover using multiple top-BDII instances as alternative contact points ([[MAN05]]). In this case the overall hourly top-BDII availability of the NGI is calculated by OR-ing the hourly availability of the individual top-BDIIs.<br />
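As a minimal sketch of the OR-ing mentioned in the last note, the hypothetical function below combines per-instance hourly availability series by taking the maximum value for each hour, which is equivalent to a logical OR when the hourly values are 0 (down) or 1 (up). The host names and the input layout are assumptions made for illustration; real values would come from MyEGI.<br />

```python
from typing import Dict, List


def combine_hourly_availability(per_instance: Dict[str, List[float]]) -> List[float]:
    """Combine hourly availability series of several top-BDII instances.

    For each hour the NGI value is the maximum over all instances, so the
    NGI counts as available if at least one instance was available.
    """
    series = list(per_instance.values())
    if not series:
        raise ValueError("at least one top-BDII instance is required")
    length = len(series[0])
    if any(len(s) != length for s in series):
        raise ValueError("all instances must cover the same hours")
    return [max(values) for values in zip(*series)]


if __name__ == "__main__":
    # Hypothetical hosts and four hours of data (1 = available, 0 = not)
    hourly = {
        "bdii-a.example.org": [1, 0, 0, 1],
        "bdii-b.example.org": [0, 0, 1, 1],
    }
    print(combine_hourly_availability(hourly))
```

With these inputs only the second hour, in which both instances were unavailable, counts against the NGI.<br />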
<br />
= Top-BDIIs operated by the NGIs =<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
! Operations Centre <br />
! Top-BDII host(s) <br />
! Notes<br />
|-<br />
| AsiaPacific <br />
| bdii.grid.sinica.edu.tw <br />
| <br><br />
|-<br />
| CERN <br />
| lcg-bdii.cern.ch <br />
| <br><br />
|-<br />
| Serbia <br />
| bdii.ipb.ac.rs <br />
| <br><br />
|-<br />
| Albania <br />
| <br />
| <br><br />
|-<br />
| Armenia <br />
| bdii.grid.am <br />
| <br><br />
|-<br />
| Austria<br />
| egee-bdii.cnaf.infn.it <br />
| Austrian sites are collaborating with the NGI_IT Operations Center<br />
|-<br />
| Bosnia Herzegovina<br />
| c15.grid.etfbl.net <br />
| <br><br />
|-<br />
| Bulgaria <br />
| bdii.ipp.acad.bg <br />
| <br><br />
|-<br />
| Belarus <br />
| topbdii.glite.basnet.by <br />
| <br><br />
|-<br />
| Switzerland<br />
| bdii-fzk.gridka.de <br />
| <br><br />
|-<br />
| Cyprus<br />
| bdii101.grid.ucy.ac.cy <br />
| <br><br />
|-<br />
| Czech Republic <br />
| bdii1rr.farm.particle.cz<br />
| DNS alias for two instances running at different sites.<br />
|-<br />
| Germany<br />
| bdii-fzk.gridka.de <br />
| <br><br />
|-<br />
| Denmark <br />
| <br> <br />
| <br><br />
|-<br />
| Spain<br />
| topbdii.egi.cesga.es, gridii01.ifca.es, bdii.pic.es, topbdii01.ncg.ingrid.pt<br />
| Spanish sites also point to the Portuguese TopBDII<br />
|-<br />
| Finland <br />
| <br />
| No Top-BDII currently in use<br />
|-<br />
| France <br />
| cclcgtopbdii01.in2p3.fr,topbdii.grif.fr <br />
| <br><br />
|-<br />
| Georgia <br />
| s2.ngi.grena.ge <br />
| <br><br />
|-<br />
| Greece <br />
| bdii.marie.hellasgrid.gr,bdii.ariagni.hellasgrid.gr,bdii.athena.hellasgrid.gr,bdii.isabella.grnet.gr <br />
| <br><br />
|-<br />
| Croatia <br />
| bdii-egee.srce.hr <br />
| <br><br />
|-<br />
| Hungary<br />
| grid152.kfki.hu <br />
| <br><br />
|-<br />
| Ireland<br />
| <br />
| All sites use the CERN top-bdii<br />
|-<br />
| Israel<br />
| wipp-bdii.weizmann.ac.il <br />
| <br><br />
|-<br />
| Italy <br />
| egee-bdii.cnaf.infn.it <br />
| Five instances under DNS round robin<br />
|-<br />
| Lithuania <br />
| bdii.mif.vu.lt <br />
| <br><br />
|-<br />
| Latvia<br />
| bdii.grid.etf.rtu.lv <br />
| <br><br />
|-<br />
| FYR Macedonia <br />
| grid-bdii.ii.edu.mk,bdii.hpgcc.finki.ukim.mk <br />
| <br><br />
|-<br />
| Moldova <br />
| node06-02.imi.renam.md<br />
| <br><br />
|-<br />
| Montenegro<br />
| bdii.grid.ac.me <br />
| <br><br />
|-<br />
| Netherlands<br />
| kraal.nikhef.nl,bdii03.nikhef.nl,bdii.grid.sara.nl,bdii2.grid.sara.nl <br />
| <br><br />
|-<br />
| Norway<br />
| <br />
| No Top-BDII currently in use<br />
|-<br />
| Poland <br />
| bdii-top.reef.man.poznan.pl, zeus60.cyf-kr.edu.pl, topbdii.polgrid.pl <br />
| All 3 working under DNS pool bdii.cyf-kr.edu.pl<br />
|-<br />
| Portugal <br />
| topbdii01.ncg.ingrid.pt, topbdii.egi.cesga.es, gridii01.ifca.es, bdii.pic.es <br />
| Portuguese sites also point to the Spanish TopBDIIs<br />
|-<br />
| Romania<br />
| bdii.grid.ici.ro, bdii.nipne.ro <br />
| <br><br />
|-<br />
| Slovenia <br />
| bdii.sling.si <br />
| <br><br />
|-<br />
| Slovakia<br />
| bdii.ui.savba.sk,ii.grid.tuke.sk<br />
| <br><br />
|-<br />
| Sweden <br />
| <br />
| <br><br />
|-<br />
| Turkey <br />
| bdii.ulakbim.gov.tr <br />
| <br><br />
|-<br />
| UK <br />
| lcgbdii.gridpp.rl.ac.uk,topbdii.grid.hep.ph.ic.ac.uk<br />
| <br><br />
|-<br />
| SAGrid <br />
| srvslngrd001.uct.ac.za <br />
| <br><br />
|-<br />
| IGALC <br />
| is.igalc.org <br />
| <br><br />
|-<br />
| ROC_LA <br />
| top-bdii.roc-la.org <br />
| <br><br />
|-<br />
| Russia <br />
| lcg15.sinp.msu.ru <br />
| <br><br />
|}<br />
<br />
=Algorithm =<br />
top-BDII availability/reliability statistics are based on availability/reliability data that is published by MyEGI through the SAM Programmatic Interface (SAM PI). [https://grid-monitoring.cern.ch/myegi/sa/ MyEGI] is the authoritative source of availability/reliability data (GridView is now decommissioned).<br />
<br />
A brief description of the algorithm adopted to compute top-BDII availability and reliability statistics follows.<br />
<br />
Two scenarios are possible.<br />
<br />
== Scenario 1. Single top-BDII ==<br />
The NGI reports only one Top-BDII. It could be a single top-bdii instance, or alternatively a DNS alias for a pool of top-BDII instances (the DNS alias MUST BE monitored by the NGI Nagios). In this case Availability and Reliability are the ''monthly'' Availability and Reliability reported directly by the MyEGI Programmatic Interface.<br />
<br />
'''Note''': If the alias is used by the sites but only the individual instances in the pool are monitored by Nagios, then the table above must list those instances for the NGI. In this case Availability and Reliability are <br />
computed according to Scenario 2.<br />
<br />
== Scenario 2. List of top-BDIIs ==<br />
The NGI reports a list of top-BDIIs. These are all used by the sites to configure failover at client side. <br />
In this case the algorithm steps are the following:<br />
#Query MyEGI to get the '''hourly''' A/R values for every top-BDII instance in the list. ([[SAM_PI_examples|Example of query]])<br />
#Hourly Availability and Reliability: for each hour, the maximum Availability and the maximum Reliability across all top-BDII instances in the list are taken as the hourly Availability/Reliability. ([[SAM_PI_examples|Example of query]])<br />
#Monthly Availability and Reliability: the monthly Availability/Reliability is the arithmetic mean of the hourly Availability/Reliability figures in the reference month:<br />
::'''Monthly A/R''': (sum of the '''Hourly A/R''' figures)/(number of hours when the status was known).<br />
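<br />
The Scenario 2 steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the hourly values are assumed to have already been retrieved from MyEGI and arranged as one equally long list per top-BDII instance, with <code>None</code> marking an hour whose status was unknown. The function names and the sample hostnames are hypothetical and are not part of the SAM PI.<br />

```python
from typing import Dict, List, Optional

Hourly = List[Optional[float]]  # one value per hour; None = status unknown


def hourly_best(per_instance: Dict[str, Hourly]) -> Hourly:
    """Step 2: for each hour, keep the maximum value across all instances.

    Assumes every instance list covers the same hours in the same order.
    An hour is unknown (None) only if it is unknown for every instance.
    """
    result: Hourly = []
    for values in zip(*per_instance.values()):
        known = [v for v in values if v is not None]
        result.append(max(known) if known else None)
    return result


def monthly_figure(hourly: Hourly) -> Optional[float]:
    """Step 3: arithmetic mean over the hours whose status was known."""
    known = [v for v in hourly if v is not None]
    return sum(known) / len(known) if known else None


# Hypothetical four-hour sample for two instances (values are fractions 0..1).
sample = {
    "bdii-a.example.org": [1.0, 0.0, None, 0.5],
    "bdii-b.example.org": [0.0, 0.0, None, 1.0],
}
best = hourly_best(sample)      # [1.0, 0.0, None, 1.0]
monthly = monthly_figure(best)  # 2.0 / 3 known hours
print(best, monthly)
```

The same two functions apply unchanged to Availability and to Reliability, since both are handled identically in the steps above.<br />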
[[Category:Service_Level_Management]]</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=37274Resource Centres OLA and Resource infrastructure Provider OLA reports2012-06-01T14:22:42Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=1&filename=EGI-core_services_availabilities-per_NGI-May2012%20NGIs%20core%20services-v1.1.pdf 05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=80841 80841]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=81998 81998]/ <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-04.pdf 04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability followup procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The overall comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for explanation to be given <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or the explanation is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# once the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], has passed, the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were not met, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, a RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the Certification status of the site at that moment to decide whether the site is certified and therefore appears in the report, or is uncertified and has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the Certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview had included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. The implication is that any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by its published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
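<br />
To illustrate why a wrong HEPSPEC value matters, the following Python sketch computes an NGI-level figure from per-site data. It rests on an assumption that this page does not spell out: the NGI figure is taken to be the mean of the site availabilities weighted by (logical CPUs × HEPSPEC). The function name, the sample numbers, and that aggregation formula are illustrative only and are not taken from ACE.<br />

```python
from typing import List, Tuple

# (logical CPUs, published HEPSPEC value, site availability as a fraction 0..1)
Site = Tuple[int, float, float]


def ngi_weighted_availability(sites: List[Site]) -> float:
    """Weighted mean of site availabilities.

    Assumption: each site's weight is logical CPUs * HEPSPEC, and the NGI
    figure is sum(weight * availability) / sum(weight).
    """
    total_weight = sum(cpus * hepspec for cpus, hepspec, _ in sites)
    if total_weight == 0:
        raise ValueError("no capacity published by any site")
    weighted_sum = sum(cpus * hepspec * avail for cpus, hepspec, avail in sites)
    return weighted_sum / total_weight


# Hypothetical NGI with two sites.
correct = [(100, 10.0, 0.9), (50, 8.0, 0.5)]
# Same NGI, but the second site publishes HEPSPEC 80 instead of 8 by mistake.
mistaken = [(100, 10.0, 0.9), (50, 80.0, 0.5)]

print(round(ngi_weighted_availability(correct), 3))   # 0.786
print(round(ngi_weighted_availability(mistaken), 3))  # 0.58
```

In this made-up case a single tenfold error in one site's published value lowers the NGI figure from about 0.79 to 0.58, although no site's actual availability changed.<br />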
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=37272Resource Centres OLA and Resource infrastructure Provider OLA reports2012-06-01T14:15:03Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1174&version=1&filename=EGI-core_services_availabilities-per_NGI-May2012%20NGIs%20core%20services-v1.1.pdf 05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [GGUS]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [GGUS]/ <br />
[04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability followup procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The overall comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for explanation to be given <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or the explanation is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# once the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], has passed, the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were not met, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, a RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview before it, calculates the availability and reliability of a site as soon as the site shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified and therefore appears in the report, or is uncertified and has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information is available at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a consequence, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, because of a mistake), the overall NGI weighted availability will be affected.<br />
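
The effect described in the last item can be illustrated with a short sketch. It is not the ACE implementation: it assumes that the NGI figure is a weighted mean of the site availabilities, with each site weighted by its logical CPUs multiplied by its published HEPSPEC value. The names <code>Site</code>, <code>site_weight</code> and <code>weighted_availability</code> are hypothetical, and the figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    availability: float  # monthly availability as a fraction between 0 and 1
    logical_cpus: int    # number of logical CPUs published by the site
    hepspec: float       # published HEPSPEC value (assumed to be per logical CPU)


def site_weight(site: Site) -> float:
    # Assumption: the weight is logical CPUs multiplied by the HEPSPEC value.
    return site.logical_cpus * site.hepspec


def weighted_availability(sites: list[Site]) -> float:
    # Assumption: the NGI figure is the weight-normalised mean of site availabilities.
    total_weight = sum(site_weight(s) for s in sites)
    if total_weight == 0:
        raise ValueError("no site publishes a non-zero weight")
    return sum(site_weight(s) * s.availability for s in sites) / total_weight


# Two equally sized sites with correctly published values.
correct = [Site("A", 0.9, 100, 10.0), Site("B", 0.5, 100, 10.0)]
# The same sites, but site B publishes a HEPSPEC value ten times too high.
mistaken = [Site("A", 0.9, 100, 10.0), Site("B", 0.5, 100, 100.0)]

print(round(weighted_availability(correct), 3))
print(round(weighted_availability(mistaken), 3))
```

Under these assumptions, the over-stated HEPSPEC value of the poorly performing site pulls the NGI figure towards that site's availability, which is why the published numbers need to be checked.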
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=37271Resource Centres OLA and Resource infrastructure Provider OLA reports2012-06-01T14:13:59Z<p>Fergadis: /* Resource Centres */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[https://documents.egi.eu/document/1174 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [GGUS]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [GGUS]/ <br />
[04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are generated automatically in the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in PDF format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, the statistics are uploaded to the EGI document server. Links to the monthly statistics are provided on a regular basis on this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The overall comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the site receiving the ticket (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
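
The 10-working-day deadline in the steps above can be estimated with a small helper. This is a minimal sketch and not part of the COD tooling: the function name <code>add_working_days</code> is hypothetical, and it assumes that working days run from Monday to Friday and that public holidays are ignored.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    # Assumption: working days are Monday to Friday; public holidays are not considered.
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            remaining -= 1
    return current


# Ticket received on Friday 1 June 2012; the explanation is due 10 working days later.
deadline = add_working_days(date(2012, 6, 1), 10)
print(deadline.isoformat())
```

An NGI with its own holiday calendar would need to extend the weekday test accordingly.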
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were missed, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview before it, calculates the availability and reliability of a site as soon as the site shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified and therefore appears in the report, or is uncertified and has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information is available at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a consequence, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, because of a mistake), the overall NGI weighted availability will be affected.<br />
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=SLM_RP_OLA_Service_Levels&diff=36585SLM RP OLA Service Levels2012-05-07T11:33:19Z<p>Fergadis: </p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Service Level Management]]<br />
This page provides information on the current service levels provided by the Resource infrastructure Provider.<br />
<br />
As of January 2012, it is mandatory that top-BDII services operated by NGIs provide a minimum availability of 99% (see the [https://documents.egi.eu/document/463 RP Operational Level Agreement] and its [[Resource_infrastructure_Provider_OLA:_Release_Notes|Release Notes]] for details). Availability/Reliability NGI reports are distributed monthly.<br />
<br />
Note: Service Level Targets specified below will come into force as of January 2011.<br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
| '''minimum top-BDII Availability'''<br />
| 99%, profile [http://grid-monitoring.cern.ch/myegi/sam-pi/metrics_in_profiles?vo_name=ops&profile_name=ROC ROC]<br />
|-<br />
| '''minimum top-BDII Reliability'''<br />
| 99%, profile [http://grid-monitoring.cern.ch/myegi/sam-pi/metrics_in_profiles?vo_name=ops&profile_name=ROC ROC]<br />
|-<br />
|'''[[Grid_operations_oversight/ROD_performance_index|maximum ROD Performance Index]]'''<br />
| 10<br />
|-<br />
|'''Liability'''<br />
|Resource infrastructure Providers not providing the requested monthly performance for one month MUST provide a service improvement plan.<br />
|}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=36345Resource Centres OLA and Resource infrastructure Provider OLA reports2012-05-03T10:21:25Z<p>Fergadis: /* Resource Infrastructures */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12]<br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1091&version=2&filename=EGI-core_services_availabilities-per_NGI-Mar2012.pdf 03/12]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf 04/12]<br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [GGUS]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [GGUS]/ <br />
[04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are generated automatically in the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in PDF format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, the statistics are uploaded to the EGI document server. Links to the monthly statistics are provided on a regular basis on this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The overall comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the site receiving the ticket (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were missed, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview before it, calculates the availability and reliability of a site as soon as the site shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified and is included in the report, or is uncertified and is excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a result, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by its published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, by mistake), the overall NGI weighted availability will be affected.<br />
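
As a rough illustration of the weighting described in the last item above, the following Python sketch derives a per-site weight as logical CPUs times HEPSPEC and combines site availabilities into an NGI figure. The weighted-mean aggregation, the <code>Site</code> class and all names and numbers are assumptions made for illustration; they are not taken from ACE.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical record of what a site publishes (illustrative only)."""
    name: str
    logical_cpus: int
    hepspec: float       # published HEPSPEC value
    availability: float  # monthly availability as a fraction between 0 and 1


def site_weight(site: Site) -> float:
    # Weight as described in the text: logical CPUs multiplied by HEPSPEC.
    return site.logical_cpus * site.hepspec


def weighted_availability(sites: list[Site]) -> float:
    # Assumption: the NGI figure is the weight-averaged site availability.
    total_weight = sum(site_weight(s) for s in sites)
    if total_weight == 0:
        raise ValueError("no capacity published, weighted availability is undefined")
    return sum(site_weight(s) * s.availability for s in sites) / total_weight


# Two equally sized sites, then the same pair with one HEPSPEC value
# published ten times too high by mistake.
correct = [Site("SITE-A", 100, 10.0, 0.9), Site("SITE-B", 100, 10.0, 0.5)]
mistaken = [Site("SITE-A", 100, 10.0, 0.9), Site("SITE-B", 100, 100.0, 0.5)]
```

Under these assumptions, the mistaken HEPSPEC value pulls the NGI figure towards SITE-B's lower availability, which is why the published numbers need to be correct.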
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=36344Resource Centres OLA and Resource infrastructure Provider OLA reports2012-05-03T10:16:13Z<p>Fergadis: /* Resource Centres */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1091 03/12]<br />
|[https://documents.egi.eu/document/1117 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12] <br />
| [03/12] <br />
| [04/12] <br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [GGUS]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [GGUS]/ <br />
[04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability followup procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated in the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in PDF format and are placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The overall comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the site receiving the ticket (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or is found inadequate, the COD escalation procedure [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure] will be followed, with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
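
The 10-working-day deadline mentioned in the steps above can be estimated with a short Python sketch. It assumes that working days are Monday to Friday and it ignores public holidays; the function name is hypothetical and not part of any EGI tool.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date reached after the given number of working days.

    Assumption: working days are Monday to Friday; holidays are not considered.
    """
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


# Ticket received on Thursday 3 May 2012: explanation due 10 working days later.
deadline = add_working_days(date(2012, 5, 3), 10)
print(deadline)  # 2012-05-17
```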
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see the known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-working-day period, during which the normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
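
The parent/child ticket rule used in both procedures above can be sketched in Python as follows. The classes and field names are hypothetical and only model the closing rule; they do not reflect the GGUS data model.

```python
from dataclasses import dataclass, field


@dataclass
class ChildTicket:
    """Hypothetical per-site ticket assigned to an NGI."""
    site: str
    closed: bool = False


@dataclass
class ParentTicket:
    """Hypothetical monthly ticket that groups the per-site child tickets."""
    month: str
    children: list[ChildTicket] = field(default_factory=list)

    def can_close(self) -> bool:
        # The parent may be closed only once every child ticket is closed.
        # Note: with no children at all, this returns True.
        return all(child.closed for child in self.children)


parent = ParentTicket("2012-04", [ChildTicket("SITE-A"), ChildTicket("SITE-B")])
parent.children[0].closed = True   # explanation accepted for SITE-A
still_open = parent.can_close()    # SITE-B is still pending
parent.children[1].closed = True   # SITE-B suspended or suspension cancelled
now_closable = parent.can_close()
```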
<br />
* '''Wiki follow up page'''<br />
Sites that fail to provide an explanation for missing the OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, a RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview before it, calculates the availability and reliability of a site as soon as the site shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified and is included in the report, or is uncertified and is excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a result, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by its published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, by mistake), the overall NGI weighted availability will be affected.<br />
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36229EGI-InSPIRE:SA1.8-QR82012-05-02T12:39:15Z<p>Fergadis: /* 4. Plans for the next period */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
* Continue publishing of A/R league tables - OPS<br />
<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII)<br />
<br />
* The RC OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_Centre_OLA:_Release_Notes#Resource_Centre_OLA_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The RP OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_infrastructure_Provider_OLA:_Release_Notes#Release_notes_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* A draft version of [https://documents.egi.eu/document/1093 EGI.eu OLA] has been published.<br />
<br />
* Assessment of the EGI.eu central tools service level targets.<br>The availability metrics for individual tool instances were extracted from the [https://ops-monitor.cern.ch/nagios/ Nagios] and [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=1&materialId=slides&confId=719 presented] at the [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)].<br>The OMB agreed to the [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=0&materialId=slides&confId=719 proposed] service level targets.<br />
<br />
* A draft version of [https://documents.egi.eu/document/1057 MS418] "Operational Level Agreements (OLAs) within the EGI production infrastructure" has been published.<br />
<br />
* Clarified the procedure for EGI sites/regions in the new SAM [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy Availability Re-computation Policy].<br />
<br />
* Starting from the February 2012 A/R reports, a parent GGUS ticket is created per month in order to gather all the re-computation requests. <br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
* Operation of WMS, LB and Top-BDII for site certification<br />
<br />
* Operation of certification Catch-All portal<br />
<br />
=== dteam VO ===<br />
<br />
* Operation of VOMS/VORMS for dteam VO.<br />
<br />
* New dteam/EGI group has been created. The dteam [http://operations-portal.egi.eu/vo/downloadAUP/file/dteam-AcceptableUsePolicy-20110926-1316993681969.txt Acceptable Use Policy] had to change slightly to allow this.<br />
<br />
* New dteam/NGI_UA (Ukraine) has been created.<br />
<br />
* Added the new "UK e-Science CA 2B" certificate for all dteam VO members who had the old "UK e-Science CA" certificate [https://ggus.eu/ws/ticket_info.php?ticket=78909].<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
<br />
* The final version of EGI.eu OLA is expected in 2012Q3<br />
<br />
* The milestone MS418 "Operational Level Agreements (OLAs) within the EGI production infrastructure" is expected in 2012Q2<br />
<br />
* Continue handling the validation and distribution of the monthly A/R reports and the maintenance of the relevant wiki page</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36225EGI-InSPIRE:SA1.8-QR82012-05-02T12:26:36Z<p>Fergadis: /* 2. Main Achievements */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
* Continue publishing of A/R league tables - OPS<br />
<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII)<br />
<br />
* The RC OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_Centre_OLA:_Release_Notes#Resource_Centre_OLA_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The RP OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_infrastructure_Provider_OLA:_Release_Notes#Release_notes_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* A draft version of [https://documents.egi.eu/document/1093 EGI.eu OLA] has been published.<br />
<br />
* Assessment of the EGI.eu central tools service level targets.<br>The availability metrics for individual tool instances were extracted from the [https://ops-monitor.cern.ch/nagios/ Nagios] and [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=1&materialId=slides&confId=719 presented] at the [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)].<br>The OMB agreed to the [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=0&materialId=slides&confId=719 proposed] service level targets.<br />
<br />
* A draft version of [https://documents.egi.eu/document/1057 MS418] "Operational Level Agreements (OLAs) within the EGI production infrastructure" has been published.<br />
<br />
* Clarified the procedure for EGI sites/regions in the new SAM [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy Availability Re-computation Policy].<br />
<br />
* Starting from the February 2012 A/R reports, a parent GGUS ticket is created per month in order to gather all the re-computation requests. <br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
* Operation of WMS, LB and Top-BDII for site certification<br />
<br />
* Operation of certification Catch-All portal<br />
<br />
=== dteam VO ===<br />
<br />
* Operation of VOMS/VORMS for dteam VO.<br />
<br />
* New dteam/EGI group has been created. The dteam [http://operations-portal.egi.eu/vo/downloadAUP/file/dteam-AcceptableUsePolicy-20110926-1316993681969.txt Acceptable Use Policy] had to change slightly to allow this.<br />
<br />
* New dteam/NGI_UA (Ukraine) has been created.<br />
<br />
* Added the new "UK e-Science CA 2B" certificate for all dteam VO members who had the old "UK e-Science CA" certificate [https://ggus.eu/ws/ticket_info.php?ticket=78909].<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
* ...<br />
<br />
* Continue handling the validation and distribution of the monthly A/R reports and the maintenance of the relevant wiki page</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36193EGI-InSPIRE:SA1.8-QR82012-05-02T09:37:53Z<p>Fergadis: /* 2. Main Achievements */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
* Continue publishing of A/R league tables - OPS<br />
<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII)<br />
<br />
* The RC OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_Centre_OLA:_Release_Notes#Resource_Centre_OLA_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The RP OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_infrastructure_Provider_OLA:_Release_Notes#Release_notes_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* A draft version of [https://documents.egi.eu/document/1093 EGI.eu OLA] has been published.<br />
<br />
* Assessment of the EGI.eu central tools service level targets.<br>The availability metrics for individual tool instances were extracted from the [https://ops-monitor.cern.ch/nagios/ Nagios] and [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=1&materialId=slides&confId=719 presented] at the [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)].<br>The OMB agreed to the [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=0&materialId=slides&confId=719 proposed] service level targets.<br />
<br />
* A draft version of [https://documents.egi.eu/document/1057 MS418] "Operational Level Agreements (OLAs) within the EGI production infrastructure" has been published.<br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
* Operation of WMS, LB and Top-BDII for site certification<br />
<br />
* Operation of certification Catch-All portal<br />
<br />
=== dteam VO ===<br />
<br />
* Operation of VOMS/VORMS for dteam VO.<br />
<br />
* New dteam/EGI group has been created. The dteam [http://operations-portal.egi.eu/vo/downloadAUP/file/dteam-AcceptableUsePolicy-20110926-1316993681969.txt Acceptable Use Policy] had to change slightly to allow this.<br />
<br />
* New dteam/NGI_UA (Ukraine) has been created.<br />
<br />
* Added the new "UK e-Science CA 2B" certificate for all dteam VO members who had the old "UK e-Science CA" certificate [https://ggus.eu/ws/ticket_info.php?ticket=78909].<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
* ...<br />
<br />
* Continue handling the validation and distribution of the monthly A/R reports and the maintenance of the relevant wiki page</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36079EGI-InSPIRE:SA1.8-QR82012-04-30T13:51:10Z<p>Fergadis: /* 2. Main Achievements */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
* Continue publishing of A/R league tables - OPS<br />
<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII)<br />
<br />
* The RC OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_Centre_OLA:_Release_Notes#Resource_Centre_OLA_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The RP OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_infrastructure_Provider_OLA:_Release_Notes#Release_notes_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The 1st draft of [https://documents.egi.eu/document/1093 EGI.eu OLA] has been written.<br />
<br />
* Assessment of the EGI.eu central tools service level targets.<br>The availability metrics for individual tool instances were extracted from the [https://ops-monitor.cern.ch/nagios/ Nagios] and [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=1&materialId=slides&confId=719 presented] at the [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)].<br>The OMB agreed to the [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=0&materialId=slides&confId=719 proposed] service level targets.<br />
<br />
* The ROC_CRITICAL profile replaced the previously used WLCG_CREAM_LCGCE_CRITICAL, starting with the A/R reports of January 2012.<br />
<br />
* As of January 2012, it is mandatory that top-BDII services operated by NGIs provide a minimum availability of 99%.<br />
<br />
* ...<br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
* ...<br />
<br />
* Continue handling the validation and distribution of the monthly A/R reports and maintaining the relevant wiki page</div>Fergadishttps://wiki.egi.eu/w/index.php?title=SLM_RP_OLA_Service_Levels&diff=36078SLM RP OLA Service Levels2012-04-30T13:49:09Z<p>Fergadis: </p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Service Level Management]]<br />
This page provides information on the current service levels provided by the Resource infrastructure Provider.<br />
<br />
As of January 2012, it is mandatory that top-BDII services operated by NGIs provide a minimum availability of 99% (see the [https://documents.egi.eu/document/463 RP Operational Level Agreement] and its [[Resource_infrastructure_Provider_OLA:_Release_Notes|Release Notes]] for details). Availability/Reliability NGI reports are distributed monthly.<br />
<br />
Note: Service Level Targets specified below will come into force as of January 2011.<br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
| '''minimum top-BDII Availability'''<br />
| 99%, profile [http://grid-monitoring.cern.ch/myegi/sam-pi/metrics_in_profiles?vo_name=ops&profile_name=ROC ROC]<br />
|-<br />
| '''minimum top-BDII Reliability'''<br />
| 99%, profile [http://grid-monitoring.cern.ch/myegi/sam-pi/metrics_in_profiles?vo_name=ops&profile_name=ROC ROC]<br />
|-<br />
|'''[[Grid_operations_oversight/ROD_performance_index|maximum ROD Performance Index]]'''<br />
| 10<br />
|-<br />
|'''Liability'''<br />
|Resource infrastructure Providers not providing the requested monthly performance for one month MUST provide a service improvement plan.<br />
|}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36077EGI-InSPIRE:SA1.8-QR82012-04-30T13:44:13Z<p>Fergadis: </p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
* Continue publishing of A/R league tables - OPS<br />
<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII)<br />
<br />
* The RC OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_Centre_OLA:_Release_Notes#Resource_Centre_OLA_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The RP OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_infrastructure_Provider_OLA:_Release_Notes#Release_notes_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The 1st draft of [https://documents.egi.eu/document/1093 EGI.eu OLA] has been written.<br />
<br />
* Assessment of the EGI.eu central tools service level targets.<br>The availability metrics for individual tool instances were extracted from [https://ops-monitor.cern.ch/nagios/ Nagios] and [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=1&materialId=slides&confId=719 presented] at the [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)].<br>The OMB agreed to the [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=0&materialId=slides&confId=719 proposed] service level targets.<br />
<br />
* The new ROC_CRITICAL profile replaced the previously used WLCG_CREAM_LCGCE_CRITICAL, starting with the A/R reports of January 2012.<br />
<br />
* ...<br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
* ...<br />
<br />
* Continue handling the validation and distribution of the monthly A/R reports and maintaining the relevant wiki page</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36068EGI-InSPIRE:SA1.8-QR82012-04-30T12:53:03Z<p>Fergadis: /* 2. Main Achievements */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
* Continue publishing of A/R league tables - OPS<br />
<br />
* Continue publishing of A/R reports for NGI Core services (top-BDII)<br />
<br />
* The RC OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_Centre_OLA:_Release_Notes#Resource_Centre_OLA_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The RP OLA was updated to v1.1. [https://wiki.egi.eu/wiki/Resource_infrastructure_Provider_OLA:_Release_Notes#Release_notes_v._1.1 Changes] were approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)] <br />
<br />
* The 1st draft of [https://documents.egi.eu/document/1093 EGI.eu OLA] has been written.<br />
<br />
* Assessment of the EGI.eu central tools service level targets.<br>The availability metrics for individual tool instances were extracted from [https://ops-monitor.cern.ch/nagios/ Nagios] and [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=1&materialId=slides&confId=719 presented] at the [https://www.egi.eu/indico/conferenceDisplay.py?confId=719 OMB (2012-02-28)].<br>The OMB agreed to the [https://www.egi.eu/indico/getFile.py/access?contribId=7&resId=0&materialId=slides&confId=719 proposed] service level targets.<br />
<br />
* ...<br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --></div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36060EGI-InSPIRE:SA1.8-QR82012-04-30T09:32:20Z<p>Fergadis: </p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --></div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36058EGI-InSPIRE:SA1.8-QR82012-04-30T09:23:44Z<p>Fergadis: /* 6. NGI Monthly Availability and Reliability Results */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
<br />
= 5. Number of sites suspended =<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" align="center" style="text-align: center;"<br />
!align="left" | Month !! Suspended sites<br />
|-<br />
|align="left" | MONTH YEAR || 0<br />
|- <br />
|align="left" | MONTH YEAR || 0<br />
|-<br />
|align="left" | MONTH YEAR || 0<br />
|-<br />
|}<br />
<br />
= 6. NGI Monthly Availability and Reliability Results =<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" align="center" style="text-align: center;"<br />
!rowspan="2" align="left" | Month !!colspan="2" | Site Middleware Services !!colspan="2" | Core Middleware Services<br />
|-<br />
! Availability !! Reliability !! Availability !! Reliability<br />
|-<br />
|align="left" | February 2012 || 93.36 || 94.32 || - || -<br />
|-<br />
|align="left" | March 2012 || 94.68 || 95.75 || - || -<br />
|-<br />
|align="left" | April 2012 || - || - || - || -<br />
|-<br />
!align="left" | Average || - || - || - || -<br />
|-<br />
|}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR8&diff=36049EGI-InSPIRE:SA1.8-QR82012-04-30T07:56:16Z<p>Fergadis: Created page with "__NOTOC__ = 1. Task Meetings = <!-- Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all tas..."</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
<br />
== EGI Catch-All Core Service ==<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
<br />
= 5. Number of sites suspended =<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" align="center" style="text-align: center;"<br />
!align="left" | Month !! Suspended sites<br />
|-<br />
|align="left" | MONTH YEAR || 0<br />
|- <br />
|align="left" | MONTH YEAR || 0<br />
|-<br />
|align="left" | MONTH YEAR || 0<br />
|-<br />
|}<br />
<br />
= 6. NGI Monthly Availability and Reliability Results =<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" align="center" style="text-align: center;"<br />
!rowspan="2" align="left" | Month !!colspan="2" | Site Middleware Services !!colspan="2" | Core Middleware Services<br />
|-<br />
! Availability !! Reliability !! Availability !! Reliability<br />
|-<br />
|align="left" | MONTH YEAR || 99.99 || 99.99 || 99.99 || 99.99<br />
|-<br />
|align="left" | MONTH YEAR || 99.99 || 99.99 || 99.99 || 99.99<br />
|-<br />
|align="left" | MONTH YEAR || 99.99 || 99.99 || 99.99 || 99.99<br />
|-<br />
!align="left" | Average || 99.99 || 99.99 || 99.99 || 99.99<br />
|-<br />
|}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=PROC10_Recomputation_of_SAM_results_or_availability_reliability_statistics&diff=35315PROC10 Recomputation of SAM results or availability reliability statistics2012-04-09T14:43:37Z<p>Fergadis: /* Tips */</p>
<hr />
<div>{{Template:Op menubar}}<br />
{{Template:Doc_menubar}}<br />
[[Category:Procedures]]<br />
__TOC__<br />
<br />
= Procedure for the recomputation of SAM results and/or availability/reliability statistics=<br />
<br />
*'''Title''': Recomputation of SAM results and/or availability/reliability statistics<br />
*'''Document link''': https://wiki.egi.eu/wiki/PROC10<br />
*'''Last modified''': 16 Jan 2012 <br />
*'''Version''': 1.1 <br />
*'''Policy Group Acronym''': OMB<br />
*'''Policy Group Name''': Operations Management Board<br />
*'''Contact Person''': George Fergadis/AUTH<br />
*'''Document Status''': APPROVED<br />
*'''Approved Date''': 17 October 2011<br />
*'''Procedure Statement''': This procedure documents the steps for requesting a correction in the SAM test results and, when applicable, in the related availability/reliability statistics.<br />
<br />
= Overview =<br />
This procedure documents the steps for requesting a correction in the <br />
[[SAM_Instances|SAM test results]] and, if applicable, in the related [[Availability_and_reliability_monthly_statistics|availability/reliability statistics]]. The statistics for the affected month do not need to be recomputed if the faulty test results are reported and corrected before that month's statistics are computed and distributed. Problems with SAM results should therefore be reported as soon as they are detected, to allow sufficient time for fixing them and to avoid a recomputation of the monthly availability/reliability statistics.<br />
<br />
DISCLAIMER: This procedure is only applicable to EGI OPS test results. Procedures for the computation of VO-specific availability report are VO-specific and are out of scope.<br />
<br />
= Who can submit a request? =<br />
Re-computations can be requested by site administrators and by regional operations staff.<br />
<br />
= Prerequisites =<br />
Corrections to test results are accepted only when the test failures were due to problems <br />
caused by the monitoring infrastructure itself. Some examples:<br />
* invalid proxy certificate used for submitting the monitoring probes in a Nagios instance;<br />
* problems with the Storage Element used for replica management tests resulting in errors on CE's metrics.<br />
<br />
= Steps =<br />
<br />
# '''STEP 1''': as soon as the problem is detected, report it by opening a [http://helpdesk.egi.eu/ GGUS ticket]. '''If the submitter is a Resource Centre administrator''': please address the ticket to your Operations Centre support unit. '''If the submitter is a member of the regional operations staff''': please address the ticket to the Service Level Management support unit. The GGUS ticket must mention:<br />
## the starting and ending time of the problem (including day and hour in UTC)<br />
## the Site, NGI/federation of NGIs affected by the problem<br />
## the VO affected by the problem (must be the OPS VO)<br />
## a description of the problem<br />
# '''STEP 2''': (only applicable if the submitter of the request is a Resource Centre administrator) the Operations Centre analyzes the request. If the request is validated, the ticket is re-assigned to the [[GGUS:SLM-FAQ|Service Level Management]] (SLM) Support Unit, which is responsible for (1) collecting all reported problems and (2) discussing them with the SAM Support Unit by re-assigning the ticket to the [[GGUS:SAM/Nagios_FAQ|SAM/Nagios SU]].<br />
# '''STEP 3''': if the request for recomputation of the test results is accepted, the SAM Support Unit is responsible for fixing the results and for triggering a recomputation of the monthly availability statistics if necessary. The following steps are followed:<br />
## All Nagios metric results for any site and service are set to ''unknown'' status from the beginning of the hour reported in the starting time to one hour after the ending time. This covers results that may have arrived late.<br />
## Availability/reliability are then recomputed for that particular period, Site, NGI/federation of NGIs if necessary. As a consequence, the availability and reliability of other sites won't be affected, as unknown periods are not considered in the computation.<br />
# '''STEP 4''': if new availability/reliability statistics are computed, the SAM/Nagios SU reassigns the ticket to the SLM Support Unit once they are ready for distribution, to notify that a new set of reports can be redistributed to EGI.<br />
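
The masking rule of STEP 3 can be illustrated with a short Python sketch. It is not the SAM implementation: the function names are hypothetical, and the availability formula (uptime divided by the known time) is an assumption that simply mirrors the statement that unknown periods are not considered in the computation. The authoritative definition is the one linked from the Availability/Reliability pages.

```python
from datetime import datetime, timedelta, timezone


def masking_window(start_utc, end_utc):
    """Return the (start, end) period whose results are set to 'unknown'.

    Hypothetical helper: the window runs from the beginning of the hour of
    the reported start time to one hour after the reported end time.
    """
    if end_utc < start_utc:
        raise ValueError("end time precedes start time")
    window_start = start_utc.replace(minute=0, second=0, microsecond=0)
    window_end = end_utc + timedelta(hours=1)
    return window_start, window_end


def availability(up_hours, total_hours, unknown_hours):
    """Assumed formula: unknown time is removed from the denominator."""
    known_hours = total_hours - unknown_hours
    if known_hours <= 0:
        return None  # nothing known, so no figure can be computed
    return up_hours / known_hours


reported_start = datetime(2012, 2, 12, 16, 25, tzinfo=timezone.utc)
reported_end = datetime(2012, 2, 12, 18, 10, tzinfo=timezone.utc)
print(masking_window(reported_start, reported_end))
print(availability(up_hours=600, total_hours=720, unknown_hours=24))
```

Under this assumption, turning a failed period into ''unknown'' shrinks the denominator instead of counting as downtime, which is why the figures of the affected site improve while other sites are untouched.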
<br />
= External links =<br />
* [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy WLCG Availability re-computation policy]<br />
<br />
= Tips =<br />
* Date formats<br />
You can use the Unix <tt>date</tt> command to convert the start and end time from your time zone to <tt>UTC</tt> using the [http://en.wikipedia.org/wiki/ISO_8601 ISO 8601] format.<br />
<br />
''the start time must be rounded to the lower hour and the end time rounded to the higher hour''<br />
<br />
Example:<br />
# date --date="12 Feb 2012 17:00 CET" --utc --iso-8601=minutes<br />
will give:<br />
2012-02-12T16:00+0000<br />
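
The same conversion, including the rounding rule, can be sketched in Python. This is only an illustration: the helper names are hypothetical, and CET is modelled as a fixed UTC+1 offset, so daylight saving time is ignored.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CET is a fixed UTC+1 offset (no daylight saving time).
CET = timezone(timedelta(hours=1))


def floor_to_hour(dt):
    """Round a datetime down to the start of its hour."""
    return dt.replace(minute=0, second=0, microsecond=0)


def ceil_to_hour(dt):
    """Round a datetime up to the next full hour (unchanged if already on the hour)."""
    floored = floor_to_hour(dt)
    return floored if floored == dt else floored + timedelta(hours=1)


def ticket_interval_utc(start_local, end_local):
    """Convert timezone-aware local times to UTC and apply the rounding rule."""
    start_utc = floor_to_hour(start_local.astimezone(timezone.utc))
    end_utc = ceil_to_hour(end_local.astimezone(timezone.utc))
    return (start_utc.isoformat(timespec="minutes"),
            end_utc.isoformat(timespec="minutes"))


start, end = ticket_interval_utc(
    datetime(2012, 2, 12, 17, 25, tzinfo=CET),
    datetime(2012, 2, 12, 19, 10, tzinfo=CET),
)
print(start)  # 2012-02-12T16:00+00:00
print(end)    # 2012-02-12T19:00+00:00
```

Both arguments must carry a timezone; a naive datetime would be interpreted in the local time zone of the machine running the script.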
<br />
= Revision history =<br />
17/01/2012: the text of the procedure is fixed to clarify that both RC administrators and regional operations staff can request a re-computation.<br />
<br />
16/01/2012: the text of the procedure is fixed to clarify that the recomputation of test results can be requested before the end of the affected month, in which case if sufficient time is allowed for fixing of the test results, no re-computation of availability/reliability statistics will be needed.<br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=35274Resource Centres OLA and Resource infrastructure Provider OLA reports2012-04-04T10:27:50Z<p>Fergadis: /* Resource Centres */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Performance reports=<br />
<br />
== Resource Centres ==<br />
[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE)<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
! Availability/Reliability<br />
! Jan<br />
! Feb<br />
! Mar<br />
! Apr<br />
! May<br />
! Jun<br />
! Jul<br />
! Aug<br />
! Sep<br />
! Oct<br />
! Nov<br />
! Dec<br />
|-<br />
! 2010<br />
| -<br />
| -<br />
| -<br />
| -<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|-<br />
! 2011<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|-<br />
! 2012<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[https://documents.egi.eu/document/1090 03/12]<br />
|[ 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
== Resource Infrastructures ==<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: '''<br />
'''top-BDII Availability/Reliability''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf 09/11]<br />
| [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf 10/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf 11/11] <br />
| [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf 12/11]<br />
|-<br />
| '''2012''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12] <br />
| [03/12] <br />
| [04/12] <br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|}<br />
<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level: ''' <br />
'''ROD Performance Index''' <br />
ticket/[https://documents.egi.eu/document/1089 Report]<br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''2011'''<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| -<br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-10.pdf 10/11] <br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-11.pdf 11/11] <br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2011-12.pdf 12/11] <br />
|-<br />
| '''2012''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=78078 78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-01.pdf 01/12] <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-02.pdf 02/12] <br />
| [GGUS]/<br />
[https://documents.egi.eu/secure/RetrieveFile?docid=1089&version=1&filename=OlaMetrics_2012-03.pdf 03/12] <br />
| [GGUS]/ <br />
[04/12] <br />
| [GGUS]/<br />
[05/12] <br />
| [GGUS]/ <br />
[06/12] <br />
| [GGUS]/<br />
[07/12] <br />
| [GGUS]/ <br />
[08/12] <br />
| [GGUS]/ <br />
[09/12] <br />
| [GGUS]/ <br />
[10/12] <br />
| [GGUS]/ <br />
[11/12] <br />
|[GGUS]/ <br />
[12/12] <br />
|}<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability followup procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be provided within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
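<br />
Both lists above rely on a window of 10 working days. As a rough illustration only, the due date of such a ticket could be estimated as sketched below. The function name <tt>add_working_days</tt> is hypothetical, and the sketch assumes that working days are Monday to Friday and ignores public holidays, which the procedure does not specify.<br />

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Return the date that lies `days` working days after `start`.

    Assumption: working days are Monday-Friday; public holidays are ignored.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() returns 0-4 for Monday-Friday, 5-6 for the weekend
        if current.weekday() < 5:
            remaining -= 1
    return current


# A ticket received on Monday 5 March 2012 would be due on Monday 19 March 2012.
print(add_working_days(date(2012, 3, 5), 10))
```

An NGI with local holidays would need to extend the weekday check accordingly.<br />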
<br />
* '''Wiki follow up page'''<br />
Sites that fail to provide an explanation for missing OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, a RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore appears in the report, or uncertified, and therefore has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a result, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# The weight used for weighted availability is calculated by multiplying the number of logical CPUs a site publishes by its published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, because of a mistake), the overall NGI weighted availability will be affected.<br />
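<br />
To illustrate the last point, the sketch below shows how a wrong HEPSPEC value can shift an NGI-level figure. Only the weight (logical CPUs multiplied by HEPSPEC) comes from the text above. Combining the sites as a weighted mean, the function names and the sample numbers are assumptions made for illustration and are not taken from the ACE implementation.<br />

```python
def site_weight(logical_cpus: int, hepspec: float) -> float:
    """Weight of a site: published logical CPUs times published HEPSPEC."""
    return logical_cpus * hepspec


def weighted_availability(sites):
    """Weighted mean of site availabilities (assumed aggregation).

    `sites` is a list of (availability, logical_cpus, hepspec) tuples,
    with availability expressed as a fraction between 0 and 1.
    """
    total_weight = sum(site_weight(cpus, hs) for _, cpus, hs in sites)
    if total_weight == 0:
        raise ValueError("total weight is zero")
    weighted_sum = sum(a * site_weight(cpus, hs) for a, cpus, hs in sites)
    return weighted_sum / total_weight


# Two hypothetical sites of equal size and equal HEPSPEC.
correct = [(0.90, 100, 10.0), (0.50, 100, 10.0)]
# Same sites, but the second one publishes a HEPSPEC ten times too high.
mistaken = [(0.90, 100, 10.0), (0.50, 100, 100.0)]

print(round(weighted_availability(correct), 3))
print(round(weighted_availability(mistaken), 3))
```

In this made-up case the inflated HEPSPEC value gives the poorly performing site most of the weight and pulls the combined figure down.<br />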
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=PROC10_Recomputation_of_SAM_results_or_availability_reliability_statistics&diff=33981PROC10 Recomputation of SAM results or availability reliability statistics2012-03-09T14:06:30Z<p>Fergadis: </p>
<hr />
<div>{{Template:Op menubar}}<br />
{{Template:Doc_menubar}}<br />
[[Category:Procedures]]<br />
__TOC__<br />
<br />
= Procedure for the recomputation of SAM results and/or availability/reliability statistics=<br />
<br />
*'''Title''': Recomputation of SAM results and/or availability/reliability statistics<br />
*'''Document link''': https://wiki.egi.eu/wiki/PROC10<br />
*'''Last modified''': 16 Jan 2012 <br />
*'''Version''': 1.1 <br />
*'''Policy Group Acronym''': OMB<br />
*'''Policy Group Name''': Operations Management Board<br />
*'''Contact Person''': George Fergadis/AUTH<br />
*'''Document Status''': APPROVED<br />
*'''Approved Date''': 17 October 2011<br />
*'''Procedure Statement''': This procedure documents the steps for requesting a correction in the SAM test results and, when applicable, in the related availability/reliability statistics.<br />
<br />
= Overview =<br />
This procedure documents the steps for requesting a correction in the <br />
[[SAM_Instances|SAM test results]] and in the related [[Availability_and_reliability_monthly_statistics|availability/reliability statistics]] if applicable. The statistics for the affected month do not need to be recomputed if problems with test results are reported and corrected before the statistics of that month are computed and distributed. Problems with the SAM results should be reported as soon as they are detected, to allow sufficient time to fix them and thus avoid having to recompute the monthly availability/reliability statistics for the affected month.<br />
<br />
DISCLAIMER: This procedure is only applicable to EGI OPS test results. Procedures for the computation of VO-specific availability reports are VO-specific and are out of scope.<br />
<br />
= Who can submit a request? =<br />
Re-computations can be requested by site administrators and by regional operations staff.<br />
<br />
= Prerequisites =<br />
Fixes in test results are accepted only when failures in test results were due to problems <br />
caused by the monitoring infrastructure itself. Some examples:<br />
* invalid proxy certificate used for submitting the monitoring probes in a Nagios instance;<br />
* problems with the Storage Element used for replica management tests resulting in errors on CE's metrics.<br />
<br />
= Steps =<br />
<br />
# '''STEP 1''': as soon as the problem is detected, notify by opening a [http://helpdesk.egi.eu/ GGUS ticket]. '''If the submitter is a Resource Centre administrator''': please address the ticket to your Operations Centre support unit. '''If the submitter is a member of a regional operations staff''': please address the ticket to the Service Level Management support unit. In the GGUS ticket you must mention:<br />
## the starting and ending time of the problem (including day and hour in UTC)<br />
## the Site, NGI/federation of NGIs affected by the problem<br />
## the VO affected by the problem (must be the OPS VO)<br />
## a description of the problem<br />
# '''STEP 2''': (only applicable if the submitter of the request is a Resource Centre administrator) the Operations Centre analyzes the request. If the request is validated, the ticket is re-assigned to the [[GGUS:SLM-FAQ|Service Level Management]] (SLM) Support Unit, which is responsible for (1) collecting all reported problems and (2) discussing the reported problems with the SAM Support Unit by re-assigning the ticket to the [[GGUS:SAM/Nagios_FAQ|SAM/Nagios SU]].<br />
# '''STEP 3''': if the request for recomputation of the test results is accepted, the SAM Support Unit will be responsible for fixing the results and for triggering a recomputation of the monthly availability statistics if necessary. The following steps are followed:<br />
## All Nagios metric results for any site and service are set to ''unknown'' status from the beginning of the hour reported in the starting time to one hour after the ending time. This is to cover results that could have arrived late.<br />
## Availability/reliability are then recomputed for that particular period, Site, NGI/federation of NGIs if necessary. As a consequence, the availability and reliability of other sites won't be affected, as unknown periods are not considered in the computation.<br />
# '''STEP 4''': if new availability/reliability statistics are computed, once they are ready for distribution the SAM/Nagios SU reassigns the ticket to the SLM Support Unit to notify it that a new set of reports can be re-distributed to EGI.<br />
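<br />
The time window described in STEP 3 can be sketched as follows. This is an illustration of the rule as worded above, not the code used by the SAM team. The function name <tt>unknown_window</tt> is hypothetical, and the sketch assumes that both timestamps carry a time zone.<br />

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def unknown_window(start: datetime, end: datetime) -> Tuple[datetime, datetime]:
    """Return the UTC period in which metric results are set to 'unknown'.

    The window runs from the beginning of the hour of the reported start
    time to one hour after the reported end time.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if end < start:
        raise ValueError("end must not be earlier than start")
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)
    # Round the start down to the beginning of its hour.
    window_start = start_utc.replace(minute=0, second=0, microsecond=0)
    # Extend the end by one hour to cover late results.
    window_end = end_utc + timedelta(hours=1)
    return window_start, window_end


reported_start = datetime(2012, 2, 12, 16, 35, tzinfo=timezone.utc)
reported_end = datetime(2012, 2, 12, 18, 10, tzinfo=timezone.utc)
print(unknown_window(reported_start, reported_end))
```

For a problem reported from 16:35 to 18:10 UTC, the window in this sketch runs from 16:00 to 19:10 UTC.<br />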
<br />
= External links =<br />
* [https://tomtools.cern.ch/confluence/display/SAMDOC/Availability+Re-computation+Policy WLCG Availability re-computation policy]<br />
<br />
= Tips =<br />
* Date formats<br />
You can use the Unix <tt>date</tt> command to convert the start and end time from your time zone to UTC using the [http://en.wikipedia.org/wiki/ISO_8601 ISO 8601] format.<br />
<br />
Example:<br />
# date --date="12 Feb 2012 17:35 CET" --utc --iso-8601=minutes<br />
will give:<br />
2012-02-12T16:35+0000<br />
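<br />
Where the GNU <tt>date</tt> options used above are not available, the same conversion can be done with the Python standard library. This is a minimal sketch: the helper name <tt>to_utc_iso</tt> is hypothetical, and CET is modelled as a fixed UTC+1 offset, so daylight saving time is not handled.<br />

```python
from datetime import datetime, timedelta, timezone

# CET modelled as a fixed UTC+1 offset (assumption: no daylight saving).
CET = timezone(timedelta(hours=1), "CET")


def to_utc_iso(local: datetime) -> str:
    """Convert a timezone-aware datetime to UTC in ISO 8601, minute precision."""
    if local.tzinfo is None:
        raise ValueError("datetime must carry a time zone")
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M%z")


# Same input as the date example above: 12 Feb 2012 17:35 CET
print(to_utc_iso(datetime(2012, 2, 12, 17, 35, tzinfo=CET)))
```

For the input shown, this prints the same value as the <tt>date</tt> example, 2012-02-12T16:35+0000.<br />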
<br />
= Revision history =<br />
17/01/2012: the text of the procedure is fixed to clarify that both RC administrators and regional operations staff can request a re-computation.<br />
<br />
16/01/2012: the text of the procedure is fixed to clarify that the recomputation of test results can be requested before the end of the affected month, in which case if sufficient time is allowed for fixing of the test results, no re-computation of availability/reliability statistics will be needed.<br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=33724Resource Centres OLA and Resource infrastructure Provider OLA reports2012-03-02T15:04:50Z<p>Fergadis: /* Resource infrastructure Provider Performance */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Introduction =<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
= Performance reports=<br />
<br />
== 2012 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[ 03/12]<br />
|[ 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1033&version=1&filename=EGI-core_services_availabilities-per_NGI-Feb2012%20Top-BDIIs.pdf 02/12] <br />
| [03/12] <br />
| [04/12] <br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|-<br />
| '''ROD Performance Index''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 GGUS-79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2001-2012.pdf Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
|}<br />
<br />
== 2011 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| <!--Sep--> [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf Sep] <br />
| <!--Oct--> [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf Oct] <br />
| <!--Nov--> [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf Nov] <br />
| <!--Dec--> [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf Dec]<br />
|-<br />
| '''ROD Performance Index''' <br />
| <!--Sep-->NA <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 GGUS-76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2011-2011.pdf Newsletter] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 GGUS-77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 GGUS-78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
|}<br />
<br />
== 2010 ==<br />
<!--*[https://documents.egi.eu/document/42 May]|[https://documents.egi.eu/document/96 Jun]|[https://documents.egi.eu/document/130 Jul]|[https://documents.egi.eu/document/157 Aug]|[https://documents.egi.eu/document/219 Sep]|[https://documents.egi.eu/document/238 Oct]|[https://documents.egi.eu/document/266 Nov]|[https://documents.egi.eu/document/299 Dec]<br />
*[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE league tables)--><br />
<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''EGEE Statistics'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://edms.cern.ch/document/963325 January 2008 - April 2010]<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|}<br />
<br />
<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability followup procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated the first week of the month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile in pdf format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for overseeing the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying it that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to provide an explanation for missing OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, a RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore appears in the report, or uncertified, and therefore has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a result, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# The weight used for weighted availability is calculated by multiplying the number of logical CPUs a site publishes by its published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, because of a mistake), the overall NGI weighted availability will be affected.<br />
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=33714Resource Centres OLA and Resource infrastructure Provider OLA reports2012-03-02T12:47:53Z<p>Fergadis: /* Resource Centre Performance */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Introduction =<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
= Performance reports=<br />
<br />
== 2012 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[https://documents.egi.eu/document/1033 02/12]<br />
|[ 03/12]<br />
|[ 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [02/12] <br />
| [03/12] <br />
| [04/12] <br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|-<br />
| '''ROD Performance Index''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 GGUS-79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2001-2012.pdf Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
|}<br />
<br />
== 2011 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| <!--Sep--> [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf Sep] <br />
| <!--Oct--> [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf Oct] <br />
| <!--Nov--> [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf Nov] <br />
| <!--Dec--> [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf Dec]<br />
|-<br />
| '''ROD Performance Index''' <br />
| <!--Sep-->NA <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 GGUS-76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2011-2011.pdf Newsletter] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 GGUS-77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 GGUS-78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
|}<br />
<br />
== 2010 ==<br />
<!--*[https://documents.egi.eu/document/42 May]|[https://documents.egi.eu/document/96 Jun]|[https://documents.egi.eu/document/130 Jul]|[https://documents.egi.eu/document/157 Aug]|[https://documents.egi.eu/document/219 Sep]|[https://documents.egi.eu/document/238 Oct]|[https://documents.egi.eu/document/266 Nov]|[https://documents.egi.eu/document/299 Dec]<br />
*[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE league tables)--><br />
<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''EGEE Statistics'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://edms.cern.ch/document/963325 January 2008 - April 2010]<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|}<br />
<br />
<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are generated automatically during the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile. The reports are produced in PDF format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts with a ticket filed to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the site receiving the ticket (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedure [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
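<br />
The 10-working-day limit used in the steps above can be worked out with a short script. This is only a minimal sketch: the function name <code>add_working_days</code> is hypothetical, working days are assumed to be Monday to Friday, and public holidays are ignored, which the procedure itself does not specify.<br />

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Return the date that lies `days` working days after `start`.

    Assumption: working days are Monday to Friday; public holidays
    are not taken into account.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


# Example: a ticket received on Wednesday 29 February 2012
deadline = add_working_days(date(2012, 2, 29), 10)
print(deadline.isoformat())
```

An NGI with its own holiday calendar would need to extend the weekday check accordingly.<br />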
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# once the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why they missed the OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore appears in the report, or is uncertified and has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a consequence, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site published with the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
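<br />
The effect described in the last item can be illustrated with a small calculation. This is a minimal sketch, not the ACE implementation: it assumes that the per-site weight is the number of logical CPUs multiplied by the published HEPSPEC value, and that the NGI figure is the weight-averaged site availability. The <code>Site</code> class, the function names and the sample numbers are all hypothetical.<br />

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str
    logical_cpus: int    # number of logical CPUs the site publishes
    hepspec: float       # published HEPSPEC value
    availability: float  # monthly availability as a fraction, 0.0 to 1.0


def site_weight(site: Site) -> float:
    # Weight as described in the text: logical CPUs times HEPSPEC.
    return site.logical_cpus * site.hepspec


def ngi_weighted_availability(sites: List[Site]) -> float:
    # Assumption: the NGI value is the weighted mean of site availabilities.
    total_weight = sum(site_weight(s) for s in sites)
    if total_weight == 0:
        raise ValueError("no site publishes a non-zero weight")
    return sum(site_weight(s) * s.availability for s in sites) / total_weight


correct = [Site("SITE-A", 100, 10.0, 0.9), Site("SITE-B", 100, 10.0, 0.5)]
# Same sites, but SITE-B publishes a HEPSPEC value ten times too high.
mistaken = [Site("SITE-A", 100, 10.0, 0.9), Site("SITE-B", 100, 100.0, 0.5)]

print(round(ngi_weighted_availability(correct), 4))
print(round(ngi_weighted_availability(mistaken), 4))
```

Under these assumptions the inflated HEPSPEC value gives the less available site a much larger share of the total weight, which pulls the NGI figure down even though no site availability has changed.<br />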
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=33639Resource Centres OLA and Resource infrastructure Provider OLA reports2012-02-29T12:44:41Z<p>Fergadis: /* Resource Centre Performance */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Introduction =<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
= Performance reports=<br />
<br />
== 2012 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[ 02/12]<br />
|[ 03/12]<br />
|[ 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [02/12] <br />
| [03/12] <br />
| [04/12] <br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|-<br />
| '''ROD Performance Index''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 GGUS-79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2001-2012.pdf Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
|}<br />
<br />
== 2011 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| <!--Sep--> [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf Sep] <br />
| <!--Oct--> [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf Oct] <br />
| <!--Nov--> [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf Nov] <br />
| <!--Dec--> [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf Dec]<br />
|-<br />
| '''ROD Performance Index''' <br />
| <!--Sep-->NA <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 GGUS-76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2011-2011.pdf Newsletter] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 GGUS-77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 GGUS-78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
|}<br />
<br />
== 2010 ==<br />
<!--*[https://documents.egi.eu/document/42 May]|[https://documents.egi.eu/document/96 Jun]|[https://documents.egi.eu/document/130 Jul]|[https://documents.egi.eu/document/157 Aug]|[https://documents.egi.eu/document/219 Sep]|[https://documents.egi.eu/document/238 Oct]|[https://documents.egi.eu/document/266 Nov]|[https://documents.egi.eu/document/299 Dec]<br />
*[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE league tables)--><br />
<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''EGEE Statistics'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://edms.cern.ch/document/963325 January 2008 - April 2010]<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|}<br />
<br />
<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are generated automatically during the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile. The reports are produced in PDF format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts with a ticket filed to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the site receiving the ticket (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedure [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# once the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why they missed the OLA targets, or whose explanation is found inadequate, as well as sites that are suspended, will be recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, always calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and therefore appears in the report, or is uncertified and has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a consequence, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for those sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site published with the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example because of a mistake), the overall NGI weighted availability will be affected.<br />
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=33632Resource Centres OLA and Resource infrastructure Provider OLA reports2012-02-29T12:22:25Z<p>Fergadis: /* Resource Centre Performance */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Introduction =<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
= Performance reports=<br />
<br />
== 2012 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/1000 01/12]<br>re-computations<br>are in progress...<br />
|[ 02/12]<br />
|[ 03/12]<br />
|[ 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [02/12] <br />
| [03/12] <br />
| [04/12] <br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|-<br />
| '''ROD Performance Index''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 GGUS-79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2001-2012.pdf Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
|}<br />
<br />
== 2011 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| <!--Sep--> [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf Sep] <br />
| <!--Oct--> [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf Oct] <br />
| <!--Nov--> [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf Nov] <br />
| <!--Dec--> [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf Dec]<br />
|-<br />
| '''ROD Performance Index''' <br />
| <!--Sep-->NA <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 GGUS-76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2011-2011.pdf Newsletter] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 GGUS-77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 GGUS-78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
|}<br />
<br />
== 2010 ==<br />
<!--*[https://documents.egi.eu/document/42 May]|[https://documents.egi.eu/document/96 Jun]|[https://documents.egi.eu/document/130 Jul]|[https://documents.egi.eu/document/157 Aug]|[https://documents.egi.eu/document/219 Sep]|[https://documents.egi.eu/document/238 Oct]|[https://documents.egi.eu/document/266 Nov]|[https://documents.egi.eu/document/299 Dec]<br />
*[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE league tables)--><br />
<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''EGEE Statistics'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://edms.cern.ch/document/963325 January 2008 - April 2010]<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|}<br />
<br />
<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are generated automatically in the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile. The reports are produced in PDF format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/].<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts with a ticket filed to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory, the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
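<br />
The 10-working-day deadline in step 2 can be sketched in Python as follows. This is only an illustration and not part of the COD tooling: the function name <code>add_working_days</code> is hypothetical, and the sketch assumes that working days are Monday to Friday and that public holidays are ignored, neither of which is specified by the procedure.<br />

```python
from datetime import date, timedelta


def add_working_days(start: date, days: int) -> date:
    """Return the date that lies `days` working days after `start`.

    Assumption (not stated in the procedure): working days are
    Monday to Friday and public holidays are not taken into account.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        # weekday() returns 0 for Monday ... 6 for Sunday
        if current.weekday() < 5:
            remaining -= 1
    return current


if __name__ == "__main__":
    # Hypothetical example: ticket received on Monday 6 February 2012
    received = date(2012, 2, 6)
    print(add_working_days(received, 10))
```

With these assumptions, a ticket received on a Monday has its deadline on the Monday two weeks later.<br />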
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were not met, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified, and so appears in the report, or uncertified, and so must be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While it does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. As a consequence, any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, because of a mistake), the overall NGI weighted availability will be affected.<br />
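<br />
The effect described in item 3 can be illustrated with a short Python sketch. It is not the ACE implementation. It assumes that the weight of a site is its number of logical CPUs multiplied by its published HEPSPEC value, as stated above, and that the NGI figure is the weight-averaged availability of its sites, which is an assumption made here for illustration. The <code>Site</code> class, its field names and the sample figures are hypothetical.<br />

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    name: str
    logical_cpus: int    # number of logical CPUs published by the site
    hepspec: float       # published HEPSPEC value
    availability: float  # site availability as a fraction between 0 and 1


def site_weight(site: Site) -> float:
    # Weight as described in the text: logical CPUs times HEPSPEC value.
    return site.logical_cpus * site.hepspec


def ngi_weighted_availability(sites: List[Site]) -> float:
    # Assumption: the NGI figure is the weight-averaged site availability.
    total_weight = sum(site_weight(s) for s in sites)
    if total_weight == 0:
        raise ValueError("total weight is zero; check the published values")
    return sum(site_weight(s) * s.availability for s in sites) / total_weight


if __name__ == "__main__":
    correct = [Site("SITE-A", 100, 10.0, 0.9), Site("SITE-B", 300, 10.0, 0.5)]
    # Same sites, but SITE-B publishes a HEPSPEC value ten times too high.
    wrong = [Site("SITE-A", 100, 10.0, 0.9), Site("SITE-B", 300, 100.0, 0.5)]
    print(ngi_weighted_availability(correct))
    print(ngi_weighted_availability(wrong))
```

In this made-up example, the overstated HEPSPEC value of one site pulls the NGI figure towards that site's own availability, which is why the published numbers need to be checked.<br />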
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Dteam_vo&diff=33174Dteam vo2012-02-21T15:37:47Z<p>Fergadis: /* General Information */</p>
<hr />
<div>{{Template:Op menubar}} <br />
{{Template:Doc_menubar}} <br />
{{TOC_right}} <br />
<br />
= General Information =<br />
<br />
The DTEAM VO is an infrastructure VO that MUST be enabled by all EGI Resource Centres that support the VO concept for user authentication, as stated in the [https://documents.egi.eu/document/31 Resource Centre Operational Level Agreement]. It is meant for testing and troubleshooting of grid capabilities across EGI Resource Centres. Usage of the DTEAM VO is subject to the EGI [[SPG:Documents|Security Policies]]. <br />
<br />
*[http://operations-portal.egi.eu/vo/downloadAUP/file/dteam-AcceptableUsePolicy-20120221-1329838531075.txt DTEAM AUP]. <br />
*'''Get support''': to get support for the DTEAM VO, please [http://helpdesk.egi.eu/ open a ticket], select type ''Operations'', and set ''concerned VO'' to ''dteam''. If you have privileges, assign it to the Support Unit ''VOsupport unit''.<br />
*[https://voms.hellasgrid.gr:8443/vo/dteam/vomrs DTEAM VOMRS]<br />
<br />
= Recipes for VO/ROC/NGI/Site managers =<br />
<br />
== What users filling the '''dteam''' VO Registration form should do ==<br />
<br />
Select the appropriate '''Representative''' and '''Group''' for themselves. The Representative corresponding to their region is offered in a drop-down menu. <br />
<br />
'''Example:''' <br />
<blockquote style="background-color: lightgrey; border: solid thin grey; padding: 5px;">dteam users from Greece should select Kostas Koumantaros or Ioannis Liabotis as their Representative and /dteam/NGI_GRNET as their Group. </blockquote> <br />
Everybody is automatically registered under the root group /dteam in addition to any Group they might select. Nobody can de-assign them from this "root group" unless the VO-Admin marks them as "Denied" at registration or, later on, as "Suspended"; in that case they can't run any Grid jobs and they are deleted from the VOMS database. <br />
<br />
When users select additional Groups, the GroupOwners have nothing to do, if they have no objection. Users may select GroupRoles within a given Group as well. <br />
<br />
== What the VO-Admin can do ==<br />
<br />
Everything including VO member suspension/removal that nobody else can do! <br />
<br />
If you try to remove a member and the box-to-tick is grey, this means that the member has some authority (GroupOwner/Manager or Representative). You will first have to remove that function from him/her via "Manage VO Admin Roles". <br />
<br />
To remove the GroupOwner/Manager authority, use control/click on the relevant Group/Role (it will be blue)! <br />
<br />
== What the Representative can do ==<br />
<br />
Approve Candidates during the initial registration and handle Expired users. <br />
<br />
To do this, the Representative should either click on the link (s)he got in the email notification or go to the web interface, open the "Members" sub-menu, click on "Set status", search for "New" candidates and approve those assigned to him/her. <br />
<br />
The Representative selected by the user can assign another Representative before approving, as appropriate. <br />
<br />
'''Example:''' <br />
<blockquote style="background-color: lightgrey; border: solid thin grey; padding: 5px;">a DTEAM VO Candidate from a Russian LCG Site selected the SWE ROC manager as Representative. Gonzalo (SWE) can replace himself with Alexander (RDIG). </blockquote> <br />
== What the GroupOwners can do ==<br />
<br />
Group Owners can create groups/group roles and assign new Group Owner/Manager roles to members within the subgroups. If they decide that a user doesn't belong to their group(s), they can de-assign him/her at any time. <br />
<br />
'''Example:''' <br />
<blockquote style="background-color: lightgrey; border: solid thin grey;padding: 5px;">If Sven from DECH selects additional group /dteam/see, Kostas can move him out. </blockquote> <br />
== What the GroupManagers can do ==<br />
<br />
They can deassign users from their group at any time. <br />
<br />
http://cern.ch/dimou/lcg/vomrs/Groups-Roles.doc contains EGEE-era implementation details and plans on Groups/Roles. As VOMRS functionality will be implemented in VOMS, this document is becoming obsolete. <br />
<br />
== Proposed distribution of responsibilities ==<br />
<br />
{| border="1"<br />
|-<br />
! Operations manager and deputy <br />
! Operations centre staff <br />
! Site staff<br />
|-<br />
| GroupOwner,GroupManager, VO Representative <br />
| GroupManager <br />
| Group Member<br />
|}<br />
<br />
= Mini How-To =<br />
<br />
*To (De)Assign someone as Representative go to "Manage VO Admin Roles". <br />
*To (De)Assign someone as GroupOwner go to "Manage VO Admin Roles", search for the VO member and select the Group (s)he should own. <br />
*To Change Representative for all members go to "Change Representative", select the right DN from the drop-down menu, and click on each member. <br />
*To receive email notification for actions you need to take go to "Subscription" and select what you wish to be notified about.<br />
<br />
{| border="1"<br />
|-<br />
! <br />
! VO Admin <br />
! Representative <br />
! GroupOwner <br />
! GroupManager<br />
|-<br />
| Candidate <br />
| remove <br />
| <br />
| <br />
| <br />
|-<br />
| Applicant <br />
| Remove/approve/deny Assign/deassign to/from group and group role <br />
| Remove/approve/suspend/expire <br />
| Assign/deassign to/from group and group role<br />
|-<br />
| Member <br />
| Remove/approve/suspend/expire Assign/deassign to/from group and group role <br />
| expire from Institute but not from the VO <br />
| assign/deassign to/from group and group role <br />
| assign/deassign to/from group and group role<br />
|-<br />
| Member’s certificate <br />
| Remove/approve/deny/suspend <br />
| <br />
| assign/deassign to/from group and group role <br />
| assign/deassign to/from group and group role<br />
|}<br />
<br />
= Resources =<br />
<br />
*VOMRS Tutorials: http://www.uscms.org/SoftwareComputing/Grid/VO/tutorials.html <br />
*VOMRS Online Documentation: http://computing.fnal.gov/docs/products/vomrs/<br />
<br />
= Acknowledgements =<br />
<br />
Information provided in this page was collected from M. Dimou's VOMRS [http://dimou.web.cern.ch/dimou/lcg/registrar/TF/vomrs-tips.html tips page], with material provided by Tanya Levshina (VOMRS Project Leader and developer).</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=33062Resource Centres OLA and Resource infrastructure Provider OLA reports2012-02-20T11:00:00Z<p>Fergadis: /* Resource Centre Performance */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Introduction =<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
= Performance reports=<br />
<br />
== 2012 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/1000 01/12]<br>re-computations<br>are in progress...<br />
|[ 02/12]<br />
|[ 03/12]<br />
|[ 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [02/12] <br />
| [03/12] <br />
| [04/12] <br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|-<br />
| '''ROD Performance Index''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 GGUS-79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2001-2012.pdf Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
|}<br />
<br />
== 2011 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11] re-computations are in progress...<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| <!--Sep--> [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf Sep] <br />
| <!--Oct--> [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf Oct] <br />
| <!--Nov--> [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf Nov] <br />
| <!--Dec--> [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf Dec]<br />
|-<br />
| '''ROD Performance Index''' <br />
| <!--Sep-->NA <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 GGUS-76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2011-2011.pdf Newsletter] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 GGUS-77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 GGUS-78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
|}<br />
<br />
== 2010 ==<br />
<!--*[https://documents.egi.eu/document/42 May]|[https://documents.egi.eu/document/96 Jun]|[https://documents.egi.eu/document/130 Jul]|[https://documents.egi.eu/document/157 Aug]|[https://documents.egi.eu/document/219 Sep]|[https://documents.egi.eu/document/238 Oct]|[https://documents.egi.eu/document/266 Nov]|[https://documents.egi.eu/document/299 Dec]<br />
*[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE league tables)--><br />
<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''EGEE Statistics'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://edms.cern.ch/document/963325 January 2008 - April 2010]<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|}<br />
<br />
<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are generated automatically in the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile. The reports are produced in PDF format and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/].<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that need to provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts with a ticket filed to the COD Support Unit. The whole comment-gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the ticket being received by the site (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedures [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory, the ticket is closed <br />
# conversely, if the explanation is not given in due time or is found inadequate, the COD escalation procedure will be followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# after the 10-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both the COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were not met, or whose explanation is found inadequate, as well as sites that are suspended, are recorded on a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified and appears in the report, or is uncertified and has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. This means that any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, by mistake), the overall NGI weighted availability will be affected.<br />
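
The effect described in the last item can be illustrated with a minimal sketch. It assumes that each site's weight is its number of logical CPUs multiplied by its published HEPSPEC value, and that the NGI figure is the weight-averaged site availability. The averaging step, the <code>Site</code> class and the function names are illustrative assumptions, not the ACE implementation.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical record of the values a site publishes."""
    name: str
    logical_cpus: int     # number of logical CPUs published by the site
    hepspec: float        # published HEPSPEC value
    availability: float   # monthly availability in percent


def site_weight(site: Site) -> float:
    # Weight as described above: logical CPUs multiplied by HEPSPEC.
    return site.logical_cpus * site.hepspec


def weighted_availability(sites: list[Site]) -> float:
    # Assumption: the NGI value is the weight-averaged site availability.
    total_weight = sum(site_weight(s) for s in sites)
    if total_weight == 0:
        raise ValueError("no site publishes a non-zero weight")
    return sum(site_weight(s) * s.availability for s in sites) / total_weight


correct = [Site("SITE-A", 100, 10.0, 90.0), Site("SITE-B", 100, 10.0, 100.0)]
# Same sites, but SITE-B publishes a HEPSPEC value ten times too high.
mistaken = [Site("SITE-A", 100, 10.0, 90.0), Site("SITE-B", 100, 100.0, 100.0)]

print(weighted_availability(correct))
print(weighted_availability(mistaken))
```

With equal weights the two sites contribute equally. With the mistaken HEPSPEC value the better-performing site dominates the NGI figure, which is why the published numbers should be checked.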
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=Resource_Centres_OLA_and_Resource_infrastructure_Provider_OLA_reports&diff=33060Resource Centres OLA and Resource infrastructure Provider OLA reports2012-02-20T10:55:39Z<p>Fergadis: /* Resource Centre Performance */</p>
<hr />
<div>{{Template:Op menubar}}<br />
[[Category:Procedures]]<br />
[[Category:Service Level Management]]<br />
{{TOC_right}}<br />
<br />
Go to the '''[[Performance|main]]''' page for information on service level targets and related statistics.<br />
<br />
= Introduction =<br />
EGI Performance is measured using two parameters: Availability and Reliability ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 definition]). <br />
<br />
Availability/Reliability data is provided by the [https://grid-monitoring.egi.eu/myegi/sa/ MyEGI] portal. Note: GridView Availability/Reliability views are now obsolete.<br />
<br />
Availability/Reliability are measured at a Resource Centre (RC) level and at a Resource infrastructure Provider (RP) level (for NGIs and EIROs).<br />
<br />
[[SAM_Tests|SAM metric]] results are used for the calculation of Availability/Reliability.<br />
<br />
= Performance reports=<br />
<br />
== 2012 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/1000 01/12]<br />
|[ 02/12]<br />
|[ 03/12]<br />
|[ 04/12]<br />
|[ 05/12]<br />
|[ 06/12]<br />
|[ 07/12]<br />
|[ 08/12]<br />
|[ 09/12]<br />
|[ 10/12]<br />
|[ 11/12] <br />
|[ 12/12]<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Jan''' <br />
| '''Feb''' <br />
| '''Mar''' <br />
| '''Apr''' <br />
| '''May''' <br />
| '''Jun''' <br />
| '''Jul''' <br />
| '''Aug''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| [https://documents.egi.eu/secure/RetrieveFile?docid=1000&version=1&filename=EGI-core_services_availabilities-per_NGI-Jan2012.pdf 01/12] <br />
| [02/12] <br />
| [03/12] <br />
| [04/12] <br />
| [05/12] <br />
| [06/12] <br />
| [07/12] <br />
| [08/12] <br />
| [09/12] <br />
| [10/12] <br />
| [11/12] <br />
| [12/12]<br />
|-<br />
| '''ROD Performance Index''' <br />
| [https://ggus.eu/ws/ticket_info.php?ticket=79006 GGUS-79006] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2001-2012.pdf Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
| [GGUS0 n] <br />
[Newsletter] <br />
<br />
|}<br />
<br />
== 2011 ==<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''Jan'''<br />
|'''Feb'''<br />
|'''Mar'''<br />
|'''Apr'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://documents.egi.eu/document/332 01/11]<br />
|[https://documents.egi.eu/document/402 02/11]<br />
|[https://documents.egi.eu/document/465 03/11]<br />
|[https://documents.egi.eu/document/508 04/11]<br />
|[https://documents.egi.eu/document/593 05/11]<br />
|[https://documents.egi.eu/document/648 06/11]<br />
|[https://documents.egi.eu/document/716 07/11]<br />
|[https://documents.egi.eu/document/783 08/11]<br />
|[https://documents.egi.eu/document/820 09/11]<br />
|[https://documents.egi.eu/document/879 10/11]<br />
|[https://documents.egi.eu/document/905 11/11]<br />
|[https://documents.egi.eu/document/959 12/11] re-computations are in progress...<br />
|}<br />
<br />
=== Resource infrastructure Provider Performance ===<br />
<br />
{| cellspacing="0" cellpadding="5" border="1"<br />
|-<br />
| '''Service Level''' <br />
| '''Sep''' <br />
| '''Oct''' <br />
| '''Nov''' <br />
| '''Dec'''<br />
|-<br />
| '''top-BDII Availability/Reliability''' <br />
| <!--Sep--> [https://documents.egi.eu/public/RetrieveFile?docid=820&version=5&filename=EGI-core_services_availabilities-per_NGI%20NGIs%20core%20services.pdf Sep] <br />
| <!--Oct--> [https://documents.egi.eu/public/RetrieveFile?docid=879&version=4&filename=EGI-core_services_availabilities-per_NGI-Oct2011-1.pdf Oct] <br />
| <!--Nov--> [https://documents.egi.eu/public/RetrieveFile?docid=905&version=3&filename=EGI-core_services_availabilities-per_NGI-Nov2011.pdf Nov] <br />
| <!--Dec--> [https://documents.egi.eu/public/RetrieveFile?docid=959&version=1&filename=EGI-core_services_availabilities-per_NGI-Dec2011.pdf Dec]<br />
|-<br />
| '''ROD Performance Index''' <br />
| <!--Sep-->NA <br />
| <!--Oct-->[https://ggus.eu/ws/ticket_info.php?ticket=76116 GGUS-76116] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2011-2011.pdf Newsletter] <br />
<br />
| <!--Nov-->[https://ggus.eu/ws/ticket_info.php?ticket=77235 GGUS-77235] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
| <!--Dec--> [https://ggus.eu/ws/ticket_info.php?ticket=78078 GGUS-78078] <br />
[https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2012-2011.pdf Newsletter] <br />
<br />
|}<br />
<br />
== 2010 ==<br />
<!--*[https://documents.egi.eu/document/42 May]|[https://documents.egi.eu/document/96 Jun]|[https://documents.egi.eu/document/130 Jul]|[https://documents.egi.eu/document/157 Aug]|[https://documents.egi.eu/document/219 Sep]|[https://documents.egi.eu/document/238 Oct]|[https://documents.egi.eu/document/266 Nov]|[https://documents.egi.eu/document/299 Dec]<br />
*[https://edms.cern.ch/document/963325 January 2008 - April 2010] (EGEE league tables)--><br />
<br />
<br />
=== Resource Centre Performance ===<br />
{| border="1" cellspacing="0" cellpadding="5" <br />
|-<!--style="background-color: lightgray;"--><br />
| '''Service Level'''<br />
|'''EGEE Statistics'''<br />
|'''May'''<br />
|'''Jun'''<br />
|'''Jul'''<br />
|'''Aug'''<br />
|'''Sep'''<br />
|'''Oct'''<br />
|'''Nov'''<br />
|'''Dec'''<br />
|-<br />
| '''Availability/Reliability'''<br />
|[https://edms.cern.ch/document/963325 January 2008 - April 2010]<br />
|[https://documents.egi.eu/document/42 05/10]<br />
|[https://documents.egi.eu/document/96 06/10]<br />
|[https://documents.egi.eu/document/130 07/10]<br />
|[https://documents.egi.eu/document/157 08/10]<br />
|[https://documents.egi.eu/document/219 09/10]<br />
|[https://documents.egi.eu/document/238 10/10]<br />
|[https://documents.egi.eu/document/266 11/10] <br />
|[https://documents.egi.eu/document/299 12/10]<br />
|}<br />
<br />
<br />
<br />
== EGI overall Availability and Reliability ==<br />
It is available [https://documents.egi.eu/public/ShowDocument?docid=415 here] (xls file, data from May 01 2010)<br />
<br />
== Underperforming/Suspended RCs ==<br />
* List of [https://wiki.egi.eu/wiki/Availability_and_reliability_reports_metrics underperforming/suspended Resource Centres ] <br />
* List of [https://wiki.egi.eu/wiki/List_of_sites_for_which_the_availability_followup_procedures_were_not_applicable Resource Centres] to which the Availability follow-up procedure was not applicable<br />
<!--* [https://twiki.cern.ch/twiki/bin/view/EGEE/SuspendedSites List of suspended sites (2009)]--><br />
<br />
=Process for quality verification=<br />
<br />
* '''Generation of statistics'''<br />
Availability and reliability statistics are automatically generated in the first week of each month by the [https://wiki.egi.eu/wiki/External_tools#Availability_Computation_Engine Availability Computation Engine] (Gridview until May 2011) using the profile, in PDF format, and placed under [http://gvdev.cern.ch/GRIDVIEW/downloads/Reports/]. An Excel version is available at [http://gvdev.cern.ch/GVPC/Excel/ACE/]<br />
<br />
* '''Preliminary processing'''<br />
Once the reports are generated, sanity checks are performed by EGI SA1 (Task TSA1.8). After this step is completed, statistics are uploaded into the EGI document server. Links to monthly statistics will be provided on a regular basis at this wiki page.<br />
<br />
* '''Publication'''<br />
An announcement of the new results is distributed by EGI SA1 (TSA1.8) to the NGI Operations Managers mailing list. COD (TSA1.7) is responsible for supervising the statistics: it asks NGIs to follow up with sites that must provide comments when thresholds are not met, and it identifies sites eligible for suspension. This phase starts by filing a ticket to the COD Support Unit. The overall comments gathering process is handled through tickets.<br />
<br />
* '''Handling of sites below targets'''<br />
For a site that misses availability/reliability targets but is not eligible for suspension: <br />
<br />
# a child ticket is opened by the COD team and assigned to the respective NGI, asking for an explanation <br />
# the explanation must be produced within 10 working days of the site receiving the ticket (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs]). Reminders and escalation are performed in accordance with the COD escalation procedure [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure].<br />
# if the explanation is found satisfactory the ticket is closed <br />
# conversely, if the explanation is not given in due time, or is found inadequate, the COD escalation procedure is followed [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], with the site being suspended if neither the site nor the NGI replies to the ticket <br />
# the child ticket can then be closed <br />
# the parent ticket will be closed when all child tickets have been closed.<br />
<br />
* '''Handling of sites that are eligible for suspension'''<br />
For a site that is eligible for suspension: <br />
# a child ticket is opened by the COD team and assigned to the appropriate NGI, notifying that the site will be suspended within 10 working days (please see known issues section [https://wiki.egi.eu/wiki/Availability_and_reliability_monthly_statistics#Known_issues_and_recommendations_to_NGIs])<br />
# once the 10-working-day period has passed, during which normal COD escalation procedures apply [https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure], the site is suspended by COD unless the NGI has intervened or the EGI Chief Operations Officer (COO) objects. <br />
# in the case of NGI intervention, the site is not suspended if both COD and the COO agree with the reasoning provided by the NGI <br />
# the child ticket closes either when the site is suspended or when suspension is canceled <br />
# the parent ticket will be closed when all child tickets have been closed<br />
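
Both procedures above give a window of 10 working days from the ticket. A minimal sketch of how such a deadline could be computed follows. It assumes that working days are Monday to Friday and ignores public holidays; the function name <code>add_working_days</code> is hypothetical and is not part of any COD or GGUS tool.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date reached after counting `working_days` working days
    from `start` (assumption: Monday-Friday only, no public holidays)."""
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


# Example: a ticket received on Monday 6 February 2012.
ticket_received = date(2012, 2, 6)
deadline = add_working_days(ticket_received, 10)
print(deadline.isoformat())
```

A real deadline would also depend on local holidays and on how the day of receipt is counted, neither of which is specified on this page.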
<br />
* '''Wiki follow up page'''<br />
Sites that fail to explain why OLA targets were missed, sites whose explanation is found inadequate, and sites that are suspended are recorded in a wiki page [https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD_metrics]<br />
<br />
* '''Recomputation procedure'''<br />
Should there be doubts about the validity of Availability/Reliability reports, an RC/NGI can request recomputations according to the procedure defined at [https://wiki.egi.eu/wiki/PROC10]<br />
<br />
=Known issues and recommendations to NGIs=<br />
# ACE, like Gridview in the past, calculates the availability and reliability of a site as soon as it shows up in GOCDB and in the BDIIs, regardless of its certification status. When processing the data to generate the availability/reliability report, ACE uses the certification status of the site at that moment to decide whether the site is certified and appears in the report, or is uncertified and has to be excluded. '''Thus newly certified sites will get inaccurate Availability/Reliability figures for the month they were certified and all months before that.''' Because the certification status history is not currently available in the operations tools, until a solution is implemented NGIs should check whether they have sites affected by this issue and report it as the explanation. More information at [https://gus.fzk.de/ws/ticket_info.php?ticket=60594] and [https://gus.fzk.de/ws/ticket_info.php?ticket=60925]. '''As of December 2010, Gridview included a snapshot feature so that availability takes into account the topology on the last day of the month. While this does not solve the problem completely, it reduces its impact. However, ACE reports (used since May 2011) do not include the snapshot feature yet.''' <br />
# The calculations performed by ACE always '''take into account the information system status and GOCDB information at the time the calculation is performed, and not that of a certain checkpoint in the past'''. This means that any complete recalculation risks altering the results for sites that had correct numbers in the first place. Thus, until a solution is found, '''complete recalculations are avoided whenever possible''', and errors are fixed on a per-site basis for sites whose numbers are lower than they should be.<br />
# Weighted availability is calculated by multiplying the number of logical CPUs a site publishes by the published HEPSPEC value. It is important that these numbers are correct: if the HEPSPEC value for a site is too high or too low (for example, by mistake), the overall NGI weighted availability will be affected.<br />
<br />
=[[Documentation#OLAs|Operational Level Agreements]]=<br />
<br />
=Links=<br />
* Definition of Availability and Reliability and related computation algorithm ([https://tomtools.cern.ch/confluence/download/attachments/2261694/Ace_Service_Availability_Computation.pdf?version=1&modificationDate=1314361543000 paper])<br />
<br />
* NEW! [https://wiki.egi.eu/wiki/Availability_and_reliability_tests List of Nagios tests] used for availability computation<br />
* [https://tomtools.cern.ch/confluence/display/SAM/ACE Availability Computation Engine (ACE) home page]<br />
<br />
*[https://wiki.egi.eu/wiki/Availability_and_reliability_internal_procedure_for_COD COD procedure for oversight of availability and reliability performance]<br />
* Impact of change of suspension policy for under-performing sites: [https://wiki.egi.eu/wiki/Availability_and_reliability_threshold_change_impact report]<br />
<br />
<!-- DEPRECATED LINKS<br />
* [https://twiki.cern.ch/twiki/pub/LCG/GridView/Gridview_Service_Availability_Computation.pdf paper]<br />
* [https://twiki.cern.ch/twiki/bin/view/LCG/ACE (Old) Availability Computation Engine] (ACE)<br />
* [https://twiki.cern.ch/twiki/bin/view/EGEE/MonthlyAvailability EGEE-III Comments on site availability and reliability statistics]<br />
*[http://gvdev.cern.ch/GVPC/Excel/ '''(DEPRECATED)''' GridView availability/reliability report generator] (providing access to the database including Nagios results for OPS and SAM results for VOs)<br />
*[http://gridview015.cern.ch/GVPC/Excel/ACE/ ACE report generator]<br />
*[https://gvdev.cern.ch/ACEVAL/ace_index.php ACE visualization portal]--><br />
{{Template:Creative_commons}}</div>Fergadishttps://wiki.egi.eu/w/index.php?title=EGI-InSPIRE:SA1.8-QR7&diff=32795EGI-InSPIRE:SA1.8-QR72012-02-13T14:06:30Z<p>Fergadis: /* 6. NGI Monthly Availability and Reliability Results */</p>
<hr />
<div>__NOTOC__<br />
= 1. Task Meetings =<br />
<!--<br />
Notes. Report here all task-specific meetings held. This includes (a) face-to-face meetings and (b) phone meetings. Make sure that for all task meetings participants are ALWAYS recorded either on indico from the registrants’ list, or in the minutes. <br />
OMB meeting will be reported under task TSA1.1 only. Monday Operations meetings need to be reported under task TSA1.3 only. Training events will be recorded in the training event registry and need not be mentioned here.<br />
--><br />
{| border="1" cellspacing="0" cellpadding="5" align="center"<br />
! style="width: 10%" | Date (dd/mm/yyyy)<br />
! style="width: 20%" | Url Indico Agenda<br />
! style="width: 20%" | Title<br />
! style="width: 50%" | Outcome<br />
|-<br />
|<br />
|<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 2. Main Achievements = <br />
<!--<br />
Note. This is a detailed account of progress over the previous quarter of activities within the task. <br />
--><br />
* Continued publication of A/R league tables - OPS VO<br />
<br />
* Continued publication of A/R reports for NGI Core services (top-BDII)<br />
<br />
* The [https://documents.egi.eu/document/463 RP OLA document] has been approved at [https://www.egi.eu/indico/conferenceDisplay.py?confId=615 OMB (2011-10-25)]<br />
<br />
* Starting from Jan. 2012 the minimum Top-BDII monthly availability of NGIs has to be 99%<br />
<br />
* A new A/R profile, ROC_CRITICAL (a copy of WLCG_CREAM_LCGCE_CRITICAL with the addition of the "org.bdii.Freshness" test), was introduced in November<br />
<br />
* The impact of adding the above test was assessed in the A/R statistics for Nov. and Dec. using the MyEGI PI<br />
<br />
* [https://www.egi.eu/indico/conferenceDisplay.py?confId=617 OMB (2011-12-20)] decided that the new ROC_CRITICAL profile will replace the current profile WLCG_CREAM_LCGCE_CRITICAL starting with the A/R statistics of Jan. 2012<br />
<br />
* Starting from Jan. 2012, each month the COD team will send a GGUS ticket to each NGI listing its sites with more than 10% UNKNOWN<br />
<br />
* [[PROC10]] updated to clarify that re-computation requests can be submitted before the end of the affected month<br />
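
Two of the achievements above are simple threshold checks: the 99% minimum Top-BDII monthly availability for NGIs, and the 10% UNKNOWN limit for sites. A minimal sketch of both follows. It assumes the monthly figures are already available as plain dictionaries of percentages; the data layout, the names and the sample values are hypothetical and do not reflect the MyEGI programmatic interface.

```python
TOP_BDII_MIN_AVAILABILITY = 99.0  # percent, minimum from Jan. 2012
UNKNOWN_THRESHOLD = 10.0          # percent of UNKNOWN status per site


def ngis_below_top_bdii_target(availability_by_ngi, minimum=TOP_BDII_MIN_AVAILABILITY):
    """Return the NGIs whose Top-BDII monthly availability is below the minimum."""
    return sorted(ngi for ngi, value in availability_by_ngi.items() if value < minimum)


def sites_above_unknown(unknown_by_site, threshold=UNKNOWN_THRESHOLD):
    """Return the sites whose UNKNOWN percentage is strictly above the threshold."""
    return sorted(site for site, value in unknown_by_site.items() if value > threshold)


# Hypothetical monthly figures, for illustration only.
top_bdii = {"NGI_X": 99.6, "NGI_Y": 98.2, "NGI_Z": 99.0}
unknown = {"SITE-A": 12.5, "SITE-B": 10.0, "SITE-C": 3.0}

print(ngis_below_top_bdii_target(top_bdii))
print(sites_above_unknown(unknown))
```

Whether a value exactly on a threshold counts as a violation is not stated on this page. The sketch treats exactly 99% as meeting the target and exactly 10% as not above the limit.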
<br />
== EGI Catch-All Core Service ==<br />
* Operation of WMS, LB and Top-BDII for site certification<br />
<br />
* Operation of certification Catch-All portal<br />
<br />
* Operation of VOMS/VOMRS for Dteam VO<br />
<br />
= 3. Issues and Mitigation =<br />
<!-- fill the table below --><br />
<br />
{| border="1" cellspacing="0" cellpadding="2"<br />
|-<br />
!scope="col"| Issue Description<br />
!scope="col"| Mitigation Description<br />
|-<br />
|<br />
|<br />
|-<br />
|}<br />
<br />
= 4. Plans for the next period =<br />
<!-- provide your text below --><br />
* The milestone MSA 418 "Operational Level Agreements (OLAs) within the EGI production infrastructure" is planned for 2012Q1, with a deadline at the end of the first month of 2012Q2<br />
<br />
* The 2nd release of the RP OLA will be finalized early 2012Q1<br />
<br />
* Establish direct communication channels with the people that are operating services at the EGI.eu level<br />
<br />
* Continue handling the validation and distribution of the monthly A/R reports and maintaining the relevant wiki page<br />
<br />
= 5. Number of sites suspended =<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" align="center" style="text-align: center;"<br />
!align="left" | Month !! Suspended sites<br />
|-<br />
|align="left" | October 2011 || 1<br />
|- <br />
|align="left" | November 2011 || 0<br />
|-<br />
|align="left" | December 2011 || 0<br />
|-<br />
|}<br />
<br />
= 6. NGI Monthly Availability and Reliability Results =<br />
<br />
{| border="1" cellspacing="0" cellpadding="5" align="center" style="text-align: center;"<br />
!rowspan="2" align="left" | Month !!colspan="2" | Site Middleware Services !!colspan="2" | Core Middleware Services<br />
|-<br />
! Availability !! Reliability !! Availability !! Reliability<br />
|-<br />
|align="left" | November 2011 || 94.24 || 95.32 || 94.75 || 95.12<br />
|-<br />
|align="left" | December 2011 || 94.80 || 95.37 || 95.45 || 95.60<br />
|-<br />
|align="left" | January 2012 || 95.45 || 96.18 || 96.24 || 96.64<br />
|-<br />
!align="left" | Average || 94.83 || 95.62 || 95.48 || 95.79<br />
|-<br />
|}</div>Fergadis