Revision as of 15:02, 19 January 2012

ROD OLA metric

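The ROD OLA metric was introduced to track the level of Grid Oversight service delivered according to the Resource Provider OLA (https://documents.egi.eu/secure/ShowDocument?docid=463).

The metric was accepted during the Technical Forum 2011 in Lyon and is available on the EGI Operations Portal (https://operations-portal.in2p3.fr/dashboard/rodOlaMetrics).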

How is the metric calculated?

The metric is calculated monthly from the data gathered by the EGI Operations Portal. Weekends are not taken into account.

The ROD OLA metric is the sum of:

Number of appearances of expired tickets
Number of appearances of alarms older than 72 h

The threshold is set to 10 items. Above this value, the ROD team has to provide an explanation and an improvement plan.
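
As a rough illustration of the calculation described above, the following Python sketch sums the two counters over the working days of a month and compares the result with the threshold. The `DailyCount` record, its field names, and the per-day granularity are assumptions made for this example; they do not reflect how the Operations Portal actually stores or processes its data.

```python
from dataclasses import dataclass
from datetime import date

THRESHOLD = 10  # items per month, as stated in the text


@dataclass
class DailyCount:
    """Hypothetical per-day record; the field names are assumptions."""
    day: date
    expired_tickets: int  # appearances of expired tickets on that day
    old_alarms: int       # appearances of alarms older than 72 h on that day


def rod_ola_metric(counts):
    """Sum both counters over all days that are not Saturday or Sunday."""
    return sum(
        c.expired_tickets + c.old_alarms
        for c in counts
        if c.day.weekday() < 5  # Monday=0 ... Friday=4
    )


def needs_explanation(metric):
    """True when the metric is strictly above the threshold."""
    return metric > THRESHOLD
```

In this sketch a month with a total of exactly 10 items does not trigger the explanation requirement, because the text says "above this value".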

Performance reports

2012

2011

	Oct	Nov	Dec
GGUS	76116	77235
Newsletter	11.2011	12.2011

Recalculation procedure in case of intervention on the regional NAGIOS or dashboard

Prerequisite:

In case of synchronization problems with the regional dashboard, the NGI should create a GGUS ticket to the Operations Portal team.
In case of problems with the regional Nagios, the NGI should create a GGUS ticket to the SAM team.
In case of work carried out on the regional Nagios or dashboard, the NGI should declare a downtime in GOC DB.

Procedure steps:

When the NGI gets a ticket from COD about ROD performance, the NGI should provide the GGUS ticket or a link to the Nagios or dashboard downtime page in GOC DB.
Based on the GGUS trouble tickets referenced in the prerequisites, on the GGUS ticket opened by the NGI to MyEGI requesting A/R recalculation, or on the GOC DB service downtime entry, COD knows when the problem occurred and can remove the metric items for the given days from the final PDF report.
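
The recalculation step can be pictured as filtering out the metric items that fall on days affected by the declared problem. The sketch below is only an assumption about how such a filter might look; the `(day, kind)` item tuples and the `excluded_periods` list are hypothetical and stand in for the dates taken from the GGUS ticket or the GOC DB downtime entry.

```python
from datetime import date


def recalculate(items, excluded_periods):
    """Drop items whose day lies inside any excluded (start, end) period.

    items: list of (day, kind) tuples, e.g. (date(2011, 11, 8), "expired_ticket")
    excluded_periods: list of (start, end) dates, both ends inclusive
    """
    def is_excluded(day):
        return any(start <= day <= end for start, end in excluded_periods)

    return [(day, kind) for day, kind in items if not is_excluded(day)]
```

The corrected metric is then simply the number of items that remain after filtering.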

Future plans

In the future the metric will also include the number of alarms closed in NON-OK status without explanation. This will need some implementation effort.

Issues to be implemented:

Taking holiday periods into account in alarm ageing
Automatic check whether the site/node is in downtime when an alarm is closed
Automatic check whether the node is not in production when an alarm is closed
In case of SCHEDULED interventions, the monthly metric calculation should automatically take the scheduled downtime into account. When the metric is computed, the application performing the calculation should access the GOC PI to determine which regional Nagios machines were in downtime, and include that restriction in the calculation.
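
To make the first item in the list above more concrete, here is a minimal sketch of alarm ageing that skips weekends and a given set of holidays. It is not an existing implementation: the function name, the use of naive `datetime` values, and the `holidays` set are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def working_age_hours(raised, now, holidays=frozenset()):
    """Hours between `raised` and `now`, skipping weekends and listed holidays.

    raised, now: naive datetime objects in the same time zone (assumption)
    holidays: set of datetime.date objects to be skipped (hypothetical input)
    """
    total = timedelta()
    cursor = raised
    while cursor < now:
        # End of the current calendar day, or `now` if it comes first.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), datetime.min.time()
        )
        segment_end = min(next_midnight, now)
        # Count the segment only on working days that are not holidays.
        if cursor.weekday() < 5 and cursor.date() not in holidays:
            total += segment_end - cursor
        cursor = segment_end
    return total.total_seconds() / 3600
```

Under these assumptions, an alarm raised on a Friday at noon would reach an age of 72 h only on the following Wednesday at noon rather than on Monday.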


Difference between revisions of "Service Level Target - ROD performance index"
