
Difference between revisions of "MAN04 Tool Intervention Management"


Revision as of 10:13, 31 May 2011



Title: Management of central operational tool unscheduled downtimes
Document link: https://wiki.egi.eu/wiki/MAN03_Tool_Intervention_Management
Last review: Tferrari 13:55, 8 March 2011 (UTC)
Policy Group Acronym: OMB
Policy Group Name: Operations Management Board
Contact Person: E. Imamagic
Document Status: draft
Approved Date: specify
Procedure Statement: This manual provides information on how to manage central operational tool unscheduled downtimes.

Management of central operational tool unscheduled downtimes

The purpose of this document is to describe the intervention to be carried out in case of an unscheduled failure of a central operational tool.

Scope

This manual applies only to unscheduled downtimes of central operational tools. List of central operational tools

Scheduled downtimes are managed according to existing procedures.

Procedure

In the following sections

Short downtime

Description: The service fails and recovers before the administrator manages to react (e.g. a short power or network outage).

Action: The administrator should enter an UNSCHEDULED downtime in GOCDB with a detailed description of
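As an illustration of the action above, the following minimal Python sketch shows how an administrator might collect the facts of a short outage and compose the free-text description before entering the downtime in GOCDB by hand. This revision of the manual does not specify what the description must contain, so the class name `UnscheduledDowntime`, its fields, and the wording of the generated text are all assumptions made for this example. The sketch does not call any GOCDB interface.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class UnscheduledDowntime:
    """Hypothetical local record of an outage; not a GOCDB data structure."""
    service: str      # name of the affected central tool (illustrative)
    start: datetime   # when the failure began (UTC)
    end: datetime     # when the service recovered (UTC)
    cause: str        # short free-text cause, e.g. "short network outage"

    @property
    def duration(self) -> timedelta:
        """Length of the outage."""
        return self.end - self.start

    def description(self) -> str:
        """Compose a free-text description to paste into the downtime entry."""
        minutes = int(self.duration.total_seconds() // 60)
        return (
            f"UNSCHEDULED downtime of {self.service}: {self.cause}. "
            f"From {self.start:%Y-%m-%d %H:%M} to {self.end:%Y-%m-%d %H:%M} UTC "
            f"({minutes} min)."
        )


if __name__ == "__main__":
    # Example values are invented for demonstration only.
    outage = UnscheduledDowntime(
        service="example-tool",
        start=datetime(2011, 5, 31, 10, 0, tzinfo=timezone.utc),
        end=datetime(2011, 5, 31, 10, 13, tzinfo=timezone.utc),
        cause="short network outage",
    )
    print(outage.description())
```

Keeping start time, end time, and cause as separate fields makes it easy to state the outage window precisely even when the downtime is recorded after the service has already recovered.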

Notification templates

1. Service failure notification

2. Prolonged service failure notification

3. Service recovery notification without detailed information

4. Post mortem analysis

Revision History