Difference between revisions of "PROC12 Production Service Decommissioning"
(→Steps) |
(→Steps) |
||
Line 126: | Line 126: | ||
| RC | | RC | ||
| | | | ||
#[''If the service is a CE or a workload management service''] After the announcement of the service decommissioning the Resource Centre | #[''If the service is a CE or a workload management service''] After the announcement of the service decommissioning the Resource Centre MAY disable VO job submissions to prevent further VO activity - except monitoring jobs. | ||
#:[''If the service is a storage or data management service''] After the announcement of the service decommissioning the Resource Centre | #:[''If the service is a storage or data management service''] After the announcement of the service decommissioning the Resource Centre MAY disable VO write access to prevent further VO activity - except for infrastructure VOs (where possible). | ||
|- valign="top" | |- valign="top" | ||
|3 bis | |3 bis | ||
| VO | | VO | ||
| | | | ||
# Between the announcement of the decommissioning and the beginning of the downtime, the VO Manager SHOULD check the volume of data stored by the VO at the site. If it is large enough to require more than one month to be moved, the VO Manager can ask to reschedule the downtime period. | # [''If the service is a storage element''] Between the announcement of the decommissioning and the beginning of the downtime, the VO Manager SHOULD check the volume of data stored by the VO at the site. If it is large enough to require more than one month to be moved, the VO Manager can ask to reschedule the downtime period. | ||
#* If no communications are sent to the Resource Centre by the first week of downtime, the schedule can be considered agreed by all VO Managers. | #* If no communications are sent to the Resource Centre by the first week of downtime, the schedule can be considered agreed by all VO Managers. | ||
#* Any request to reschedule MUST be supported by technical reasons (e.g. total amount of data to move / Site max data transfer throughput) | #* Any request to reschedule MUST be supported by technical reasons (e.g. total amount of data to move / Site max data transfer throughput) | ||
# [''If the service is a central service like VOMS or LFC for a given VO''] VO Manager, Resource Centre Operations Manager and Resource Infrastructure Operations Manager should discuss finding a new Resource Centre for hosting these services, taking into account pre-existing agreement between VO and NGI. For international VOs, this discussion could be held at the EGI level, especially if a solution cannot be easily found within that Resource Infrastructure Provider. | |||
|- valign="top" | |- valign="top" | ||
| 4 | | 4 | ||
| RC | | RC | ||
| | | | ||
#According to the dates announced in the broadcast or differently agreed in step '''3 bis''', the Resource Centre puts | #According to the dates announced in the broadcast or differently agreed in step '''3 bis''', the Resource Centre puts the service in downtime to prevent any further usage. This downtime shall last for the scheduled period or until phase 5 is over - whichever is shorter. | ||
#* The downtime must be recorded in the ''master ticket'' <br> | #* The downtime must be recorded in the ''master ticket'' <br> | ||
|- valign="top" | |- valign="top" | ||
| 5 | | 5 | ||
| | | RC | ||
| | | | ||
If the service is a storage elements (SEs): | |||
*Once the SE is closed for write access the Resource Centre Operations Manager opens N child tickets of the procedure's ''master ticket'' to each of the N VO managers of the N VOs the SE supports. | |||
*Once the | |||
*The VOs are given up to 4 weeks - or the amount of time agreed in step '''3 bis''' - to retrieve their data from the Resource Centre. During these 4 weeks, the Resource Centre should make sure that the SE works for the different VOs to allow them to retrieve their files. The VO managers can specify any specific requirements in their child ticket. For instance: | *The VOs are given up to 4 weeks - or the amount of time agreed in step '''3 bis''' - to retrieve their data from the Resource Centre. During these 4 weeks, the Resource Centre should make sure that the SE works for the different VOs to allow them to retrieve their files. The VO managers can specify any specific requirements in their child ticket. For instance: | ||
**Request in the child ticket from the Resource Centre Operations Manager the time limit needed to retrieve data. | **Request in the child ticket from the Resource Centre Operations Manager the time limit needed to retrieve data. | ||
Line 161: | Line 155: | ||
**VO Manager MUST communicate to the Resource Centre - if possible using the GGUS child ticket - when the data moving is completed. | **VO Manager MUST communicate to the Resource Centre - if possible using the GGUS child ticket - when the data moving is completed. | ||
<br> | <br> | ||
|- valign="top" | |- valign="top" | ||
| | | 7 | ||
| OC | | OC | ||
| | | | ||
Line 181: | Line 171: | ||
<br> <!--|- valign="top" | <br> <!--|- valign="top" | ||
| | | 8 | ||
| OC | | OC | ||
| | | |
Revision as of 10:57, 3 February 2012
Title | Service Decommissioning Procedure |
Document link | |
Version - last modified | |
Policy Group Acronym | OMB |
Policy Group Name | Operations Management Board |
Contact Person | operational-documentation@mailman.egi.eu |
Document Status | DRAFT |
Approved Date | N/A |
Procedure Statement | A procedure for the steps involved to decommission a Service operated by a Resource Centre in the EGI infrastructure. |
Grid Service Decommissioning Procedure
This procedure outlines good practice between a Resource Centre (aka site) and its users when a grid service is being decommissioned.
Definitions
- Resource Centre refers to the definition in the "Resource Centre OLA".
- In this document, the term "site" is deprecated, and Resource Centre has been used in its place.
- Other entities involved in this procedure are defined in the EGI Glossary.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Entities involved in the procedure
- Resource Centre Operations Manager: person who is responsible for initiating the decommissioning procedure by contacting the Resource Infrastructure Operations Manager.
- Resource Infrastructure Operations Manager (aka NGI manager): person who is responsible for reaching an agreement with the Resource Centre about the timeline, in order to minimize the impact on the user communities and infrastructure.
- Virtual Organizations (VOs): Data and other stateful objects of the supported VOs may be stored at the Resource Centre.
- Virtual Organization (VO) managers: persons who are responsible for retrieving this data from the Resource Centre in due time. Tracking is done through their support unit in GGUS. If no such support unit is available, the VOs should be contacted directly using the contact information available in the VO ID card.
The Resource Infrastructure Operations Manager can determine the level of involvement of other actors together with the Resource Centre Operations Manager.
Contact information
- EGI Resource Infrastructure Providers are listed on the EGI web site
- A list of EGI Operations Centres with their respective contact information is available from the GOCDB
- EGI CSIRT: egi-csirt-team (at) mailman.egi.eu
- The list of VOs served by a specific Resource Centre and their ID cards can be retrieved from the Operations Portal.
- The VO managers and their contact information for a specific VO can be retrieved from the Operations Portal.
Actions and responsibilities
Resource Centre Operations Manager
- The Operations Centre is responsible for decommissioning the service.
- The Operations Centre is responsible for updating the corresponding entries in the EGI configuration repository GOCDB.
- The Resource Centre Operations Manager is REQUIRED to provide the Resource Centre information needed to complete the decommissioning process, and he/she is responsible for its accuracy and maintenance.
Resource Infrastructure Operations Manager
- A Resource Infrastructure Provider is REQUIRED to be responsible for all Resource Centres within its respective jurisdiction. For this reason the Resource Infrastructure Provider is responsible for ensuring that all the Resource Centres follow this procedure for service decommissioning.
VOs and VO managers
- give the users the relevant information about the decommissioning (deadlines, involved resources, files, how to handle it)
- follow-up and support users in their file migration procedures until the deadline
- inform Resource Centre about the status of the migration(s)
Workflow
Service Centre decommissioning
Steps
- Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
- Actions tagged RIP are the responsibility of the Resource Infrastructure Operations Manager.
- Actions tagged OC are the responsibility of the Operations Centre.
# | Responsible | Action |
---|---|---|
1 | RC |
|
2 | RC |
|
3 | RC |
|
3 bis | VO |
|
4 | RC |
|
5 | RC |
If the service is a storage element (SE):
|
7 | OC |
|
9 | RC |
|
10 | OC |
|
11 | OC |
|
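
Step 3 bis requires that a reschedule request be supported by technical reasons, for example the total amount of data to move divided by the site's maximum data transfer throughput. The following is a minimal sketch of that estimate, not part of the procedure itself. The function names, the decimal unit convention (1 TB = 10^6 MB) and the 28-day default window (the "up to 4 weeks" of step 5) are illustrative assumptions, and the result assumes the maximum throughput is sustained for the whole transfer.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, throughput_mb_s: float) -> float:
    """Estimate the days needed to move data_tb terabytes at a sustained
    throughput of throughput_mb_s megabytes per second.

    Assumption: decimal units, 1 TB = 1_000_000 MB.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    total_mb = data_tb * 1_000_000
    return total_mb / throughput_mb_s / SECONDS_PER_DAY


def needs_reschedule(data_tb: float, throughput_mb_s: float,
                     window_days: float = 28) -> bool:
    """Return True if the estimated transfer time exceeds the retrieval window.

    window_days defaults to 28 (hypothetical default taken from the
    "up to 4 weeks" retrieval period in step 5).
    """
    return transfer_days(data_tb, throughput_mb_s) > window_days
```

A VO manager could quote the output of `transfer_days` in the GGUS child ticket when asking for a longer downtime period; real transfers rarely sustain the maximum throughput, so the estimate is a lower bound.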