The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can contact the EGI SDIS team at operations @ egi.eu.

PROC11 Resource Centre Decommissioning

Revision as of 16:12, 5 August 2021

This page is Deprecated; the content has been moved to https://confluence.egi.eu/display/EGIPP/PROC11+Resource+Centre+Decommissioning

Title: Resource Centre Decommissioning
Document link: https://wiki.egi.eu/wiki/PROC11
Last modified: 8 June 2016
Policy Group Acronym: OMB
Policy Group Name: Operations Management Board
Contact Group: operations@egi.eu
Document Status: Approved
Approved Date: 28/02/2012
Procedure Statement: A procedure for the steps involved to decommission Resource Centres (sites) in the EGI infrastructure.
Owner: Matthew Viljoen



Overview

This procedure defines the good practices between a Resource Centre (aka site) and its users when the resource centre/site is being decommissioned.

Decommissioning a Resource Centre in an orderly manner takes up to four months. Note: the site hardware decommissioning can start after one month.

Note: A separate document provides the process for Resource Centre Registration and Certification.

Definitions

Please refer to the EGI Glossary for the definitions of the terms used in this procedure.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Entities involved in the procedure

  • Resource Centre Operations Manager: person who is responsible for initiating the decommissioning procedure by contacting the Resource Infrastructure Operations Manager.
  • Resource Infrastructure Operations Manager (aka NGI operations manager): person who is responsible for finding an agreement with the Resource Centre about the timeline, in order to minimize the impact on the user communities and infrastructure. The Resource Infrastructure Operations Manager is responsible for ensuring that this procedure and related procedures are properly followed.
  • Virtual Organizations (VOs): Data and other stateful objects of the supported VOs may be stored at the Resource Centre.
  • Virtual Organization (VO) managers: persons who are responsible for retrieving this data from the Resource Centre in due time. Tracking is done through their support unit in GGUS. If no such support unit is available, the VOs should be contacted directly using the contact information available in the VO ID card.
  • Operations Centre: entity which is technically responsible for carrying out the main ticket and database updates.

The Resource Infrastructure Operations Manager can determine the level of involvement of other actors together with the Resource Centre Operations Manager.

Contact information

  • EGI Operations: operations (at) mailman.egi.eu
  • EGI Resource Infrastructure Providers are listed on the EGI web site
  • A list of EGI Operations Centres with their respective contact information is available from the GOCDB
  • EGI CSIRT: egi-csirt-team (at) mailman.egi.eu
  • The list of VO's served by a specific Resource Centre and their ID cards can be retrieved from the Operations Portal.
  • The VO managers and their contact information for a specific VO can be retrieved from the Operations Portal.

Actions and responsibilities

Resource Centre Operations Manager

  1. A Resource Centre Operations Manager is responsible for all Resource Centres (RC's) within its respective domain.
  2. When a Resource Centre is to be decommissioned, its Resource Centre Operations Manager is REQUIRED to announce the intention to cease operation
    • to the respective NGI, if the Resource Centre is located in Europe,
    • to the respective Resource Infrastructure Provider active in the relevant geographical area, if the Resource Centre is outside Europe.
  3. The Resource Centre Operations Manager is REQUIRED to provide the necessary Resource Centre information needed to complete the decommission process, and he/she is responsible for its accuracy and maintenance.
  4. The Resource Centre Operations Manager MUST attend to Resource Centre decommissioning applications and MUST provide feedback to the requesting partners in a timely manner, accepting or rejecting the requests received.

Resource Infrastructure Operations Manager

  1. A Resource Infrastructure Operations Manager is REQUIRED to be responsible for all Resource Centres within its respective jurisdiction. For example, an NGI is responsible for all Resource Centres in its respective country.


VOs and VO managers

VO managers are expected to:

  1. Give the users the relevant information about the decommissioning (deadlines, involved resources, files, how to handle it).
  2. Follow up and support users in their file migration procedures until the deadline.
  3. Inform the Resource Centre about the status of the migration(s).

Operations Centre

  1. The Operations Centre is responsible for decommissioning the Resource Centre.
  2. The Operations Centre is responsible for updating the corresponding entries in the EGI configuration repository GOCDB.
  3. The Operations Centre MUST keep Resource Centre information up to date and in all operations tools as needed, such as the local NAGIOS server for monitoring of certified Resource Centres, the local helpdesk (if available) for the registration of the Resource Centre support staff, etc.

Workflow

The various steps required by both the Resource Infrastructure Operations Manager and the Resource Centre Operations Manager are explained in the tables below. The procedure below covers the transition from the Certified to the Closed status. The transition from the Suspended to the Closed status can be derived analogously.

The general status flow that a Resource Centre is allowed to follow is illustrated by the following diagram. Information on Resource Centre status and on how to manipulate it is available from GOCDB Documentation.

[Figure: SiteStatusFlow.png, Resource Centre status flow diagram]


A Resource Centre cannot be in Candidate state for more than two months, or in Suspended state for longer than four months. After this period the Resource Centre SHOULD be closed.
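
The time limits above can be expressed as a small check. The following is a minimal sketch, not part of this procedure or of GOCDB: the names `MAX_DAYS` and `is_overdue` are hypothetical, and approximating a month as 30 days is an assumption made only for illustration.

```python
from datetime import date

# Maximum time a Resource Centre may stay in a given status, in days.
# Assumption: one month is approximated as 30 days.
MAX_DAYS = {"Candidate": 60, "Suspended": 120}


def is_overdue(status: str, since: date, today: date) -> bool:
    """Return True if the Resource Centre has exceeded the limit for its
    status, i.e. it SHOULD be closed according to the rule above."""
    limit = MAX_DAYS.get(status)
    if limit is None:
        # The procedure states no time limit for this status.
        return False
    return (today - since).days > limit


if __name__ == "__main__":
    # Hypothetical example: suspended on 1 January 2016, checked on 8 June 2016.
    print(is_overdue("Suspended", date(2016, 1, 1), date(2016, 6, 8)))
```

A real check would read the status and the date of the last status change from the configuration repository rather than hard-coding them.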

Resource Centre decommissioning

Steps

  • Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
  • Actions tagged RP are the responsibility of the Resource Infrastructure Operations Manager.
  • Actions tagged OC are the responsibility of the Operations Centre.
# Responsible Action
1 RC
  1. The Resource Centre Operations Manager informs his/her Resource Infrastructure Operations Manager that the Resource Centre is going to be decommissioned, and together they agree on the plan for decommissioning it.
    • The Resource Centre Operations Manager opens a GGUS ticket, assigned to the Support Unit of the Operations Centre the Resource Centre belongs to, which will be used as Parent ticket to track the whole process. The ticket must remain in an open status until the site is closed in GOCDB. This Parent ticket can be used as parent ticket for the Resource Centre's service decommissioning procedures (see PROC12, step 1).
2 RC
  1. The Resource Centre Operations Manager should use the broadcast tool (https://operations-portal.egi.eu/broadcast) to announce to both VO managers and VO users of the VOs supported by the RC (excluding the Ops and dteam VOs) that it is starting the decommissioning procedure:
    • Announce a detailed (agreed) timeline for the decommissioning and that the Resource Centre will schedule downtimes of its resources or site downtime to prevent any further usage. The timeline must clearly list the deadlines for the VO managers' actions.
    • The ticket should also list all the Resource Centre's services being decommissioned and the scheduled date of decommissioning (this supersedes PROC12 step 2).
    • The timeline is recorded in the Parent ticket (including the timelines of all the services).
    • The broadcast link is recorded in the Parent ticket.
    • The downtime should start no earlier than 15 days and no later than one month after the broadcast.
    • State that the aim is to make the status change to “suspended” in GOCDB within 6 (or 8) weeks from broadcast date.
3 RC, VO, RP
  1. The resource centre starts the Service Decommissioning Procedure (PROC12) for every production service of the site.
    • The procedures for the services can be run in parallel
    • Service decommissioning procedures can start from step 3, using this procedure parent ticket as parent ticket for all the decommissioning procedures.
4 OC
  1. Once PROC12 step 7 (all services end the scheduled downtime) is completed for all services of the site:
    • The Resource Centre's status is changed to suspended.
    • This action must be recorded in the parent ticket.
  2. At this point the Resource Centre is no longer listed in the topBDIIs of EGI and cannot be reached by simply submitting a job. It might still be possible to directly access the Resource Centre for members of VOs which the Resource Centre supported. If hardware is closed down, the Resource Centre will need to address this, possibly informing these users that their data could be at risk.
5 RC
  1. Logs are to be kept at the Resource Centre and remain available for the period of time required by the Grid Security Traceability and Logging Policy.
6 OC
  1. At the end of the 90-day period, the Resource Infrastructure Operations Manager should email the EGI operations team (operations 'at' egi.eu) and EGI CSIRT (contact) to inform them that the log retention period has ended and that the site is going to be closed. The roles of Resource Centre Administrator and of other people relevant to this Resource Centre are revoked in GOCDB and, if appropriate, at the relevant CA. The Resource Infrastructure Operations Manager is to clean the VOMRS dteam server accordingly. If no user relevant to this Resource Centre is left, the Resource Infrastructure Operations Manager has to inform his/her CA so that the entity is closed officially and no “ghost entities” are kept.
  2. Site is closed in GOCDB, at the end of the logs retention period.
    • This action must be recorded in the parent ticket
  • NOTE: People will have to separately handle any subscriptions to mailing lists which have been initiated by Resource Centre Administrator and which were not triggered by contact definitions in the GOCDB.
7 OC
  1. Parent ticket is closed.
    • This operation can be performed only if all the service decommissioning procedures are completed.
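
The timing constraints stated in step 2 (downtime starting between 15 days and one month after the broadcast, status change to suspended within 6 or 8 weeks) can be turned into concrete dates. The following is a minimal sketch under stated assumptions: the function name `decommissioning_timeline` and the dictionary keys are hypothetical, and "one month" is approximated as 30 days.

```python
from datetime import date, timedelta


def decommissioning_timeline(broadcast: date, suspend_weeks: int = 6) -> dict:
    """Derive the key dates of step 2 from the broadcast date.

    suspend_weeks is 6 or 8, matching the "6 (or 8) weeks" in the procedure.
    """
    if suspend_weeks not in (6, 8):
        raise ValueError("suspend_weeks must be 6 or 8")
    return {
        # Downtime should start no earlier than 15 days after the broadcast.
        "downtime_earliest": broadcast + timedelta(days=15),
        # Assumption: "one month" is approximated as 30 days.
        "downtime_latest": broadcast + timedelta(days=30),
        # Target date for the status change to "suspended" in GOCDB.
        "suspended_target": broadcast + timedelta(weeks=suspend_weeks),
    }


if __name__ == "__main__":
    # Hypothetical broadcast date, used only to illustrate the calculation.
    for name, day in decommissioning_timeline(date(2016, 6, 8)).items():
        print(f"{name}: {day.isoformat()}")
```

The resulting dates are the kind of information that the procedure asks to be recorded in the Parent ticket alongside the broadcast link.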

Revision history

Version | Authors | Date | Comments
- | M. Krakowian | 19 August 2014 | Change contact group -> Operations support
- | Alessandro Paolini | 2016-06-08 | Changed contact group -> Operations