The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations @ egi.eu.

Difference between revisions of "PROC11 Resource Centre Decommissioning"

{{Template:Op menubar}}  
{{Template:Doc_menubar}}  
{{TOC_right}}
[[Category:Deprecated]]
 
{| style="border:1px solid black; background-color:lightgrey; color: black; padding:5px; font-size:140%; width: 90%; margin: auto;"
| style="padding-right: 15px; padding-left: 15px;" | 
|[[File:Alert.png]] This page is '''Deprecated'''; the content has been moved to https://confluence.egi.eu/display/EGIPP/PROC11+Resource+Centre+Decommissioning 
|}
{| border="1"
|-
| '''Title'''
| ''Resource Centre Decommissioning Procedure''
|-
| '''Document link'''
| 
|-
| '''Version - last modified'''
| v 1.0
|-
| '''Policy Group Acronym''' 
| ''OMB''
|-
| '''Policy Group Name'''
| ''Operations Management Board''
|-
| '''Contact Person'''
| operational-documentation@mailman.egi.eu
|-
| '''Document Status'''
| ''APPROVED''
|-
| '''Approved Date'''
| 28/02/2012<br>
|-
| '''Procedure Statement'''
| ''A procedure for the steps involved to decommission Resource Centres (sites) in the EGI infrastructure. ''
|}
= Resource Centre Decommissioning Procedure  =
This procedure describes good practice for the interaction between a Resource Centre (aka site) and its users when the Resource Centre is being decommissioned.
It should be noted that the whole process of decommissioning a Resource Centre in an ordered manner will take up to four months. ''Note: the site hardware decommissioning can start after one month'' <!--(<span style="background:#FFFF00">Peter: One month is the downtime period, after that the site is suspended and the SEs are no more published in the Top-BDII, lcg-ce/cp commands will not work on those SEs. In the last three months sysadmins are requested to keep the logs, not the full infrastructure of the site. And they need to be reachable (their contacts in gocdb) to ask them to provide logs if needed  7-12-2011</span>)-->
Note: A separate document provides the process for [[PROC09|Resource Centre Registration and Certification]].
= Definitions  =
*Please refer to the [[Glossary|EGI Glossary]] for the definitions of the terms used in this procedure.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
= Entities involved in the procedure  =
<!-- There are minimally two sets of players involved in this procedure -->
*'''Resource Centre Operations Manager''': person who is responsible for initiating the decommissioning procedure by contacting the Resource Infrastructure Operations Manager.
*'''Resource Infrastructure Operations Manager''' (aka NGI manager): person who is responsible for reaching an agreement with the Resource Centre on the timeline, in order to minimize the impact on the user communities and infrastructure. The Resource Infrastructure Operations Manager is also responsible for ensuring that this procedure and related procedures are properly followed.
*'''Virtual Organizations (VO's)''': Data and other stateful objects of the supported VO's may be stored at the Resource Centre.
*'''Virtual Organization (VO) managers''': persons who are responsible for retrieving this data from the Resource Centre in due time. Tracking is done through their support unit in GGUS. If no such support unit is available, the VOs should be contacted directly using the contact information available in the VO ID card.
*'''Operations Centre''': entity which is technically responsible for carrying out the main ticket and database updates.
The Resource Infrastructure Operations Manager can determine the level of involvement of other actors together with the Resource Centre Operations Manager.
= Contact information  =
*EGI Operations: operations (at) mailman.egi.eu
*EGI Resource Infrastructure Providers are listed on the EGI [https://www.egi.eu/infrastructure/Resource-providers/index.html web site]
*A list of EGI Operations Centres with their respective contact information is available from the [http://go.egi.eu/operations-centres GOCDB]
*EGI CSIRT: egi-csirt-team (at) mailman.egi.eu
*The list of VO's served by a specific Resource Centre and their ID cards can be retrieved from the [http://operations-portal.egi.eu/vo/rd Operations Portal].
*The VO managers and their contact information for a specific VO can be retrieved from the [http://operations-portal.egi.eu/vo Operations Portal].
= Actions and responsibilities  =
== Resource Centre Operations Manager  ==
#A Resource Infrastructure Provider is responsible for all Resource Centres (RCs) within its respective jurisdiction (for example, an NGI is responsible for all Resource Centres in its country). For this reason, the Resource Centre Operations Manager is REQUIRED to announce the intention to decommission the Resource Centre by contacting
#*the respective NGI, if the Resource Centre is located in Europe, or
#*the Resource Infrastructure Provider active in the relevant geographical area, if the Resource Centre is outside Europe.<br>
#The Resource Centre Operations Manager is REQUIRED to provide the Resource Centre information needed to complete the decommissioning process, and is responsible for its accuracy and maintenance.<br>
<!--(<span style="background:#FFFF00">Res: How about an RC not being responsive any more?</span>) &lt;<span style="background:#FFFF00">Peter: If a site is no more responsive the RP Operations Center staff should provide -when available- the needed information. 7-12-2011</span>) -->
== Resource Infrastructure Operations Manager  ==
#A Resource Infrastructure Provider is REQUIRED to be responsible for all Resource Centres within its respective jurisdiction. For example, an NGI is responsible for all Resource Centres in its respective country.
#The Resource Infrastructure Operations Manager MUST attend to Resource Centre decommissioning requests and MUST provide timely feedback to the requesting partners, accepting or rejecting each request received.
#The Resource Infrastructure Operations Manager MUST contact the relevant Operations Centre to start the Resource Centre decommissioning procedure.
== VO's and VO managers  ==
#Give the users the relevant information about the decommissioning (deadlines, resources involved, affected files, how to proceed).
#Follow up and support users in their file migration procedures until the deadline.
#Inform the Resource Centre about the status of the migration(s).
<!--(<span style="background:#FFFF00">Tristan: if the VO is not responsive any more then the decommissioning could happen after x reminders. Res: We'll need to specify this in more detail.)
</span> -->
== Operations Centre  ==
#The Operations Centre is responsible for decommissioning the Resource Centre.
#The Operations Centre is responsible for updating the corresponding entries in the EGI configuration repository [[GOCDB|GOCDB]].
#The Operations Centre MUST keep the Resource Centre information up to date in all operations tools as needed, such as the local NAGIOS server for monitoring of certified Resource Centres, the local helpdesk (if available) for the registration of the Resource Centre support staff, etc.
= Workflow  =
The various steps required by both the Resource Infrastructure Operations Manager and the Resource Centre Operations Manager are explained in the tables below. The procedure below covers the transition from the ''Certified'' to the ''Closed'' status. The transition from the ''Suspended'' to the ''Closed'' status can be derived analogously. <!--(<span style="background:#FFFF00">Res: Should we provide more information to this specific case? More generally, should the procedure be split into a "suspension" phase and a "Closing" phase?</span>) -->
The general status flow that a Resource Centre is allowed to follow is illustrated by the following diagram. Information on Resource Centre status and on how to manipulate it is available from [https://wiki.egi.eu/wiki/GOCDB/Input_System_User_Documentation#Changing_Site_Certification_Status GOCDB Documentation].
[[Image:SiteStatusFlow.png|300px]]
<br>
A Resource Centre cannot remain in the '''Candidate''' state for more than two months, or in the '''Suspended''' state for more than four months. After this period the Resource Centre SHOULD be closed.
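The duration limits above can be sketched as a small check. This is only an illustration, not part of GOCDB or any EGI tool: the names <code>MAX_MONTHS_IN_STATUS</code>, <code>add_months</code> and <code>should_be_closed</code> are hypothetical, and the interpretation of "month" as a calendar month (with the day clamped to the end of shorter months) is an assumption.

```python
import calendar
from datetime import date

# Limits taken from the procedure text, in calendar months (assumed reading
# of "month"). Statuses that are not listed have no limit in this sketch.
MAX_MONTHS_IN_STATUS = {"Candidate": 2, "Suspended": 4}


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 31 December plus two months gives 28 (or 29) February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def should_be_closed(status: str, entered_on: date, today: date) -> bool:
    """Return True if the Resource Centre has exceeded the limit for `status`."""
    limit = MAX_MONTHS_IN_STATUS.get(status)
    if limit is None:
        return False  # no limit defined for this status in this sketch
    return today > add_months(entered_on, limit)
```

For example, a site that entered the ''Suspended'' state on 15 January 2022 would be flagged from 16 May 2022 onwards under these assumptions.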
== Resource Centre decommissioning  ==
=== Steps  ===
*Actions tagged '''RC''' are the responsibility of the Resource Centre Operations Manager.
*Actions tagged '''RP''' are the responsibility of the Resource Infrastructure Operations Manager.
*Actions tagged '''OC''' are the responsibility of the Operations Centre.
{| cellspacing="0" cellpadding="5" border="1"
|-
! #
! Responsible
! Action
|- valign="top"
| 1
| RC
|
#The Resource Centre Operations Manager informs the Resource Infrastructure Operations Manager that the Resource Centre is going to be decommissioned, and together they agree on a decommissioning plan.
#* The Resource Centre Operations Manager opens a GGUS ticket, which will be used as ''master ticket'' to track the whole process. The ticket must remain in an open status until the site is closed in GOCDB. This ''master ticket'' can be used as master ticket for the resource centre's services decommission procedures (see PROC13, step 1).
|- valign="top"
| 2
| RC
|
#The Resource Centre Operations Manager announces through the broadcast tool to VO managers and users of all the VOs supported by the Resource Centre and to EGI Operations (COD) that it is starting the decommissioning procedure:
#*Announce a detailed (agreed) timeline for the decommissioning and state that the Resource Centre will schedule downtimes for its resources, or a site-wide downtime, to prevent any further usage. The timeline must '''clearly''' list the deadlines for the VO managers' actions.
#*The ticket should also list all of the Resource Centre's services that are being decommissioned, with the scheduled decommissioning date of each (this supersedes PROC13 step 2).
#*The timeline is recorded in the ''master ticket'' (including the timelines of all the services).
#*The broadcast link is recorded in the ''master ticket''.
#*The downtime should start no earlier than 15 days and no later than one month after the broadcast.
#*State that the aim is to make the status change to “suspended” in GOCDB within 6 (or 8) weeks from broadcast date. <br>
|- valign="top"
| 3
| RC, VO, RP
|
# The resource centre starts the Service Decommissioning Procedure ([[PROC13draft|PROC13]]) for every service of the site.
#* The procedures for the services can be run in parallel
#* Service decommissioning procedures can start from ''step 3'', using this procedure master ticket as master ticket for all the decommissioning procedures.
|- valign="top"
| 4
| OC
|
#Once PROC13 step 7 (the end of the scheduled downtime) has been completed for every service of the site:
#*The Resource Centre's status is changed to ''suspended''. 
#*This action must be recorded in the ''master ticket''.
#At this point the Resource Centre is no longer listed in the topBDIIs of EGI and cannot be reached by simply submitting a job. It might still be possible to directly access the Resource Centre for members of VOs which the Resource Centre supported. If hardware is closed down, the Resource Centre will need to address this, possibly informing these users that their data could be at risk.
<br> <!--|- valign="top"
| 5
| OC
|
#After one week, the status of the Resource Centre is set to ''closed''. The Resource Centre can then be considered as ''experimental'' or however the parent Resource Infrastructure Provider considers appropriate. Direct notifications through the Resource Centre Administrator’ roles within EGI will no longer occur.
#*NOTE: People will have to separately handle any subscriptions to mailing lists which have been initiated by Resource Centre Administrator and which were not triggered by contact definitions in the GOCDB.
(Peter: Imho, the whole step can be removed (the site is closed three month after this step), I would attach the mailing list subscription to another step. 9-12-2011)<br> 
<br> -->
|- valign="top"
| 6
| RC
|
#Logs are to be kept at the Resource Centre, available for the period of time requested by the [https://documents.egi.eu/document/81 Grid Security Traceability and Logging Policy].
|- valign="top"
| 7
| OC
|
#The Resource Infrastructure Operations Manager communicates the end of the 90-day period to EGI Operations AND EGI CSIRT. The roles of the Resource Centre Administrator and of other people associated with this Resource Centre are revoked in GOCDB and, ''if appropriate'', at the relevant CA. The Resource Infrastructure Operations Manager cleans the VOMRS dteam server accordingly. If no user associated with this Resource Centre remains, the Resource Infrastructure Operations Manager has to inform his/her CA so that the entity is officially closed, to avoid keeping “ghost entities”.
#Site is closed in ''GOCDB''
#* This action must be recorded in the ''master ticket''
*NOTE: People will have to separately handle any subscriptions to mailing lists which have been initiated by Resource Centre Administrator and which were not triggered by contact definitions in the GOCDB.<!--(<span style="background:#FFFF00">Vera: Helene, I'm really not sure that this is the right place for these steps.  Revoking the site admin roles should have happened in step 6.  Similarly with removal of dteam roles/entries.</span>) (<span style="background:#FFFF00">I think that as long as the site is "active" somehow the roles are needed to let them access op.tools. 7-12-2011 </span>)-->
|- valign="top"
| 8
| OC
|
# ''Master ticket'' is closed.
#* This operation can be performed only once all the service decommissioning procedures are completed.
|}
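The dates announced in step 2 can be derived from the broadcast date. The following is a minimal sketch under stated assumptions: the names <code>DecommissioningTimeline</code>, <code>plan_timeline</code> and <code>downtime_start_is_valid</code> are hypothetical, "one month" is approximated as 30 days, and the choice between 6 and 8 weeks is left as a parameter because the procedure does not fix it.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DecommissioningTimeline:
    """Key dates derived from the broadcast date (illustrative only)."""
    earliest_downtime_start: date
    latest_downtime_start: date
    suspension_target: date


def plan_timeline(broadcast: date, suspension_weeks: int = 6) -> DecommissioningTimeline:
    """Compute the step 2 dates from the date of the broadcast."""
    if suspension_weeks not in (6, 8):
        raise ValueError("the procedure mentions 6 (or 8) weeks")
    return DecommissioningTimeline(
        # Downtime should start no earlier than 15 days after the broadcast.
        earliest_downtime_start=broadcast + timedelta(days=15),
        # Assumption: "one month" is approximated as 30 days.
        latest_downtime_start=broadcast + timedelta(days=30),
        # Target date for the change to "suspended" in GOCDB.
        suspension_target=broadcast + timedelta(weeks=suspension_weeks),
    )


def downtime_start_is_valid(broadcast: date, downtime_start: date) -> bool:
    """Check that a proposed downtime start falls inside the allowed window."""
    timeline = plan_timeline(broadcast)
    return (
        timeline.earliest_downtime_start
        <= downtime_start
        <= timeline.latest_downtime_start
    )
```

With a broadcast on 1 March 2022, this sketch gives a downtime window from 16 to 31 March 2022 and a suspension target of 12 April 2022 (6 weeks) or 26 April 2022 (8 weeks). The resulting dates would still need to be agreed between the Resource Centre Operations Manager and the Resource Infrastructure Operations Manager and recorded in the ''master ticket''.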
= Revision history =

Latest revision as of 09:44, 15 April 2022