
{{Template:Op menubar}} {{Template:Doc_menubar}} {{TOC_right}}


{{DeprecatedAndMovedTo|new_location=https://confluence.egi.eu/display/EGIPP/PROC09+Resource+Centre+Registration+and+Certification}}

{| border="1"
|-
| '''Title'''
| ''Resource Centre Registration and Certification Procedure''
|-
| '''Document link'''
| https://wiki.egi.eu/wiki/PROC09
|-
| '''Version - last modified'''
| 1.0 - 17 May 2011<br>
|-
| '''Policy Group Acronym'''
| ''OMB''
|-
| '''Policy Group Name'''
| ''Operations Management Board''
|-
| '''Contact Person'''
| operational-documentation@mailman.egi.eu
|-
| '''Document Status'''
| ''APPROVED''
|-
| '''Approved Date'''
| 17 May 2011<br>
|-
| '''Procedure Statement'''
| ''A procedure for the steps involved to both register and certify new Resource Centres (sites) in the EGI infrastructure. The certification step can also be used to re-certify suspended Resource Centres (sites).''
|}


= Resource Centre Registration and Certification Procedure  =
[[Category:Operations_Procedures]]
 
Certification is a prerequisite for a [[#Definitions|Resource Centre]] (aka site) to become part of a Resource Infrastructure such as a National Grid Initiative (NGI), an EIRO, or a multi-country Resource Infrastructure.
 
This document describes the steps required:
 
#to register and certify a new Resource Centre,
#to re-certify a Resource Centre which has been suspended.
 
Note: A separate document provides the [[PROC11|process for decommissioning a Resource Centre]].
 
Through its parent Resource Infrastructure, a certified Resource Centre becomes a member of the EGI Resource Infrastructure to make resources available to international user communities.
 
The main difference between a certified Resource Centre and an uncertified or test Resource Centre is that a certified Resource Centre guarantees a minimum quality of service for its resources (currently expressed in terms of monthly availability and reliability). A certified Resource Centre must ensure that problems are handled in a timely fashion, and must understand and adhere to a common set of policies and procedures. All the requirements can be found in the [https://documents.egi.eu/document/31 Resource Centre OLA].
 
= Definitions  =
 
*'''Resource Centre''' refers to the definition in the "[https://documents.egi.eu/document/31 Resource Centre OLA]".
 
:''In this document, the term "'''site'''" is '''deprecated''', and '''Resource Centre''' has been used in its place.''
 
*Other entities involved in this procedure are defined in the [[Glossary|EGI Glossary]].
 
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
 
= Entities involved in the procedure  =
 
<!-- There are minimally two sets of players involved in this procedure -->
 
*'''Resource Centre Operations Manager''': person who is responsible for initiating the certification process by applying for membership to a Resource Infrastructure.
*'''Resource Infrastructure Operations Manager''': person who is responsible for approving the integration of a new Resource Centre into the respective Infrastructure.
*'''Operations Centre''': entity which is technically responsible for carrying out the Resource Centre certification part of the procedure, once the membership is approved.
 
The Resource Infrastructure Operations Manager can determine with the Resource Centre Operations Manager the level of involvement of other actors.
 
= Contact information  =
 
*EGI Operations: operations (at) mailman.egi.eu
*EGI Resource infrastructure Providers are listed on the EGI [https://www.egi.eu/infrastructure/Resource-providers/index.html web site]
*A list of EGI Operations Centres with their respective contact information is available from the [http://go.egi.eu/operations-centres GOCDB]
*EGI CSIRT: egi-csirt-team (at) mailman.egi.eu
 
= Actions and responsibilities  =
 
== Resource Centre Operations Manager  ==
 
#A Resource infrastructure Provider is responsible for all Resource Centres within its respective jurisdiction (for example, an NGI is responsible for all Resource Centres in its country). For this reason, the Resource Centre Operations Manager of a new Resource Centre is REQUIRED to announce the intention of the Resource Centre to join the EGI infrastructure by contacting
#*the respective NGI, if the Resource Centre is located in Europe,
#*the Resource infrastructure Provider active in the relevant geographical area, if the Resource Centre is outside Europe.
#:If needed, EGI Operations can assist the Resource Centre Operations Manager in getting in contact with the relevant partners (see the Contact information section).<br>
#The Resource Centre Operations Manager is REQUIRED to provide the necessary Resource Centre information needed to complete the registration process, and he/she is responsible for its accuracy and maintenance.<br>
#In order to be certified, the Resource Centre Operations Manager is responsible for reading, understanding and accepting the [https://documents.egi.eu/document/31 Resource Centre Operational Level Agreement], which defines the obligations of a Resource Centre and the commitment to deliver a minimum quality of service to its future users. Endorsement of the OLA implies - among other things - the acceptance of:
#*the [https://documents.egi.eu/document/86 Grid Security Policy]
#*the [https://documents.egi.eu/document/75 Grid Resource Centre Operations Policy]
#*the [https://documents.egi.eu/document/76 Resource Centre Registration Security Policy]
#*all other policies for all EGI participants from the [https://wiki.egi.eu/wiki/SPG:Documents Security Policy Group]
 
== Resource Infrastructure Operations Manager  ==
 
#A Resource infrastructure Provider is REQUIRED to be responsible for all Resource Centres within its respective jurisdiction. For example, an NGI is responsible for all Resource Centres in its respective country.
#The Resource Infrastructure Operations Manager MUST attend to Resource Centre certification applications and MUST provide timely feedback to the requesting partners, accepting or rejecting the requests received.
#If the Resource Centre needs to be certified, s/he MUST provide information to the Resource Centre Operations Manager about the Resource Centre OLA, and is responsible for keeping records of the Resource Centre Operations Manager agreement, as deemed suitable by the Resource infrastructure Provider (for example, through a signed e-mail agreement, a collection of signatories on a paper copy of the OLA, or other means).
#For the case where a request is accepted, the Resource Infrastructure Operations Manager MUST contact the relevant Operations Centre to start the Resource Centre registration as a candidate for the certification procedure. Registration is only needed for the case of new Resource Centres.
 
== Operations Centre  ==
 
#The Operations Centre is responsible for registering (if applicable) and for certifying the Resource Centre.
#The Operations Centre is responsible for registering an accepted Resource Centre in the EGI configuration repository [[GOCDB|GOCDB]].
#The Operations Centre MUST collect the mandatory information specified by the Resource Centre registration procedure, and MUST accurately input the data supplied into the GOCDB.
#The Operations Centre MUST integrate Resource Centre information in all operations tools as needed, such as the local NAGIOS server for monitoring of certified Resource Centres, the local helpdesk (if available) for the registration of the Resource Centre support staff, etc.
#In the case of an existing Resource Centre that is resuming certification after suspension for security reasons, the Operations Centre MUST contact the EGI CSIRT to verify that all requested repair operations have been successfully applied to fix the issue.
#*For other suspension cases, the Operations Centre MUST ensure that the issue that caused the suspension has been resolved.
#The Operations Centre is responsible for verifying that all tests are passed successfully during the 3-calendar-day certification process. The Operations Centre SHOULD only proceed with changing the Resource Centre status in the GOCDB to ''certified'' if this condition is met.
 
= Workflow  =
 
The various steps required by both the Resource Infrastructure Operations Manager and the Resource Centre Operations Manager are explained in the tables below. The first part for a '''new''' Resource Centre is the registration process. The actual certification process, in the second table, is applicable to both new and suspended Resource Centres.
 
The general status flow that a Resource Centre is allowed to follow is illustrated by the following diagram. Information on Resource Centre status and on how to manipulate it is available from [https://wiki.egi.eu/wiki/GOCDB/Input_System_User_Documentation#Changing_Site_Certification_Status GOCDB Documentation].
 
[[Image:SiteStatusFlow.png|300px|SiteStatusFlow.png]]
 
<br>
 
A Resource Centre '''cannot '''be in
 
*'''Candidate '''state for '''more than two months'''
*'''Suspended''' state for '''more than four months'''
 
After this period the Resource Centre SHOULD be closed.
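
The time limits above can be checked mechanically. The following is a minimal sketch, not part of the procedure or of GOCDB: the function names are hypothetical, and it assumes that "months" means calendar months counted from the date the Resource Centre entered the state.

```python
import calendar
from datetime import date

# Limits taken from the text above: two months as Candidate,
# four months as Suspended. Other states have no limit here.
MAX_MONTHS_IN_STATE = {"Candidate": 2, "Suspended": 4}


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 31 October plus four months gives the end of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def should_be_closed(status: str, entered_on: date, today: date) -> bool:
    """True if the Resource Centre has exceeded the limit for its state."""
    limit = MAX_MONTHS_IN_STATE.get(status)
    if limit is None:
        return False  # no time limit defined for this state
    return today > add_months(entered_on, limit)
```

For example, under these assumptions a Resource Centre that became a Candidate on 15 January would be flagged from 16 March onwards.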
 
== Resource Centre registration  ==
 
=== Requirements  ===
 
#A Resource Centre MUST be part of a Resource Infrastructure and receive operational services from an Operations Centre. If a provider is not yet available for your country, then an alternative existing Operations Centre can be contacted. A procedure exists for this, and it is documented in the [[PROC02|Operations Centre creation]] procedure. <!-- text extracted from the Resource Centre registration procedure, which will likely disappear in the future-->
#To satisfy Grid security requirements during the registration procedure the following information must be collected. The comprehensive list of required information is available ([[Operations/HOWTO01|here]]).
#*The full name of the Resource Centre.
#*An abbreviated name for the Resource Centre, which must be unique within the Grid, and preferably globally unique.
#*The name, email address and telephone number of the Resource Centre Operations Manager and Resource Centre Security Contact in accordance with the requirements of the [https://documents.egi.eu/document/75 Resource Centre Operations Policy].
#*The email address of a managed list for contact with Resource Centre Administrators at the Resource Centre. <!-- Resource Administrators replaced by Site Administrators-->
#*The email address of a managed list for contact with the Resource Centre security incident response team. <!--# A signed copy of the Site (Resource Centre) Operations Policy (https://documents.egi.eu/document/75).-->
 
Notes:
 
#If a Resource Centre wishes to leave the Grid or the Grid decides to remove the Resource Centre, the registration information MUST be kept by [[GOCDB|GOCDB]] for at least the same period defined for logging in the [https://documents.egi.eu/document/81 Traceability and Logging Policy]. Personal registration information of the Resource Centre Operations Manager and Security Contact of the Resource Centre leaving the Grid MUST NOT be retained for longer than one year. <!--"Review and acceptance procedures and any operational requirements should be documented in a Grid specific
document describing the implementation of the Resource Centre Registration Procedure." Comment: a maintenance procedure is currently missing. To check: what are the operational requirements? -->
#It is RECOMMENDED that email contacts for the Resource Centre Administrators and Security Officer(s) are mailing lists, and not individuals. The contact information SHOULD be available at the moment of the Resource Centre registration in GOCDB.
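
As an illustration of the information listed above, the sketch below models a registration request and reports which mandatory items are still empty. It is only an example under stated assumptions: the class and field names are hypothetical and do not correspond to the GOCDB schema or to the [[Operations/HOWTO01|required information]] template.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class Contact:
    """Name, email address and telephone number of a contact person."""
    name: str
    email: str
    telephone: str


@dataclass
class RegistrationRequest:
    """Hypothetical container for the items required at registration."""
    full_name: str
    short_name: str               # must be unique within the Grid
    operations_manager: Contact
    security_contact: Contact
    admin_list_email: str         # managed list for the administrators
    csirt_list_email: str         # managed list for incident response


def missing_fields(request: RegistrationRequest) -> List[str]:
    """Return the names of all empty or whitespace-only items."""
    missing = []
    for field in fields(request):
        value = getattr(request, field.name)
        if isinstance(value, Contact):
            for sub in fields(value):
                if not getattr(value, sub.name).strip():
                    missing.append(f"{field.name}.{sub.name}")
        elif not value.strip():
            missing.append(field.name)
    return missing
```

An Operations Centre could run such a check before entering the data, and return the list of missing items to the Resource Centre Operations Manager.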
 
<br>
 
=== Steps  ===
 
The following steps are only applicable if '''the Resource Centre is not already registered in GOCDB'''. They describe the steps for a Resource Centre Operations Manager that is requesting the respective Resource Centre to join the EGI infrastructure.
 
*Actions tagged '''RC''' are the responsibility of the Resource Centre Operations Manager.
*Actions tagged '''RP''' are the responsibility of the Resource Infrastructure Operations Manager.
*Actions tagged '''OC''' are the responsibility of the Operations Centre.
 
{| cellspacing="0" cellpadding="5" border="1"
|-
! #
! Responsible
! Action
|- valign="top"
| 0
| RC
|
#Contact your Resource Infrastructure Operations Manager (contact information is available at [http://www.egi.eu/community/resource-providers/ http://www.egi.eu/community/resource-providers/]).
#Provide your Resource Infrastructure Operations Manager the required information according to the template available in the [[Operations/HOWTO01|Required information]] page.
 
|- valign="top"
| 1
| RP
|
#Review the Resource Centre registration request, decide to accept or reject it, and communicate the result back to the applicant.
#If the Resource Centre is accepted, notify the relevant Operations Centre, handle the Resource Centre information received, and put the Operations Centre in contact with the Resource Centre Operations Manager.
 
|- valign="top"
| 2
| OC
|
#The following actions can be done in parallel:
#*Forward all [[Operations/HOWTO02|necessary and required documentation]] to install and configure the Resource Centre services to the Resource Centre Operations Manager.
#*Communicate with the Operations Manager to clarify any doubts or questions. Include the Operations Centre ROD, CSIRT, or help-desk teams in this step if necessary.
 
|- valign="top"
| 3
| OC
|
#Add the Resource Centre to the [https://goc.egi.eu/ GOCDB] and flag it as "Candidate". Note that all users with a GOCDB role at regional level can add a Resource Centre in scope (this includes Operations Manager, deputy and regional staff). Currently, GOCDB applies the same permissions to all of the "regional level roles".
#Notify the Resource Centre Operations Manager that they should request a [http://www.eugridpma.org/ grid certificate], register in the [https://voms.hellasgrid.gr:8443/vo/dteam/vomrs Dteam VO], register themselves in the [https://goc.egi.eu/ GOCDB] and request the Resource Centre Administrator role. Approve the role request when done.
 
|- valign="top"
| 4
| RC
|
#Complete any missing information for the Resource Centre's entry in the GOCDB, including services that are to be integrated into the infrastructure.
#Request in the GOCDB (or ask the relevant Resource Centre security staff to request) the mandatory Resource Centre Security Officer role. A security expert is the most appropriate actor for this role. See the [https://wiki.egi.eu/wiki/GOCDB/Input_System_User_Documentation#Understanding_and_manipulating_roles GOCDB Input System User Documentation] for more information on roles.
#Accept or deny all the requested roles under the Resource Centre scope. Note: If the Resource Centre Operations Manager cannot approve roles, they should ask the Operations Centre to do so. This is a current flaw in GOCDB.
#Notify the Operations Centre that the Resource Centre information update is concluded.
 
|- valign="top"
| 5
| RC or OC
|
#Check whether the Resource Centre appears in the "Notified Site" field in [https://ggus.eu/ws/ticket_search.php https://ggus.eu/ws/ticket_search.php]
#Note that this step should happen automatically when the Resource Centre is correctly entered into the GOCDB. If this is still not visible 2 days after the GOCDB entries have been created, the Operations Centre should be informed and should then contact GGUS administrators through [https://ggus.eu/pages/ticket.php GGUS].
#A new Resource Centre Administrator should register in GGUS ([https://ggus.eu/admin/get_account.php?accounttype=support https://ggus.eu/admin/get_account.php?accounttype=support]) but not specify any role, unless directed to by the Operations Centre.
 
|- valign="top"
| 6
| OC
|
#Check that the Resource Centre's information is correct (Resource Centre (site) roles and any other additional information).
#Check that contacts receive email (if they are mailing lists, check that outside EGI members are allowed to post there). The site administrator MUST reply to the test email.<br>
#Check that the required services for a Resource Centre are properly registered. Note that for Resource Centres adopting APEL, registering a new glite-APEL node in GOCDB as a gLite-APEL service with the correct DN automatically updates the APEL broker Access Control List, and the Resource Centre can start publishing usage records in about two hours (for more information see the [https://twiki.cern.ch/twiki/bin/view/EMI/Glite-APELInstallation gLite-APEL documentation]).
#Check domain names and forward and reverse DNS.
 
|- valign="top"
| 7
| OC
|
#Any other Operations Centre-specific requirements (e.g. join a certain VO and/or mailing list, etc.)
 
|- valign="top"
| 8
| OC
|
#If all previous actions have been completed with success, notify the Resource Centre Operations Manager that the Registration is completed, and contact the Resource Infrastructure Operations Manager to notify that a new candidate Resource Centre exists and is ready to be certified.
 
|}
 
After the successful completion of all these steps, the registration phase is completed and the Resource Centre is ready for the start of the <span class="il">certification</span> phase.
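
The forward and reverse DNS check in step 6 of the table above can be scripted. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and it makes the simplifying assumptions that only IPv4 addresses are checked and that the reverse lookup must return exactly the registered host name.

```python
import socket
from typing import Callable, List


def forward_lookup(hostname: str) -> List[str]:
    """Resolve a host name to its IPv4 addresses with the system resolver."""
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def reverse_lookup(address: str) -> str:
    """Resolve an IP address back to its primary host name."""
    name, _, _ = socket.gethostbyaddr(address)
    return name


def dns_is_consistent(
    hostname: str,
    forward: Callable[[str], List[str]] = forward_lookup,
    reverse: Callable[[str], str] = reverse_lookup,
) -> bool:
    """True if every forward address resolves back to `hostname`."""
    wanted = hostname.rstrip(".").lower()
    try:
        addresses = forward(hostname)
    except OSError:  # socket.gaierror and socket.herror derive from OSError
        return False
    if not addresses:
        return False
    for address in addresses:
        try:
            name = reverse(address)
        except OSError:
            return False
        if name.rstrip(".").lower() != wanted:
            return False
    return True
```

The resolver functions are passed as parameters so that the check can be exercised without network access; calling <code>dns_is_consistent("ce01.example.org")</code> with the defaults queries the system resolver (the host name here is a placeholder).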
 
== Resource Centre certification  ==
 
=== Requirements  ===
 
#The Resource Centre Certification procedure is only applicable to '''Resource Centres in "Candidate" or "Suspended"''' status.<br>
#The following procedure is only applicable if '''the Resource Centre is already registered in GOCDB'''.
#In order to enter certification the Resource Centre Operations Managers SHALL accept the [https://documents.egi.eu/document/31 Resource Centre OLA].
#A Resource Centre can successfully pass certification only if the conditions required by the [https://documents.egi.eu/document/31 Resource Centre OLA] are met.
 
=== Steps  ===
 
The following is a detailed description of the steps required for the transition from the "Uncertified" to the "Certified" state of the Resource Centre.
 
*Actions tagged '''RC''' are the responsibility of the Resource Centre Operations Manager.
*Actions tagged '''RP''' are the responsibility of the Resource Infrastructure Operations Manager.
*Actions tagged '''OC''' are the responsibility of the Operations Centre.
 
{| cellspacing="0" cellpadding="5" border="1"
|-
! #
! Responsible
! Action
|- valign="top"
| 0
| RP
|
#The Resource Infrastructure Operations Manager contacts the Resource Centre Operations Manager to request acceptance of the [https://documents.egi.eu/public/ShowDocument?docid=31 Resource Centre OLA].
 
|- valign="top"
| 1
| RC
|
#The Resource Centre Operations Manager notifies the Resource Infrastructure Operations Manager that the Resource Centre OLA is accepted (if the Resource Centre has not already endorsed it, as may be the case for a suspended Resource Centre), and that the Resource Centre is ready to start certification.
 
|- valign="top"
| 2
| RP
|
#The Resource Infrastructure Operations Manager contacts the Operations Centre asking to start the certification process.
 
|- valign="top"
| 3
| OC
|
#If the Resource Centre is in the "Candidate" or "Suspended" state, then flag the Resource Centre as "Uncertified". If it was in the "Suspended" state, then check that the reason for suspension has been cleared. If the suspension cause is a security issue, then the EGI CSIRT needs to be contacted to verify that all requested repair operations were successfully applied by the Resource Centre Administrators to fix the issue that caused the suspension. See [[SAM#Monitoring_uncertified_sites|instructions]] on how to monitor uncertified RCs.
 
|- valign="top"
| 4
| OC
|
#Add Resource Centre contact information to any regional mailing list and provide access to regional tools as required
 
|- valign="top"
| 5
| OC
|
#Check that the GIIS (gLite: BDII) is working and publishing coherent values, namely that:
#*the correct NGI is being published in GlueSiteOtherInfo (see manual MAN01 [[MAN1 How to publish Site Information|How to Publish Site Information]]),
#*all services are registered in GOCDB according to the requirements of the [https://documents.egi.eu/document/31 Resource Centre OLA], these services are published, and the services published in the GOCDB are valid,
#*the [[OPS vo|OPS VO]] (monitoring) and the [[Dteam vo|DTEAM VO]] (troubleshooting) are configured and supported by the Resource Centre,
#*regional VOs are configured and supported as needed by the Operations Centre,
#*the Resource Centre is integrated in any regional tool as needed (for example, the regional accounting infrastructure if present).
 
There are detailed examples for how to do this in [[Operations/HOWTO03|GIIS/BDII check]].
 
|- valign="top"
| 6
| OC
|
#Check that the registered services are fully functional by performing manual tests, e.g. from the UI or the Operations Centre monitoring infrastructure for uncertified Resource Centres. Note that monitoring of uncertified Resource Centres through the NGI Nagios production service is possible ([[SAM#Monitoring_uncertified_sites|instructions]]). Contact the Resource Centre admins if there are problems, and ensure that they fix them. Include the ROD, CSIRT and help-desk teams if necessary. Iterate this step with the Resource Centre admins until tests pass successfully. The prime tests to check are:
#*network connectivity.
#*CE job submission.
#*SE data transfer
 
Details for submitting manual tests can be found at [[Operations/HOWTO04|Grid manual tests]].
 
|- valign="top"
| 7
| OC
|
#If all preliminary tests are passed for 3 consecutive calendar days, declare an initial maintenance downtime and switch the Resource Centre status to Certified. This ensures that the Resource Centre will appear in NAGIOS and GSTAT.
 
|- valign="top"
| 8
| OC
|
#The downtime should not be closed until the Resource Centre appears in all operational tools '''and''' accounting data is properly published. The major tools that are relevant are:
#*Regional NAGIOS (NAGIOS)
#**And all Nagios tests are passed
#*Operations [https://operations-portal.egi.eu/dashboard Dashboard] (Dashboard-Siteview)
#*[http://gridview.cern.ch/GRIDVIEW/same_index.php GridView]
#*[http://gstat.egi.eu/ GSTAT]
#**GSTAT is not in an error state. Note: There may be some problems with this tool and ARC Resource Centres.
#*[https://grid-monitoring.cern.ch/myegi/ MyEGI]
 
If there are problems with a specific tool, open GGUS tickets to the relevant Support Units. Wait at least two days after the switch to the ''Certified'' status before opening a ticket, because the propagation of the new status to the operational tools or the publication of accounting data may take one or two days.<br>
 
|- valign="top"
| 9
| OC
|
#Notify the Resource Centre Operations Manager that the Resource Centre is certified<br>
 
|- valign="top"
| 10
| OC
|
#The NGI can broadcast that a new Resource Centre is now part of the EGI infrastructure. This step is OPTIONAL.
 
|}
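
The condition in step 7 of the table above (all preliminary tests passed for 3 consecutive calendar days) can be expressed as a small check. This is only a sketch under stated assumptions: the function name and the input format are hypothetical, the daily results are assumed to be collected by the Operations Centre by whatever means it uses, and a day without a recorded result is treated as a failed day.

```python
from datetime import date, timedelta
from typing import Dict

# Number of consecutive calendar days required by step 7.
REQUIRED_DAYS = 3


def ready_for_certification(daily_results: Dict[date, bool], today: date) -> bool:
    """True if all tests passed on each of the last REQUIRED_DAYS days.

    `daily_results` maps a calendar day to True when every preliminary
    test passed on that day. The window ends on `today` (inclusive).
    """
    for offset in range(REQUIRED_DAYS):
        day = today - timedelta(days=offset)
        if not daily_results.get(day, False):
            return False
    return True
```

A single failed or missing day inside the window restarts the count, which matches the "consecutive" wording of the step.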
 
After the successful completion of these steps, the Resource Centre is considered as "Certified". <!--
= Revision history  =
 
{| cellspacing="0" cellpadding="5" border="1" align="center"
|-
! Version
! Authors
! Date
! Comments
|-
| 1.11
| Peter Solagna
| 2011-05-17
| According to OMB comments: Modified the maximum duration of different site statuses. Removed the two days suggested period of downtime. The definition of Resource Centre will point to the Site OLA.&nbsp;
|-
| 1.1
| Peter Soalgna
| 2011-05-1
| Updated cert step 6: downtime period lasts at least two days. Moved Cert step #10 to #4. Changed ''"If there is no suitable provider for your country, it maybe that the an Operations Centre MUST first be created."'' with ''"If a''
provider is not yet available for your country, then an alternative existing Operations Centre can be contacted."''. Now site responsiveness through its mail contacts is requested from the being of the certification process.''
 
|-
| 0.8
| Tiziana Ferrari
| 2011-03-11
| Updated introduction, adopted MUST SHALL etc. terminology, proposed some changes to terminology, added a section with a list of responsibilities, added a few comments into the text to request clarifications.
|-
| 0.7
| Vera Hansper
| 2011-02-02
| Updated introduction to include roles, etc. and added required documentation link for policies
|}
--> <br>
 
= Revision History  =
 
*7/09/2012: (editorial, M.&nbsp;Krakowian) typos and adding links where necessary<br>
*25/10/2011: (editorial, T. Ferrari) Replacement of RIP with "RP" standing for Resource infrastructure Provider
 
{{Template:Creative_commons}}
 
[[Category:Procedures]]

Latest revision as of 15:45, 24 August 2021