Difference between revisions of "PROC09 Resource Centre Registration and Certification"
(→Steps)
Line 226:
# If all preliminary tests are passed for 3 consecutive calendar days, declare an initial maintenance downtime and switch the Resource Centre status to Certified. This ensures that Resource Centre will appear in NAGIOS and GSTAT.
|-valign=top
| 7 || OS ||
#After two days check that the Resource Centre appears in all operational tools. If there are problems with a specific tool, open GGUS tickets to the relevant Support Units. The major tools that are relevant are:
#* Regional NAGIOS (NAGIOS)
#* Operations Dashboard (Dashboard-Siteview)
Line 234:
#* SAM/Site Functional Tests <Tiziana comment: what does SAM mean in this context? MyEGI?>
|-valign=top
| 8 || OS||
# Ensure that, before the end of the maintenance downtime
#* all Nagios tests (see above) are passed AND
#* accounting data is properly published.
Revision as of 16:48, 18 March 2011
Title | Resource Centre Registration and Certification Procedure |
Document link | to be determined |
Version/Last modified | 1.0/14:08, 18 March 2011 (UTC) |
Policy Group Acronym | OMB |
Policy Group Name | Operations Management Board |
Contact Person | Vera Hansper |
Document Status | DRAFT |
Approved Date | |
Procedure Statement | A procedure describing the steps to register and certify new Resource Centres (sites) in the EGI infrastructure. The certification step can also be used to re-certify suspended Resource Centres (sites). |
Introduction
Certification is a prerequisite for a Resource Centre (aka site) to become part of a Resource Infrastructure such as a National Grid Initiative (NGI), an EIRO, or a multi-country Resource Infrastructure.
This document describes the steps required
- to register and certify a new Resource Centre,
- to re-certify a Resource Centre which has been suspended.
Note: A separate document provides the process for decommissioning a Resource Centre.
Through its parent Resource Infrastructure, a certified Resource Centre becomes a member of the EGI Resource Infrastructure to make resources available to international user communities.
A certified Resource Centre guarantees a minimum quality of service for these resources (currently expressed in terms of monthly availability and reliability): the Resource Centre must ensure that problems are handled in a timely fashion, and it must understand and adhere to a common set of policies and procedures. By contrast, an uncertified (or test) Resource Centre does not provide a guarantee on the availability or usability of its resources.
Definitions
- Resource Centre. The Resource Centre, also known as Site, is the smallest resource administration domain in EGI. It can be either localized or geographically distributed. It provides local resources and the Grid functional capabilities necessary to make those resources accessible to authorized users such as Security, Information, Storage, Data Access, Compute etc. Access is granted by exposing common interfaces to users.
- Note: In this document, the term "site" is deprecated, and Resource Centre has been used in its place.
- Other entities involved in this procedure are defined in the EGI Glossary.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Entities involved in the procedure
- Resource Centre Operations Manager, who is responsible for initiating the certification process by applying for membership of a Resource Infrastructure
- Resource Infrastructure Operations Manager, who is responsible for approving the integration of a new Resource Centre into the respective Infrastructure
- Operations Centre, the entity that is technically responsible for carrying out the Resource Centre certification part of the procedure, once the membership is approved
The Resource Infrastructure Operations Manager can determine with the Resource Centre Operations Manager the level of involvement of other actors.
Contact information
- EGI Operations: operations (at) mailman.egi.eu
- EGI Resource Infrastructure Providers are listed on the EGI web site
- Operations Centres and their respective contact information are available from GOCDB
- EGI CSIRT: egi-csirt-team (at) mailman.egi.eu
Actions and responsibilities
Resource Centre Operations Manager
- A Resource Infrastructure Provider is responsible for all Resource Centres within its jurisdiction (for example, an NGI is responsible for all Resource Centres in its country). For this reason, the Resource Centre Operations Manager of a new Resource Centre is REQUIRED to announce the intention to join the EGI infrastructure by contacting
- the respective NGI, if the Resource Centre is located in Europe, or
- the Resource Infrastructure Provider active in the relevant geographical area, if the Resource Centre is outside Europe.
If needed, EGI Operations can assist the Resource Centre Operations Manager to get in contact with the relevant partners (see the Contact information section).
- The Resource Centre Operations Manager is REQUIRED to provide the Resource Centre information needed to complete the registration process, and is responsible for its accuracy and maintenance.
- For the Resource Centre to be certified, the Resource Centre Operations Manager is responsible for reading, understanding and accepting the Resource Centre Operational Level Agreement (OLA), which defines the obligations of a Resource Centre and the commitment to deliver a minimum quality of service to its future users. Endorsement of the OLA implies, among other things, the acceptance of:
- the Grid Security Policy
- the Grid Resource Centre Operations Policy
- the Resource Centre Registration Security Policy
- all other policies for all EGI participants from the Security Policy Group
Resource Infrastructure Operations Manager
- A Resource Infrastructure Provider is REQUIRED to be responsible for all Resource Centres within its respective jurisdiction. For example, an NGI is responsible for all Resource Centres in its respective country.
- The Resource Infrastructure Operations Manager MUST attend to Resource Centre certification applications and MUST provide feedback to the requesting partners in a timely manner, accepting or rejecting the requests received.
- If the Resource Centre needs to be certified, s/he MUST provide information to the Resource Centre Operations Manager about the Resource Centre OLA, and is responsible for keeping records of the Resource Centre Operations Manager's agreement, as deemed suitable by the Resource Infrastructure Provider (for example, through a signed e-mail agreement, the collection of signatures on a paper copy of the OLA, or other means).
- If a request is accepted, the Resource Infrastructure Operations Manager MUST contact the relevant Operations Centre to start the registration of the Resource Centre as a candidate and the certification procedure. Registration is only needed for new Resource Centres.
Operations Centre
- The Operations Centre is responsible for registering (if applicable) and for certifying the Resource Centre.
- The Operations Centre is responsible for registering an accepted Resource Centre in the EGI configuration repository GOCDB.
- The Operations Centre MUST collect the mandatory information specified by the Resource Centre registration procedure, and MUST accurately input the supplied data into the GOCDB.
- The Operations Centre MUST integrate Resource Centre information in all operations tools as needed, such as the local NAGIOS server for monitoring of certified Resource Centres, the local helpdesk (if available) for the registration of the Resource Centre support staff, etc.
- For the case of an existing Resource Centre that is starting certification after suspension for security reasons, the Operations Centre MUST contact the EGI CSIRT to verify that all requested repair operations have been successfully applied to fix the issue.
- For other suspension cases, the Operations Centre MUST ensure that the issue that caused the suspension has been resolved.
- The Operations Centre is responsible for verifying that all tests during the 3-calendar-day certification process are successfully passed. The Operations Centre SHALL proceed with changing the Resource Centre status in GOCDB to certified only if this condition is met.
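The 3-day condition in the last item can be sketched as a small check. This is only an illustration: the function name `ready_for_certified` and the shape of `daily_results` (one pass/fail flag per calendar day) are assumptions, not part of any EGI tool, and a day without a recorded result is treated here as a failure.

```python
from __future__ import annotations

from datetime import date, timedelta

# Number of consecutive calendar days required by the procedure text.
REQUIRED_CONSECUTIVE_DAYS = 3


def ready_for_certified(daily_results: dict[date, bool], today: date) -> bool:
    """Return True only if all tests passed on each of the last 3 calendar days.

    `daily_results` is a hypothetical mapping from a calendar day to whether
    every preliminary test passed on that day.
    """
    for offset in range(REQUIRED_CONSECUTIVE_DAYS):
        day = today - timedelta(days=offset)
        # Assumption: a missing day counts as a failure, because nothing
        # shows that the tests passed on that day.
        if not daily_results.get(day, False):
            return False
    return True
```

A single failed or missing day inside the window makes the check return False, so the status change is withheld until three full passing days have accumulated.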
Workflow
The various steps required by both the Resource Infrastructure Operations Manager and the Resource Centre Operations Manager are explained in the tables below. The first part for a new Resource Centre is the registration process. The actual certification process, in the second table, is applicable to both new and suspended Resource Centres.
The general status flow that a Resource Centre is allowed to follow is illustrated by the following diagram. Information on Resource Centre status and on how to manipulate it is available from GOCDB Documentation.
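As a rough illustration of a status flow, the sketch below encodes transitions between the four states this document names. The transition table is an assumption pieced together from the text (registration ends in "Candidate", certification moves "Uncertified" to "Certified", and suspended Resource Centres can be re-certified); the GOCDB Documentation remains the authoritative source and may allow other transitions. The names `ALLOWED_TRANSITIONS`, `can_transition` and `change_status` are hypothetical.

```python
from __future__ import annotations

# Assumed transitions, inferred from the states named in this document.
# This is NOT taken from GOCDB; check the GOCDB Documentation for the real flow.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "Candidate": {"Uncertified"},
    "Uncertified": {"Certified"},
    "Certified": {"Suspended"},
    "Suspended": {"Uncertified"},
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the assumed flow allows moving from `current` to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def change_status(current: str, target: str) -> str:
    """Return the new status, or raise ValueError for a transition not in the table."""
    if not can_transition(current, target):
        raise ValueError(f"transition {current!r} -> {target!r} is not allowed")
    return target
```

Keeping the table as plain data makes it easy to correct once the actual flow has been checked against GOCDB.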
Resource Centre registration
Requirements
- A Resource Centre MUST be part of a Resource Infrastructure and MUST be operated by an Operations Centre. If there is no suitable provider for your country, an Operations Centre may first have to be created. A procedure exists for this, and it is documented in the Operations Centre creation procedure.
- To satisfy Grid security requirements a Resource Centre registration procedure must capture and maintain at least the following information. The comprehensive list of required information is available (here).
- The full name of the Resource Centre.
- An abbreviated name of the Resource Centre, which must be unique within the Grid, and preferably globally unique.
- The name, email address and telephone number of the Resource Centre Operations Manager and Resource Centre Security Contact in accordance with the requirements of the Resource Centre Operations Policy.
- The email address of a managed list for contact with Resource Centre Administrators at the Resource Centre.
- The email address of a managed list for contact with the Resource Centre security incident response team.
- If a Resource Centre wishes to leave the Grid or the Grid decides to remove the Resource Centre, the registration information MUST be kept by GOCDB for at least the same period defined for logging in the Traceability and Logging Policy. Personal registration information of the Resource Centre Operations Manager and Security Contact of the Resource Centre leaving the Grid MUST NOT be retained for longer than one year.
- It is RECOMMENDED that email contacts for the Resource Centre Administrators and Security Officer(s) are mailing lists, and not individuals.
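The minimum registration information listed above could be captured in a record like the following. This is a hedged sketch: the class and field names (`Contact`, `RegistrationRecord`, `registration_problems`) are invented for illustration and do not reflect the GOCDB schema, and the uniqueness check only compares against a set of names supplied by the caller.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Contact:
    name: str
    email: str
    telephone: str


@dataclass
class RegistrationRecord:
    full_name: str
    short_name: str                # abbreviated name, unique within the Grid
    operations_manager: Contact
    security_contact: Contact
    admin_list_email: str          # managed list for Resource Centre Administrators
    security_list_email: str       # managed list for the incident response team


def registration_problems(record: RegistrationRecord,
                          existing_short_names: set[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the minimum is met."""
    problems: list[str] = []
    text_fields = {
        "full_name": record.full_name,
        "short_name": record.short_name,
        "admin_list_email": record.admin_list_email,
        "security_list_email": record.security_list_email,
    }
    for label, value in text_fields.items():
        if not value.strip():
            problems.append(f"{label} is missing")
    for role in ("operations_manager", "security_contact"):
        contact = getattr(record, role)
        for attr in ("name", "email", "telephone"):
            if not getattr(contact, attr).strip():
                problems.append(f"{role}.{attr} is missing")
    # Hypothetical uniqueness check against names already registered.
    if record.short_name in existing_short_names:
        problems.append("short_name is already in use")
    return problems
```

Returning a list of problems rather than raising on the first one lets the Operations Centre report everything that is missing in a single round trip with the Resource Centre Operations Manager.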
<Comment: additional constraints - if any - on information that is registered need to be specified here>
Steps
The following steps are only applicable if the Resource Centre is not already registered in GOCDB. They describe the procedure for a Resource Centre Operations Manager who requests that the respective Resource Centre join the EGI infrastructure.
- Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
- Actions tagged RIP are the responsibility of the Resource Infrastructure Operations Manager.
- Actions tagged OC are the responsibility of the Operations Centre.
# | Responsible | Action |
---|---|---|
0 | RC | |
1 | RIP | |
2 | OC | |
3 | OC | |
4 | RC | |
5 | RC or OC | |
6 | OC | |
7 | OC | |
8 | OC | |
After the successful completion of all these steps, the Resource Centre is considered to be in the "Candidate" state and is ready for the certification process.
Resource Centre certification
Requirements
- The Resource Centre Certification procedure is applicable only to Resource Centres in the "Candidate" or "Suspended" state.
- In order to enter certification, the Resource Centre Operations Manager SHALL accept the Resource Centre OLA.
- A Resource Centre can successfully pass certification only if the conditions required by the Resource Centre OLA are met.
Steps
The following is a detailed description of the steps required for the transition from the "Uncertified" to the "Certified" state of the Resource Centre.
- Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
- Actions tagged RIP are the responsibility of the Resource Infrastructure Operations Manager.
- Actions tagged OC are the responsibility of the Operations Centre.
# | Responsible | Action |
---|---|---|
0 | RIP | |
1 | RC | |
2 | RIP | |
3 | OS | |
4 | OS | There are detailed examples for how to do this in SiteCertMan/GIIS_BDII_check. |
5 | OS | Details for submitting manual tests can be found at SiteCertMan/Grid_manual_tests. |
6 | OS | |
7 | OS | |
8 | OS | |
9 | OS | |
10 | OS | |
11 | OS | |
After the successful completion of these steps, the Resource Centre is considered as "Certified".
Revision history
Version | Authors | Date | Comments |
---|---|---|---|
0.8 | Tiziana Ferrari | 2011-03-11 | Updated introduction, adopted MUST SHALL etc. terminology, proposed some changes to terminology, added a section with a list of responsibilities, added a few comments into the text to request clarifications. |
0.7 | Vera Hansper | 2011-02-02 | Updated introduction to include roles, etc. and added required documentation link for policies |