Difference between revisions of "PROC09 Resource Centre Registration and Certification"
Revision as of 18:39, 10 March 2011
Title | Site Certification Procedure |
Document link | to be determined |
Last modified | |
Policy Group Acronym | |
Policy Group Name | Operational Documentation |
Contact Person | Vera Hansper |
Document Status | DRAFT |
Approved Date | |
Procedure Statement | A procedure for the steps involved to both register and certify new sites in the EGI infrastructure. The certification step can also be used to re-certify suspended sites. |
Introduction
Certification is a prerequisite for a Resource Centre (aka site) to become part of a Resource Infrastructure such as a National Grid Initiative (NGI) or EIRO (in Europe), or a multi-country Resource Infrastructure.
This document describes the steps required
- to register and certify a new site,
- to re-certify a site which has been suspended.
Note: A separate document provides the process for decommissioning a site.
Through its parent Resource Infrastructure, a certified Resource Centre becomes a member of the EGI Resource Infrastructure to make resources available to international user communities.
A certified site guarantees a minimum quality of service for these resources (currently expressed in terms of monthly availability and reliability). It must ensure that problems are handled in a timely fashion, and it must understand and adhere to a common set of policies and procedures. By contrast, an uncertified or test Resource Centre provides no guarantee on the availability or usability of its resources.
Definitions
The entities involved in this procedure are defined in the EGI Glossary.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Entities involved in the procedure
- Resource Centre (or Site) Operations Manager, who is responsible for initiating the certification process by applying for membership of a Resource Infrastructure
- Resource Infrastructure Operations Manager, who is responsible for approving the integration of a new Resource Centre into the respective Infrastructure
- Operations Centre (ROD), who is technically responsible for carrying out the Resource Centre certification part of the procedure, once the membership is approved
The Resource Infrastructure Operations Manager can determine with the Site Operations Manager the level of involvement of these other actors.
Site registration and certification procedure
As NGIs are responsible for all sites within their jurisdiction, the site needs to contact its NGI manager about its intention to join the EGI infrastructure. If this is a new site, then the site must be registered by the NGI manager as a candidate for consideration into the EGI infrastructure. This step is not applicable to a suspended site.
The steps then required of both the NGI manager and the Site Operations Manager are explained in the tables below. For a new site, the first part is the registration process. Note that a Site Security Officer is required for each site, and that mailing lists are required both for the site's CSIRT and for contacting the site administrators. Details of the required information, which is entered into EGI's infrastructure database, the GOCDB, are linked from step one of the first table (SiteCertMan/Required_information).
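As a rough illustration of the registration prerequisites named above, the following sketch checks that a site record carries a security officer and the two mailing lists before it is submitted. The field names and the record layout are hypothetical and are not the GOCDB schema; the authoritative list of required information is the one linked from SiteCertMan/Required_information.

```python
# Hypothetical field names, chosen for illustration only (not the GOCDB schema).
REQUIRED_FIELDS = (
    "security_officer",     # a Site Security Officer is required for each site
    "csirt_mailing_list",   # mailing list for the site's CSIRT
    "admin_mailing_list",   # mailing list for contacting the site administrators
)


def missing_registration_fields(record):
    """Return the required fields that are absent or blank in `record`."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field) or "").strip()
    ]


example_record = {
    "security_officer": "Site Security Officer",
    "csirt_mailing_list": "csirt@site.example.org",
    "admin_mailing_list": "",
}
print(missing_registration_fields(example_record))
```

An empty result means the record contains the three items this paragraph calls out; it says nothing about the other information the registration step may require.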
Further, before a site can be certified, it is important that the Site Operations Manager reads and accepts the Grid Site Operations Policy, the Site Registration Security Policy and the NGI/SITE OLA. The links for these are found in the required documentation section.
There are also a number of steps which require the integration of the site with monitoring tools, and during the certification process, the site should become registered into the NGI's NAGIOS instance. Once the site has passed all tests consecutively for 2 to 3 days, it can be marked as certified. The actual certification process, in the second table, is applicable to both new and suspended sites.
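The "passed all tests consecutively for 2 to 3 days" criterion can be sketched as a small check over per-day results. This is only an illustration: the input format (a mapping from calendar day to an overall pass/fail flag) and the function name are assumptions, and treating a day with no results as a failure is a design choice made here, not something the procedure specifies.

```python
from datetime import date, timedelta


def passed_consecutive_days(daily_results, required_days=2):
    """Return True if the latest `required_days` calendar days all passed.

    `daily_results` maps a date to True (every test passed that day) or
    False. A day with no entry is treated as a failure (an assumption).
    """
    if not daily_results:
        return False
    latest = max(daily_results)
    for offset in range(required_days):
        day = latest - timedelta(days=offset)
        if not daily_results.get(day, False):
            return False
    return True


results = {
    date(2011, 3, 8): False,
    date(2011, 3, 9): True,
    date(2011, 3, 10): True,
}
print(passed_consecutive_days(results, required_days=2))
print(passed_consecutive_days(results, required_days=3))
```

With `required_days=2` the example satisfies the lower bound of the criterion; with `required_days=3` it does not, because the earliest of the three days failed.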
The general status flow that a site is allowed to follow is neatly given by the following:
One final point: It is highly recommended that email contacts for the site's administrators and security officer(s) are mailing lists, and not individuals.
Site registration procedure
These steps describe what a site that wishes to join the EGI infrastructure needs to do, and are applicable to a site not already registered in the GOCDB.
Actions falling on the NGI are the responsibility of the NGI manager. Actions falling on the Site are the responsibility of the Site Manager/Representative.
Note that a site MUST be part of an NGI or group of NGIs. If there is no suitable NGI for your country, the NGI may first have to be created. In this case, please see NewNGIs_creation for how to create a new NGI.
# | Responsible | Action |
---|---|---|
1 | Site |
|
2 | NGI |
The following actions can be done in parallel:
|
3 | NGI |
|
4 | Site |
|
5 | Site or NGI |
|
6 | NGI |
|
7 | NGI |
|
8 | NGI |
|
After the successful completion of all these steps, the site is considered to be in the "Candidate" state and is ready for the certification process.
Site certification procedure
The Site Certification procedure is applicable both to new sites which have reached the "Candidate" state and to suspended sites. The following is a detailed description of the steps required for the transition from the "Uncertified" to the "Certified" state of the site.
# | Responsible | Action |
---|---|---|
1 | Site |
|
2 | NGI |
|
3 | NGI | Check that the GIIS (gLite: BDII) is working, and publishing coherent values, namely:
There are detailed examples of how to do this in SiteCertMan/GIIS_BDII_check. |
4 | NGI |
Check that the registered services are fully functional by performing manual tests, e.g. from the UI or from a dedicated SAM/Nagios testbed infrastructure provided by the NGI. There is an example of how to create a testbed Nagios at this page. Contact the site admins if there are problems, and ensure that they fix them. Include the ROD and help-desk teams if necessary. Iterate this step with the site admins until the tests pass. The prime tests to check are:
Details for submitting manual tests can be found at SiteCertMan/Grid_manual_tests. |
5 | NGI |
|
6 | NGI | After two days check that the site appears in all operational tools. If there are problems with a specific tool, open GGUS tickets to the relevant Support Units. The major tools that are relevant are:
|
7 | NGI | Ensure that, before the end of the maintenance downtime
|
8 | NGI |
|
9 | NGI |
|
10 | NGI | (Optional?) The NGI can broadcast that a new site is now part of the EGI infrastructure. |
After the successful completion of these steps, the site is considered to be "Certified".
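The site states named in this procedure can be sketched as a small transition table. The state names come from the text above; the transition set is an assumption inferred from it (in particular, that "Candidate" and "Suspended" sites both enter certification through "Uncertified") and should not be read as the official status flow.

```python
from collections import deque

# State names are from this procedure; the allowed transitions are an
# assumption inferred from the text, not the official status flow.
ALLOWED_TRANSITIONS = {
    "Candidate": {"Uncertified"},
    "Uncertified": {"Certified"},
    "Certified": {"Suspended"},
    "Suspended": {"Uncertified"},
}


def can_transition(current, target):
    """Return True if a direct move from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def path_to(start, goal="Certified"):
    """Return the shortest list of states from `start` to `goal`, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(ALLOWED_TRANSITIONS.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(path_to("Candidate"))
print(path_to("Suspended"))
```

Under these assumed transitions, a newly registered site and a suspended site both reach "Certified" by way of "Uncertified", which matches the statement that the certification table applies to both.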
Revision history
Version | Authors | Date | Comments |
---|---|---|---|
0.7 | Vera Hansper | 2011-02-02 | Updated introduction to include roles, etc. and added required documentation link for policies |