Difference between revisions of "PROC09 Resource Centre Registration and Certification"
Line 169: | Line 169: | ||
#'''Forward all documentation''': | #'''Forward all documentation''': | ||
#*[[HOWTO02|necessary to be read and accept]] | #*[[HOWTO02|necessary to be read and accept]] | ||
#*[[ | #*[[Administrator Documentation|documentation how to install and configure the Resource Centre services]] | ||
#Clarify any doubts or questions. | #Clarify any doubts or questions. | ||
Line 218: | Line 218: | ||
|} | |} | ||
After the successful completion of all these steps, the registration phase is completed and the Resource Centre is ready for the start of the certification phase. | After the successful completion of all these steps, the registration phase is completed and the Resource Centre is ready for the start of the certification phase. | ||
== Resource Centre certification == | == Resource Centre certification == | ||
Line 280: | Line 280: | ||
'''Check:''' | '''Check:''' | ||
#'''GOC DB: '''All services are registered in | #'''GOC DB: '''All services are registered in GOCDB according to the requirements of the [https://documents.egi.eu/document/31 Resource Centre OLA], these are published and ALSO that services published in the GOCDB are valid. | ||
#'''Information system''': Check | #'''Information system''': Check that the GIIS (gLite, Globus, cloud: BDII) is working, and publishing coherent values | ||
#*There are detailed examples for how to do this in [[HOWTO03|GIIS/BDII check]], namely: the correct NGI is being published in GlueSiteOtherInfo (see manual MAN01 [[MAN01|How to Publish Site Information]]<br> | #*There are detailed examples for how to do this in [[HOWTO03|GIIS/BDII check]], namely: the correct NGI is being published in GlueSiteOtherInfo (see manual MAN01 [[MAN01|How to Publish Site Information]]<br> | ||
#*Site has to pass [http://gridinfo.web.cern.ch/glue/glue-validator-guide glue validator manual test] | #*Site has to pass [http://gridinfo.web.cern.ch/glue/glue-validator-guide glue validator manual test] | ||
#'''Monitoring and troubleshooting''' should be possible: | #'''Monitoring and troubleshooting''' should be possible: | ||
#*Glite, ARC: the [[OPS vo|OPS VO]] (monitoring) and the [[Dteam vo|DTEAM VO]] (troubleshooting) are configured and supported by the Resource Centre. | #*Glite, ARC, cloud: the [[OPS vo|OPS VO]] (monitoring) and the [[Dteam vo|DTEAM VO]] (troubleshooting) are configured and supported by the Resource Centre. | ||
#*UNICORE: all certificates used by monitoring users must have the 'user' role assigned using any attribute source supported by the Resource Centre and must be mapped to a local account with the same permissions as assigned to an ordinary infrastructure user. | #*UNICORE: all certificates used by monitoring users must have the 'user' role assigned using any attribute source supported by the Resource Centre and must be mapped to a local account with the same permissions as assigned to an ordinary infrastructure user. | ||
#*Globus: | #*Globus: | ||
Line 295: | Line 295: | ||
#'''Accounting''' | #'''Accounting''' | ||
#*Site successfully [[MAN09#Testing_procedure_-_sending_trial_records|sent trial accounting records]] | #*Site successfully [[MAN09#Testing_procedure_-_sending_trial_records|sent trial accounting records]] | ||
#*Host Certificate DN should be sent to APEL-ADMINS@stfc.ac.uk | |||
#'''OPS, Dteam are configured and supported. Regional VOs''' are configured and supported as needed. | #'''OPS, Dteam are configured and supported. Regional VOs''' are configured and supported as needed. | ||
#'''Site is integrated in any regional tool as needed '''(for example, the regional accounting infrastructure if present).<br> | #'''Site is integrated in any regional tool as needed '''(for example, the regional accounting infrastructure if present).<br> | ||
Line 314: | Line 315: | ||
|- valign="top" | |- valign="top" | ||
| 7 | | 7<br> | ||
| RC<br> | |||
| | |||
This step also applies to certified Resource Centres which introduce cloud resources for the first time. | |||
Fill in the [https://www.surveymonkey.com/s/Cloud_Security_Assessment_for_Resource_Centres ''security survey''] and forward the required information to the CSIRT. | |||
*The purpose of the survey is to assess that the technology used to provide cloud services fulfils the EGI security policies and procedures. | |||
|- valign="top" | |||
| 8<br> | |||
| CSIRT | | CSIRT | ||
| | | | ||
Line 321: | Line 332: | ||
*The security assessment is performed by the NGI security officers using the tools provided by, and with assistance of the EGI CSIRT. | *The security assessment is performed by the NGI security officers using the tools provided by, and with assistance of the EGI CSIRT. | ||
Cloud:<br> | |||
Checks that '''the Resource Centre passes the basic security assessment tests'''<br> | |||
*The security assessment is performed by the EGI CSIRT. | |||
*Site administrator should fill in [https://documents.egi.eu/secure/ShowDocument?docid=2114 EGI Federated Cloud Security - Questionnaire for sites deploying cloud technology] | |||
<br> | |||
|- valign="top" | |- valign="top" | ||
| | | 9<br> | ||
| OC | | OC | ||
| | | | ||
'''If all preliminary tests are passed for 3 consecutive calendar days''', declare an initial maintenance downtime and switch the Resource Centre status to 'Certified'. | '''If all preliminary tests are passed for 3 consecutive calendar days''', declare an initial maintenance downtime and switch the Resource Centre status to 'Certified'. | ||
*This ensures that Resource Centre will appear in NAGIOS | *This ensures that Resource Centre will appear in NAGIOS. | ||
*The target 'Infrastructure' value should be set to 'Production'. | *The target 'Infrastructure' value should be set to 'Production'. | ||
|- valign="top" | |- valign="top" | ||
| | | 10<br> | ||
| OC | | OC | ||
| | | | ||
Line 337: | Line 355: | ||
*appears in all operational tools<br> | *appears in all operational tools<br> | ||
**Regional NAGIOS (NAGIOS) | **Grid: Regional NAGIOS (NAGIOS) <br> | ||
**Cloud: [https://cloudmon.egi.eu/nagios/ Cloud NAGIOS ](NAGIOS) | |||
**GGUS - the Resource Centre appears in the "Notified Site" field - [https://ggus.eu/ws/ticket_search.php GGUS search] | **GGUS - the Resource Centre appears in the "Notified Site" field - [https://ggus.eu/ws/ticket_search.php GGUS search] | ||
***And all Nagios tests are passed | ***And all Nagios tests are passed | ||
**[https://grid-monitoring.cern.ch/myegi/ MyEGI] | *accounting data is properly published<br> | ||
*[https://grid-monitoring.cern.ch/myegi/ MyEGI] | |||
<br> If there are problems with a specific tool, open GGUS tickets to the relevant Support Units. | <br> If there are problems with a specific tool, open GGUS tickets to the relevant Support Units. | ||
Line 348: | Line 367: | ||
|- valign="top" | |- valign="top" | ||
| | | 11 | ||
| OC | | OC | ||
| '''Notify the Resource Centre Operations Manager that the Resource Centre is certified'''<br> <br> | | '''Notify the Resource Centre Operations Manager that the Resource Centre is certified'''<br> <br> | ||
|- valign="top" | |- valign="top" | ||
| | | 12 | ||
| OC | | OC | ||
| | | | ||
Line 361: | Line 380: | ||
|} | |} | ||
<u>After the successful completion of these steps, the Resource Centre is considered as "Certified".</u> | <u>After the successful completion of these steps, the Resource Centre is considered as "Certified".</u> | ||
= Revision History = | = Revision History = | ||
Line 377: | Line 396: | ||
| RC Certification steps: Step 5 added part concerning QCG | | RC Certification steps: Step 5 added part concerning QCG | ||
|- | |- | ||
| | | <br> | ||
| M. Krakowian | | M. Krakowian | ||
| 19 August 2014 | | 19 August 2014 | ||
| Change contact group - | | Change contact group -> Operations support | ||
|} | |} | ||
[[Category:Operations_Procedures]] | [[Category:Operations_Procedures]] |
Revision as of 16:10, 1 October 2014
Title | Resource Centre Registration and Certification |
Document link | https://wiki.egi.eu/wiki/PROC09 |
Last modified | 2.0 19 August 2014 |
Policy Group Acronym | OMB |
Policy Group Name | Operations Management Board |
Contact Group | operations-support@mailman.egi.eu |
Document Status | Approved |
Approved Date | 30.10.2012 |
Procedure Statement | A procedure for the steps involved to both register and certify new Resource Centres (sites) in the EGI infrastructure. The certification step can also be used to re-certify suspended Resource Centres (sites). |
Owner | Owner of procedure |
Overview
Certification is a verification process for a Resource Centre (aka site) to become part of a Resource Infrastructure such as a National Grid Initiative (NGI), an EIRO, or a multi-country Resource Infrastructure.
This document describes the steps required to
- register and certify a new Resource Centre,
- re-certify a Resource Centre which has been suspended.
A separate document provides the process for decommissioning a Resource Centre.
Through its parent Resource Infrastructure, a certified Resource Centre becomes a member of the EGI Resource Infrastructure to make resources available to international user communities.
The main difference between a certified Resource Centre and an uncertified or test Resource Centre is that a certified Resource Centre provides and guarantees a minimum quality of service of the resources (currently expressed in terms of monthly availability and reliability). All the requirements can be found in the Resource Centre OLA.
Definitions
- Resource Centre refers to the definition in the "Resource Centre OLA".
- In this document, the term "site" is deprecated, and Resource Centre has been used in its place.
Please refer to the EGI Glossary for the definitions of the terms used in this procedure.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Entities involved in the procedure
The following entities are involved in the process described in this procedure:
- Resource Centre Operations Manager
- A person who is responsible for initiating the certification process by applying for membership to a Resource Infrastructure, e.g. a site administrator.
- Resource Infrastructure Operations Manager
- A person who is responsible for approving the integration of a new Resource Centre into the respective Infrastructure. e.g. NGI manager.
- EGI Resource infrastructure Providers are listed on the EGI web site
- Operations Centre (Resource Infrastructure)
- An entity which is technically responsible for carrying out the Resource Centre certification part of the procedure, once the membership is approved.
- A list of EGI Operations Centres with their respective contact information is available from the GOCDB (access restricted - grid certificate needed)
- Response Team
- EGI entity which is technically responsible for carrying out the security certification.
- contact
The Resource Infrastructure Operations Manager can determine with the Resource Centre Operations Manager the level of involvement of other actors.
Prerequisites and responsibilities
Resource Centre Operations Manager
The Resource Centre Operations Manager is:
- responsible for all Resource Centres within its respective jurisdiction. For this reason, the Resource Centre Operations Manager is REQUIRED to
- contact the respective Operations Centre if the Resource Centre is located in Europe,
- contact the respective Resource infrastructure Provider active in a relevant geographical area if the Resource Centre is outside Europe.
- If needed, EGI Operations can assist the Resource Centre Operations Manager to get in contact with the relevant partner.
- REQUIRED to provide the necessary Resource Centre information needed to complete the registration process, and he/she is responsible for its accuracy and maintenance.
- responsible for reading, understanding and accepting:
- the Resource Centre Operational Level Agreement (the obligations of a Resource Centre)
- the Grid Security Policy
- the Grid Resource Centre Operations Policy
- the Resource Centre Registration Security Policy
- all other policies for all EGI participants from the Security Policy Group
To become a site administrator, go through the following steps.
Resource Infrastructure Operations Manager
Resource Infrastructure Operations Manager:
- is REQUIRED to be responsible for all Resource Centres within its respective jurisdiction.
- MUST attend to Resource Centre certification applications and MUST provide timely feedback to the requesting partners, accepting or rejecting the requests received.
- is responsible for keeping records of the Resource Centre Operations Manager OLA agreement, as deemed suitable by the Resource infrastructure Provider
- for example, through a signed e-mail agreement, a collection of signatories on a paper copy of the OLA, or other means.
Operations Centre
The Operations Centre:
- is responsible for registering (if applicable) and certifying the Resource Centre.
- (In the case of re-certification) MUST ensure that the issue that caused the suspension has been resolved.
- (After suspension for security reasons) MUST contact the EGI CSIRT to verify that all requested repair operations have been successfully applied to fix the issue.
Resource Centre status workflow
The general status flow that a Resource Centre is allowed to follow is illustrated by the following diagram.
Information on Resource Centre status and on how to manipulate it is available from GOCDB Documentation.
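The status changes that the text of this procedure names explicitly can be sketched as a small lookup table. This is a minimal illustration in Python, not the GOCDB rule set: the names `ALLOWED_TRANSITIONS` and `can_transition` are hypothetical, and the table only contains the transitions mentioned in the Timelines and certification Steps sections. The workflow diagram and the GOCDB documentation remain the authoritative source and may allow further transitions.

```python
# Hypothetical sketch: only the status changes named in the text of this
# procedure are listed here. The GOCDB workflow diagram is authoritative.
ALLOWED_TRANSITIONS = {
    # Certification step 3 flags the Resource Centre as "Uncertified";
    # the Timelines section says an overdue Resource Centre SHOULD be closed.
    "Candidate": {"Uncertified", "Closed"},
    "Suspended": {"Uncertified", "Closed"},
    # Certification step 9 switches the status to "Certified".
    "Uncertified": {"Certified"},
}


def can_transition(current: str, target: str) -> bool:
    """Return True if this sketch lists `current` -> `target` as allowed."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


if __name__ == "__main__":
    print(can_transition("Candidate", "Uncertified"))
    print(can_transition("Candidate", "Certified"))
```

A lookup table keeps the rule readable and makes it easy to add transitions once they are confirmed against the diagram.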
Timelines
A Resource Centre cannot be in
- Candidate state for more than two months
- Suspended state for more than four months
After this period the Resource Centre SHOULD be closed.
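As an illustration of the timelines above, the following Python sketch checks whether a Resource Centre has exceeded the allowed time in its current state. It assumes that "months" means calendar months counted from the date the state was entered, which the procedure does not specify. The names `MAX_MONTHS_IN_STATE`, `add_months` and `closure_due` are hypothetical and are not part of any EGI tool.

```python
import calendar
from datetime import date

# Limits taken from the Timelines section of this procedure.
MAX_MONTHS_IN_STATE = {"Candidate": 2, "Suspended": 4}


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def closure_due(status: str, entered_on: date, today: date) -> bool:
    """Return True if the state has lasted longer than the allowed period.

    Assumption: the limit is counted in calendar months from `entered_on`.
    States without a limit in this procedure never trigger closure here.
    """
    limit = MAX_MONTHS_IN_STATE.get(status)
    if limit is None:
        return False
    return today > add_months(entered_on, limit)


if __name__ == "__main__":
    print(closure_due("Candidate", date(2014, 8, 1), date(2014, 10, 2)))
```

Because the procedure uses SHOULD rather than MUST, a result of `True` would be a prompt for the Operations Centre to review the Resource Centre, not an automatic closure.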
Resource Centre registration
Requirements
A Resource Centre MUST
- find a respective Resource Infrastructure which will provide operational services to the Resource Centre. If a provider is not yet available for your country, then an alternative existing Operations Centre can be contacted.
- provide required information: HOWTO01 Site Certification Required Information.
Notes: If a Resource Centre wishes to leave the Grid or the Grid decides to remove the Resource Centre, the registration information MUST be kept by GOCDB for at least the same period defined for logging in the Traceability and Logging Policy. Personal registration information of the Resource Centre Operations Manager and Security Contact of the Resource Centre leaving the Grid MUST NOT be retained for longer than one year.
Steps
The following steps are only applicable if the Resource Centre is not already registered in GOCDB.
- Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
- Actions tagged RP are the responsibility of the Resource Infrastructure Operations Manager.
- Actions tagged OC are the responsibility of the Operations Centre
# | Responsible | Action |
---|---|---|
0 | RC |
Contact your Resource Infrastructure Operations Manager (contact information is available at EGI web site).
|
1 | RP |
Accept or reject registration request and communicate this result back to applicant.
|
2 | OC |
Include the Operations Centre ROD, CSIRT, or help-desk teams in the step if necessary. |
3 | OC |
|
4 | RC |
|
5 | OC |
Check GOC DB that the Resource Centre's information is correct.
|
6 | OC |
Any other Operations Centre-specific requirements (e.g. join a certain VO and/or mailing list, etc.) |
7 | OC |
If all previous actions have been completed with success, notify the Resource Centre Operations Manager that the Registration is completed, and contact the Resource Infrastructure Operations Manager to notify that a new candidate Resource Centre exists and is ready to be certified. |
After the successful completion of all these steps, the registration phase is completed and the Resource Centre is ready for the start of the certification phase.
Resource Centre certification
Requirements
- The Resource Centre Certification procedure is only applicable to Resource Centres in the "Candidate" or "Suspended" state in GOC DB.
- A Resource Centre can successfully pass certification only if the conditions required by the Resource Centre OLA are met.
Steps
The following is a detailed description of the steps required for the transition from the "Candidate"/"Suspended" to the "Certified" state of the Resource Centre.
- Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
- Actions tagged RP are the responsibility of the Resource Infrastructure Operations Manager.
- Actions tagged OC are the responsibility of the Operations Centre
- Actions tagged CSIRT are the responsibility of the Computer Security Incident Response Team
# | Responsible | Action |
---|---|---|
0 | RP |
The Resource Infrastructure Operations Manager contacts the Resource Centre Operations Manager to request the subscription of the Resource Centre OLA. |
1 | RC |
The Resource Centre Operations Manager notifies the Resource Infrastructure Operations Manager that the Resource Centre OLA is accepted (if the Resource Centre has not already endorsed it, for example in the case of a suspended Resource Centre), and that the Resource Centre is ready to start certification. |
2 | RP |
The Resource Infrastructure Operations Manager contacts the Operations Centre asking to start the certification process. |
3 | OC |
If the Resource Centre is in the "Candidate" or "Suspended" state, then flag the Resource Centre as "Uncertified".
|
4 | OC |
Add Resource Centre contact information to any regional mailing list and provide access to regional tools as required. |
5 | OC |
Check:
|
6 | OC |
Check that the registered services are fully functional by performing manual tests.
Details for submitting manual tests can be found at Manual tests. |
7 |
RC |
This step also applies to certified Resource Centres which introduce cloud resources for the first time. Fill in the security survey and forward the required information to the CSIRT.
|
8 |
CSIRT |
Checks that the Resource Centre passes the basic security assessment tests, consisting of a suite of Nagios security probes and the patch status monitoring tool (e.g. Pakiti). In particular, the Resource Centre MUST NOT reveal critical vulnerabilities as defined by SVG/CSIRT.
Cloud: Checks that the Resource Centre passes the basic security assessment tests
|
9 |
OC |
If all preliminary tests are passed for 3 consecutive calendar days, declare an initial maintenance downtime and switch the Resource Centre status to 'Certified'.
|
10 |
OC |
The downtime should not be closed until the Resource Centre
Wait at least two days after the switch to the 'Certified' status before opening the ticket, because the propagation of the new status to the operational tools or the publication of accounting data may take one or two days. |
11 | OC | Notify the Resource Centre Operations Manager that the Resource Centre is certified |
12 | OC |
The Operations Centre can broadcast that a new Resource Centre is now part of the EGI infrastructure. This step is OPTIONAL. |
After the successful completion of these steps, the Resource Centre is considered as "Certified".
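The condition in step 9 ("all preliminary tests are passed for 3 consecutive calendar days") can be sketched as a simple check over per-day results. This is a minimal Python illustration under stated assumptions: the input is a hypothetical mapping from calendar day to an overall pass/fail flag, a day with no recorded result counts as not passed, and the three days are taken to be the most recent ones ending on the day of the check. The function name `ready_for_certified` is invented for this example and does not correspond to any Nagios or GOCDB interface.

```python
from datetime import date, timedelta
from typing import Mapping

# Number of consecutive calendar days required by step 9.
REQUIRED_CONSECUTIVE_DAYS = 3


def ready_for_certified(daily_results: Mapping[date, bool], today: date) -> bool:
    """Return True if all tests passed on each of the last three days.

    Assumptions: `daily_results` maps a calendar day to True when every
    preliminary test passed on that day; missing days count as failures.
    """
    for offset in range(REQUIRED_CONSECUTIVE_DAYS):
        day = today - timedelta(days=offset)
        if not daily_results.get(day, False):
            return False
    return True


if __name__ == "__main__":
    results = {
        date(2014, 9, 29): True,
        date(2014, 9, 30): True,
        date(2014, 10, 1): True,
    }
    print(ready_for_certified(results, date(2014, 10, 1)))
```

Treating missing days as failures is the conservative choice: a gap in monitoring data then delays the switch to 'Certified' instead of hiding a possible problem.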
Revision History
Version | Authors | Date | Comments |
---|---|---|---|
Malgorzata | 18.03 | RC Certification steps: Step 5 added part concerning QCG | |
M. Krakowian | 19 August 2014 | Change contact group -> Operations support |