The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other supports; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations@egi.eu.

Difference between revisions of "PROC09 Resource Centre Registration and Certification"

(34 intermediate revisions by 5 users not shown)
Line 11: Line 11:
|Approval_date = 30.10.2012
|Approval_date = 30.10.2012
|Procedure_statement = A procedure for the steps involved to both register and certify new Resource Centres (sites) in the EGI infrastructure. The certification step can also be used to re-certify suspended Resource Centres (sites).
|Procedure_statement = A procedure for the steps involved to both register and certify new Resource Centres (sites) in the EGI infrastructure. The certification step can also be used to re-certify suspended Resource Centres (sites).
|Owner = Matthew Viljoen
}}  
}}  


Line 54: Line 55:
;Resource Infrastructure Operations Manager  
;Resource Infrastructure Operations Manager  
:A person who is responsible for approving the integration of a new Resource Centre into the respective Infrastructure. e.g. NGI manager.<br>  
:A person who is responsible for approving the integration of a new Resource Centre into the respective Infrastructure. e.g. NGI manager.<br>  
:EGI Resource infrastructure Providers are listed on the EGI [https://www.egi.eu/infrastructure/Resource-providers/index.html web site]  
:EGI Resource infrastructure Providers are listed on the EGI [https://www.egi.eu/federation/data-centres/ web site]  
;Operations Centre (Resource Infrastructure)  
;Operations Centre (Resource Infrastructure)  
:An entity which is technically responsible for carrying out the Resource Centre certification part of the procedure, once the membership is approved.  
:An entity which is technically responsible for carrying out the Resource Centre certification part of the procedure, once the membership is approved.  
:A list of EGI Operations Centres with their respective contact information is available from the [http://go.egi.eu/operations-centres GOCDB] (access restricted - grid certificate needed)  
:A list of EGI Operations Centres with their respective contact information is available from the [http://go.egi.eu/NGIs-list GOCDB] (access restricted - grid certificate needed)  
;Response Team  
;Response Team  
:EGI&nbsp;entity which is technically responsible for carrying out the security certification.  
:EGI&nbsp;entity which is technically responsible for carrying out the security certification.  
:[[Security#Contact|contact]]
:[[Security#Contact|contact]]


<br> The Resource Infrastructure Operations Manager can determine with the Resource Centre Operations Manager the level of involvement of other actors.  
<br> The Resource Infrastructure Operations Manager can determine with the Resource Centre Operations Manager the level of involvement of other actors.


= Prerequisites and responsibilities  =
= Prerequisites and responsibilities  =
Line 74: Line 75:
#*contact the respective Resource infrastructure Provider active in a relevant geographical area if the Resource Centre is outside Europe.  
#*contact the respective Resource infrastructure Provider active in a relevant geographical area if the Resource Centre is outside Europe.  
#**If needed, EGI Operations can assist the Resource Centre Operations Manager to get in contact with the relevant partner.<br>  
#**If needed, EGI Operations can assist the Resource Centre Operations Manager to get in contact with the relevant partner.<br>  
#REQUIRED to provide [[PROC09 Resource Centre Registration and Certification#Requirements|the necessary Resource Centre information]] needed to complete the registration process, and he/she is responsible for its accuracy and maintenance. <br>
#REQUIRED to provide [[PROC09 Resource Centre Registration and Certification#Requirements|the necessary Resource Centre information]] needed to complete the registration process, and he/she is responsible for its accuracy and maintenance.
#*on a yearly basis, he/she has to review the information registered on GOC-DB regarding his/her Resource Centre, in particular:
#**E-Mail
#**telephone numbers
#**CSIRT E-Mail
#**people and roles
#**service endpoints
#responsible for reading, understanding and accepting:  
#responsible for reading, understanding and accepting:  
#*the [https://documents.egi.eu/document/31 Resource Centre Operational Level Agreement] (the obligations of a Resource Centre)<br>  
#*the [https://documents.egi.eu/document/31 Resource Centre Operational Level Agreement] (the obligations of a Resource Centre)<br>  
Line 82: Line 89:
#*all other policies for all EGI participants from the [[SPG:Documents|Security Policy Group]]
#*all other policies for all EGI participants from the [[SPG:Documents|Security Policy Group]]


'''<big>To become Site administrator go through [[EGI Operations Start Guide#Joining_operations|'''following steps''']]</big>'''.  
'''<big>To become Site administrator go through [[EGI Operations Start Guide#Joining_operations|'''following steps''']]</big>'''.


== Resource Infrastructure Operations Manager  ==
== Resource Infrastructure Operations Manager  ==
Line 100: Line 107:
#(In the case of re-certification)MUST ensure that the issue that caused the suspension has been resolved  
#(In the case of re-certification)MUST ensure that the issue that caused the suspension has been resolved  
#*(After suspension for security reason)MUST contact the EGI CSIRT to verify that all requested repair operations have been successfully applied to fix the issue.
#*(After suspension for security reason)MUST contact the EGI CSIRT to verify that all requested repair operations have been successfully applied to fix the issue.
 
#is responsible for maintaining accurate the information registered on GOC-DB regarding the NGI itself and the RCs:
<br>
#*on a yearly basis, he/she is requested to review in particular:
#**E-Mail
#**ROD E-Mail
#**Security E-Mail
#**people and roles
#**the status of the "not certified" RCs, in according to the [https://wiki.egi.eu/wiki/PROC09#Resource_Center_status_Workflow RC Status Workflow]
#**Service Groups


= Resource Center status Workflow  =
= Resource Center status Workflow  =
Line 153: Line 166:
| RC  
| RC  
|  
|  
'''Contact your Resource Infrastructure Operations Manager''' (contact information is available at [http://www.egi.eu/community/resource-providers/ EGI web site]).  
'''Contact your Resource Infrastructure Operations Manager''' (contact information is available on [https://goc.egi.eu/portal/index.php?Page_Type=NGIs GOCDB]).  


*Provide the required information according to the template available in the [[HOWTO01|Required information]] page.
*Provide the required information according to the template available in the [[HOWTO01|Required information]] page.
Line 184: Line 197:
#Approve in [https://goc.egi.eu/ GOCDB] 'Resource Centre Operations Manager' role.&nbsp;  
#Approve in [https://goc.egi.eu/ GOCDB] 'Resource Centre Operations Manager' role.&nbsp;  
#Approve in [https://goc.egi.eu/ GOCDB] 'Resource Centre Security Officer' role.&nbsp;
#Approve in [https://goc.egi.eu/ GOCDB] 'Resource Centre Security Officer' role.&nbsp;
#If the RC is joining the Federated Cloud with some services, open a GGUS ticket to EGI Operations asking to flag the RC with the scope tag "FedCloud".


|- valign="top"
|- valign="top"
Line 191: Line 205:
#'''Complete any missing information for the Resource Centre's entry in the GOCDB'''  
#'''Complete any missing information for the Resource Centre's entry in the GOCDB'''  
#*It includes [https://wiki.egi.eu/wiki/GOCDB/Input_System_User_Documentation#Service_Endpoints services] that are to be integrated into the infrastructure:
#*It includes [https://wiki.egi.eu/wiki/GOCDB/Input_System_User_Documentation#Service_Endpoints services] that are to be integrated into the infrastructure:
#** The
#** '''The production services must have the flags "''Production''" and "''Monitored''"''';
#**If the RC is joining the Federated Cloud, flag the relevant services with the scope tag "FedCloud";
#***This can be done after the FedCloud tag is assigned to the site;
#** the required information for registering a service is shown in [[HOWTO20]].
#Notify the Operations Centre that the Resource Centre information update is concluded.  
#Notify the Operations Centre that the Resource Centre information update is concluded.  
#Note: If the external RC is considering buying host certs, please make sure they source them from an EGI recognised authority. [http://www.eugridpma.org/members/worldmap/ The local national CA] (usually for free or at flat rate) is likely the best source of their SSL certificates.
#Note: If the external RC is considering buying host certs, please make sure they source them from an EGI recognised authority. [http://www.eugridpma.org/members/worldmap/ The local national CA] (usually for free or at flat rate) is likely the best source of their SSL certificates.
Line 275: Line 292:


*If it was in the "Suspended" state then check that the reason for suspension has been cleared.  
*If it was in the "Suspended" state then check that the reason for suspension has been cleared.  
**If the suspension cause is a security issue, then the EGI CSIRT needs to be contacted to verify that all requested repair operations were successfully applied by the Resource Centre Administrators to fix the issue that caused suspension. See [[SAM#Monitoring_uncertified_sites|instructions]] on how to monitor uncertified RCs.
 


|- valign="top"
|- valign="top"
Line 311: Line 328:
| OC  
| OC  
|  
|  
Check that the registered '''services are fully functional by performing [https://wiki.egi.eu/wiki/HOWTO04 manual tests]'''.<br>  
Check that the registered '''services are fully functional either by performing [https://wiki.egi.eu/wiki/HOWTO04 manual tests] or by checking on the [https://argo-mon-uncert.cro-ngi.hr/nagios/ dedicated nagios server]'''.<br>  


*Note that monitoring of uncertified Resource Centres through the ARGO/SAM framework is not yet possible.  
*Note that the uncertified Resource Centres, for being correctly monitored, have to fill in the proper services information into GOC-DB: please follow the [[HOWTO21]].
*Contact the Resource Centre admins if there are problems, and ensure that they fix them. Include the ROD, CSIRT and help-desk teams if necessary. Iterate this step with the Resource Centre admins until tests pass successfully.&nbsp;
*On the ARGO development instance it is displayed the [http://egi.ui.argo.grnet.gr/egi/CriticalUncert uncertified RCs report]
 
Contact the Resource Centre admins if there are problems, and ensure that they fix them. Include the ROD, CSIRT and help-desk teams if necessary. Iterate this step with the Resource Centre admins until tests pass successfully.&nbsp;


Details for manual tests can be found at [[HOWTO04|Manual tests]].  
Details for manual tests can be found at [[HOWTO04|Manual tests]].  
Line 338: Line 357:


*appears in all operational tools<br>  
*appears in all operational tools<br>  
**HTC:&nbsp;ARGO/SAM NAGIOS: [https://argo-mon.egi.eu/nagios/ server 1] and [https://argo-mon2.egi.eu/nagios/ server 2] <br>  
**&nbsp;ARGO/SAM NAGIOS: [https://argo-mon.egi.eu/nagios/ server 1] and [https://argo-mon2.egi.eu/nagios/ server 2] <br>  
**Cloud:&nbsp;[https://cloudmon.egi.eu/nagios/ Cloud NAGIOS ](NAGIOS)
**GGUS - the Resource Centre appears in the "Notified Site" field (GGUS queries GOC-DB once per day, during the night) - [https://ggus.eu/ws/ticket_search.php GGUS search]  
**GGUS - the Resource Centre appears in the "Notified Site" field (GGUS queries GOC-DB once per day, during the night) - [https://ggus.eu/ws/ticket_search.php GGUS search]  
*All Nagios tests are passed  
*All Nagios tests are passed  
Line 373: Line 391:
! Comments
! Comments
|-
|-
| <br>
| <br>
| Alessandro Paolini
| 2019-03-25
| updated the link to the uncertified RCs report page: http://egi.ui.argo.grnet.gr/egi/CriticalUncert
|-
| <br>
| Alessandro Paolini
| 2018-09-05
| updated the link to the uncertified RCs report page
|-
|
| Vincenzo Spinoso
| 2018-04-05
| Removed outdated link (and corresponding sentence) from Step 3 of the Resource Centre Certification procedure: ''"If the suspension cause is a security issue, then the EGI CSIRT needs to be contacted to verify that all requested repair operations were successfully applied by the Resource Centre Administrators to fix the issue that caused suspension. See instructions on how to monitor uncertified RCs. "''
|-
| <br>
| Alessandro Paolini
| 2017-09-20
| Prerequisites and responsibilities section: the Resource Centre Operations Manager and the Operations Centre are requested to review on a yearly basis the information registered on GOC-DB
|-
| <br>
| Alessandro Paolini
| 2017-03-13
| RC certification part:
step 6: it was enabled the monitoring of the uncertified sites, mentioned the HOWTO for adding the proper information on GOC-DB
 
|-
| <br>
| Alessandro Paolini
| 2017-02-24
| RC registration part:
in step 3 and 4, added a substep for flagging the FedCloud sites and services.
 
|-
| <br>
| Alessandro Paolini
| 2016-12-14
| RC certification part:
in step 9, removed the reference to cloudmon.egi.eu because it was dismissed.
 
|-
| <br>
| Alessandro Paolini
| 2016-10-28
| RC registration part:
in step 4, specified that the production services must have the flags "production" and "monitored"; added also a link to the manual HOWTO20.
 
|-
| <br>  
| Alessandro Paolini  
| Alessandro Paolini  
| 2016-10-26  
| 2016-10-26  
| RC certification part:
| RC certification part:  
*modified step 6: it is not yet possible monitoring the uncertified RCs with the new centralised monitoring framework
*modified step 6: it is not yet possible monitoring the uncertified RCs with the new centralised monitoring framework  
*modified step 9: updated the ARGO/SAM servers link, deleted the reference to MyEGI
*modified step 9: updated the ARGO/SAM servers link, deleted the reference to MyEGI
|-
|-
| <br>
| <br>  
| Alessandro Paolini  
| Alessandro Paolini  
| 2016-06-08  
| 2016-06-08  
Line 386: Line 453:
|-
|-
| <br>  
| <br>  
| Alessandro Paolini
| Alessandro Paolini  
| 2016-03-22
| 2016-03-22  
| RC Registration steps: added step 6. RC Certification steps: modified step 7.
| RC Registration steps: added step 6. RC Certification steps: modified step 7.
|-
|-

Revision as of 09:11, 6 March 2020




Title Resource Centre Registration and Certification
Document link https://wiki.egi.eu/wiki/PROC09
Last modified 8.6.2016
Policy Group Acronym OMB
Policy Group Name Operations Management Board
Contact Group operations@egi.eu
Document Status Approved
Approved Date 30.10.2012
Procedure Statement A procedure for the steps involved to both register and certify new Resource Centres (sites) in the EGI infrastructure. The certification step can also be used to re-certify suspended Resource Centres (sites).
Owner Matthew Viljoen



Overview

Certification is a verification process for a Resource Centre (aka site) to become part of a Resource Infrastructure such as a National Grid Initiative (NGI), an EIRO, or a multi-country Resource Infrastructure.


This procedure applies to Grid and/or Cloud Resource Centres.


This document describes the steps required to

  1. register and certify a new Resource Centre,
  2. re-certify a Resource Centre which has been suspended.

A separate document provides the process for decommissioning a Resource Centre.

Through its parent Resource Infrastructure, a certified Resource Centre becomes a member of the EGI Resource Infrastructure to make resources available to international user communities.

The main difference between a certified Resource Centre and an uncertified or test Resource Centre is that a certified Resource Centre provides and guarantees a minimum quality of service of the resources (currently expressed in terms of monthly availability and reliability). All the requirements can be found in the Resource Centre OLA.

Definitions

In this document, the term "site" is deprecated, and Resource Centre has been used in its place.

Please refer to the EGI Glossary for the definitions of the terms used in this procedure.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Entities involved in the procedure

The following entities are involved in the process described in this procedure:


Resource Centre Operations Manager
A person who is responsible for initiating the certification process by applying for membership to a Resource Infrastructure, e.g. a site administrator.
Resource Infrastructure Operations Manager
A person who is responsible for approving the integration of a new Resource Centre into the respective Infrastructure. e.g. NGI manager.
EGI Resource infrastructure Providers are listed on the EGI web site
Operations Centre (Resource Infrastructure)
An entity which is technically responsible for carrying out the Resource Centre certification part of the procedure, once the membership is approved.
A list of EGI Operations Centres with their respective contact information is available from the GOCDB (access restricted - grid certificate needed)
Response Team
EGI entity which is technically responsible for carrying out the security certification.
contact


The Resource Infrastructure Operations Manager can determine with the Resource Centre Operations Manager the level of involvement of other actors.

Prerequisites and responsibilities

Resource Centre Operations Manager

The Resource Centre Operations Manager is:

  1. responsible for all Resource Centres within its respective jurisdiction. For this reason, the Resource Centre Operations Manager is REQUIRED to
    • contact the respective Operations Centre if the Resource Centre is located in Europe,
    • contact the respective Resource infrastructure Provider active in a relevant geographical area if the Resource Centre is outside Europe.
      • If needed, EGI Operations can assist the Resource Centre Operations Manager to get in contact with the relevant partner.
  2. REQUIRED to provide the necessary Resource Centre information needed to complete the registration process, and he/she is responsible for its accuracy and maintenance.
    • on a yearly basis, he/she has to review the information registered on GOC-DB regarding his/her Resource Centre, in particular:
      • E-Mail
      • telephone numbers
      • CSIRT E-Mail
      • people and roles
      • service endpoints
  3. responsible for reading, understanding and accepting:

To become a Site administrator, go through the steps described in the EGI Operations Start Guide (section "Joining operations").

Resource Infrastructure Operations Manager

Resource Infrastructure Operations Manager:

  1. is REQUIRED to be responsible for all Resource Centres within its respective jurisdiction.
  2. MUST attend to Resource Centre certification applications and MUST provide feedback to the requesting partners in a timely manner, accepting or rejecting the requests received.
  3. is responsible for keeping records of the Resource Centre Operations Manager OLA agreement, as deemed suitable by the Resource infrastructure Provider
    • for example, through a signed e-mail agreement, a collection of signatories on a paper copy of the OLA, or other means.

Operations Centre

The Operations Centre:

  1. is responsible for registering (if applicable) and certifying the Resource Centre.
  2. (In the case of re-certification) MUST ensure that the issue that caused the suspension has been resolved.
    • (After suspension for a security reason) MUST contact the EGI CSIRT to verify that all requested repair operations have been successfully applied to fix the issue.
  3. is responsible for keeping the information registered on GOC-DB regarding the NGI itself and the RCs accurate:
    • on a yearly basis, the Operations Centre is requested to review in particular:
      • E-Mail
      • ROD E-Mail
      • Security E-Mail
      • people and roles
      • the status of the "not certified" RCs, according to the RC Status Workflow
      • Service Groups

Resource Center status Workflow

The general status flow that a Resource Centre is allowed to follow is illustrated by the following diagram.

Information on Resource Centre status and on how to manipulate it is available from GOCDB Documentation.

SiteStatusFlow.png


Timelines

A Resource Centre cannot be in

  • Candidate state for more than two months
  • Suspended state for more than four months

After this period the Resource Centre SHOULD be closed.
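
The time limits above can be checked mechanically. The following is a minimal sketch, assuming the date on which the Resource Centre entered its current state is known. The function name, the dictionary layout and the approximation of a month as 30 days are assumptions made for illustration; they are not defined by this procedure or by GOCDB.

```python
from datetime import date, timedelta

# Maximum time allowed in each state, taken from the Timelines section.
# A month is approximated as 30 days (an assumption of this sketch).
MAX_DAYS_IN_STATE = {
    "Candidate": 2 * 30,
    "Suspended": 4 * 30,
}


def should_be_closed(state: str, entered_on: date, today: date) -> bool:
    """Return True if the Resource Centre has exceeded the limit for its state."""
    limit = MAX_DAYS_IN_STATE.get(state)
    if limit is None:
        # This procedure defines no time limit for other states.
        return False
    return today - entered_on > timedelta(days=limit)


if __name__ == "__main__":
    # Hypothetical dates, used only to exercise the function.
    print(should_be_closed("Candidate", date(2020, 1, 1), date(2020, 3, 6)))
```

Since the procedure says the Resource Centre SHOULD (not MUST) be closed, a positive result is best treated as a prompt for review rather than an automatic action.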

Resource Centre registration

Requirements

A Resource Centre MUST

  1. find a respective Resource Infrastructure which will provide operational services to the Resource Centre. If a provider is not yet available for your country, then an alternative existing Operations Centre can be contacted.
  2. provide required information: HOWTO01 Site Certification Required Information.


Notes: If a Resource Centre wishes to leave the Grid or the Grid decides to remove the Resource Centre, the registration information MUST be kept by GOCDB for at least the same period defined for logging in the Traceability and Logging Policy. Personal registration information of the Resource Centre Operations Manager and Security Contact of the Resource Centre leaving the Grid MUST NOT be retained for longer than one year.

Steps

The following steps are only applicable if the Resource Centre is not already registered in GOCDB.

  • Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
  • Actions tagged RP are the responsibility of the Resource Infrastructure Operations Manager.
  • Actions tagged OC are the responsibility of the Operations Centre
# Responsible Action
0 RC

Contact your Resource Infrastructure Operations Manager (contact information is available on GOCDB).

  • Provide the required information according to the template available in the Required information page.
1 RP

Accept or reject the registration request and communicate the result back to the applicant.

  • If the Resource Centre is accepted, notify the relevant Operations Centre, handle the Resource Centre information received, and put the Operations Centre in contact with the Resource Centre Operations Manager.
2 OC
  1. Forward all documentation:
  2. Clarify any doubts or questions.

Include the Operations Centre ROD, CSIRT, or help-desk teams in this step if necessary.

3 OC
  1. Add the Resource Centre to the GOCDB and flag it as "Candidate".
  2. Approve in GOCDB 'Resource Centre Operations Manager' role. 
  3. Approve in GOCDB 'Resource Centre Security Officer' role. 
  4. If the RC is joining the Federated Cloud with some services, open a GGUS ticket to EGI Operations asking to flag the RC with the scope tag "FedCloud".
4 RC
  1. Complete any missing information for the Resource Centre's entry in the GOCDB
    • It includes services that are to be integrated into the infrastructure:
      • The production services must have the flags "Production" and "Monitored";
      • If the RC is joining the Federated Cloud, flag the relevant services with the scope tag "FedCloud";
        • This can be done after the FedCloud tag is assigned to the site;
      • the required information for registering a service is shown in HOWTO20.
  2. Notify the Operations Centre that the Resource Centre information update is concluded.
  3. Note: If the external RC is considering buying host certs, please make sure they source them from an EGI recognised authority. The local national CA (usually for free or at flat rate) is likely the best source of their SSL certificates.
5 OC

Check in GOC DB that the Resource Centre's information is correct.

  • Resource Centre (site) roles and any other additional information.
  • Check that contacts receive email (if they are mailing lists, check that outside EGI members are allowed to post there). Site administrator MUST reply to the test email.
  • Check that the required services for a Resource Centre are properly registered.
    • Note that for Resource Centres sending accounting data from grid jobs, a “glite-APEL” node must be registered in GOCDB. For Resource Centres sending accounting data from Cloud systems, an “eu.egi.cloud.accounting” node must be registered in GOCDB. The entry must include the correct DN to allow the message broker and accounting repository’s Access Control Lists to be updated automatically. Resource Centres can start publishing usage records after about four hours.
  • Check domain names and forward and reverse DNS.
6 RC

ONLY for Cloud RCs - Security survey: follow Security Resource Centre Certification Procedure
7 OC

Any other Operations Centre-specific requirements (e.g. join a certain VO and/or mailing list, etc.)

8 OC

If all previous actions have been completed with success, notify the Resource Centre Operations Manager that the Registration is completed, and contact the Resource Infrastructure Operations Manager to notify that a new candidate Resource Centre exists and is ready to be certified.
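
The flag requirements of step 4 (production services flagged "Production" and "Monitored", Federated Cloud services carrying the "FedCloud" scope tag) lend themselves to a simple pre-check before notifying the Operations Centre. The sketch below is illustrative only: the dictionary keys `production`, `monitored` and `scopes` are hypothetical and do not reflect the actual GOCDB data model or API.

```python
def missing_settings(service: dict, fedcloud: bool = False) -> list:
    """List the settings a production service entry still lacks.

    `service` is a hypothetical dictionary describing one service,
    not a real GOCDB record.
    """
    missing = []
    # Step 4: production services must be flagged Production and Monitored.
    for flag in ("production", "monitored"):
        if not service.get(flag, False):
            missing.append(flag)
    # Step 4: services joining the Federated Cloud need the FedCloud scope tag.
    if fedcloud and "FedCloud" not in service.get("scopes", []):
        missing.append("scope:FedCloud")
    return missing


if __name__ == "__main__":
    # Hypothetical service entry with the Monitored flag and scope tag missing.
    print(missing_settings({"production": True}, fedcloud=True))
```

An empty list means that, as far as these flags are concerned, the entry is ready for the Operations Centre check in step 5.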

After the successful completion of all these steps, the registration phase is completed and the Resource Centre is ready for the start of the certification phase.
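
The forward and reverse DNS check in step 5 can be partly automated. The following is a minimal sketch using Python's standard `socket` module; it covers IPv4 only, and the function name and the injectable `forward`/`reverse` parameters are choices made for this example, not part of any EGI tool.

```python
import socket


def dns_is_consistent(hostname, forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Check that `hostname` resolves to an address that maps back to it.

    `forward` and `reverse` default to the standard resolver calls and
    can be replaced, for example in tests, by functions with the same shape.
    """
    try:
        address = forward(hostname)
        reverse_name, aliases, _addresses = reverse(address)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        return False
    names = {reverse_name.lower()} | {alias.lower() for alias in aliases}
    return hostname.lower() in names
```

Calling `dns_is_consistent("ce.example.org")` (a placeholder host name) performs real lookups; a `False` result indicates either a failed lookup or a reverse record that does not match the registered name.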

Resource Centre certification

Requirements

  1. The Resource Centre Certification procedure is only applicable to Resource Centres in the "Candidate" or "Suspended" state in GOC DB.
  2. A Resource Centre can successfully pass certification only if the conditions required by the Resource Centre OLA are met.

Steps

The following is a detailed description of the steps required for the transition from the "Candidate"/"Suspended" to the "Certified" state of the Resource Centre.

  • Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
  • Actions tagged RP are the responsibility of the Resource Infrastructure Operations Manager.
  • Actions tagged OC are the responsibility of the Operations Centre


# Responsible Action
0 RP

The Resource Infrastructure Operations Manager contacts the Resource Centre Operations Manager to request the subscription of the Resource Centre OLA.

1 RC

The Resource Centre Operations Manager notifies the Resource Infrastructure Operations Manager that the Resource Centre OLA is accepted (if the Resource Centre has not already endorsed it, for example in the case of a suspended Resource Centre), and that the Resource Centre is ready to start certification.

2 RP

The Resource Infrastructure Operations Manager contacts the Operations Centre asking to start the certification process.

3 OC

If the Resource Centre is in the "Candidate" or "Suspended" state, then flag the Resource Centre as "Uncertified".

  • If it was in the "Suspended" state then check that the reason for suspension has been cleared.


4 OC

Add Resource Centre contact information to any regional mailing list and provide access to regional tools as required.

5 OC

Check:

  1. GOC DB: all services are registered in GOCDB according to the requirements of the Resource Centre OLA, that these are published, and ALSO that the services published in the GOCDB are valid.
  2. Information system: Check that the GIIS (gLite, Globus, cloud: BDII) is working, and publishing coherent values
  3. Monitoring and troubleshooting should be possible:
    • Glite, ARC, cloud: the OPS VO (monitoring) and the DTEAM VO (troubleshooting) are configured and supported by the Resource Centre.
    • UNICORE: all certificates used by monitoring users must have the 'user' role assigned using any attribute source supported by the Resource Centre and must be mapped to a local account with the same permissions as assigned to an ordinary infrastructure user.
    • Globus:
    1. The OPS VO and DTEAM VO are configured and supported for those resource centres supporting the VO concept (via the lcas lcmaps adaptors for Globus)
    2. Robot certificates from the RP should be entered in the gridmap files to enable these checks. Please contact your RP for more information about this.
    • QCG:
    1. All certificates used by monitoring users must be mapped to a local account with the same permissions as assigned to an ordinary infrastructure user.
    2. The certificate of the regional instance of QCG-Broker must be authorized.
  4. Accounting
  5. OPS, Dteam are configured and supported. Regional VOs are configured and supported as needed.
  6. Site is integrated in any regional tool as needed (for example, the regional accounting infrastructure if present).
6 OC

Check that the registered services are fully functional either by performing manual tests or by checking on the dedicated nagios server.

  • Note that uncertified Resource Centres, in order to be correctly monitored, have to enter the proper service information in GOC-DB: please follow HOWTO21.
  • The uncertified RCs report is displayed on the ARGO development instance.

Contact the Resource Centre admins if there are problems, and ensure that they fix them. Include the ROD, CSIRT and help-desk teams if necessary. Iterate this step with the Resource Centre admins until tests pass successfully. 

Details for manual tests can be found at Manual tests.

7 RC

ONLY for HTC RCs - Security monitoring: follow Security Resource Centre Certification Procedure
8 OC

If all preliminary tests are passed for 3 consecutive calendar days, declare an initial maintenance downtime and switch the Resource Centre status to 'Certified'.

  • This ensures that the Resource Centre will appear in NAGIOS.
  • The target 'Infrastructure' value should be set to 'Production'.
9 OC

The downtime should not be closed until:

  • the Resource Centre appears in all operational tools
    • ARGO/SAM NAGIOS: server 1 and server 2
    • GGUS - the Resource Centre appears in the "Notified Site" field (GGUS queries GOC-DB once per day, during the night) - GGUS search
  • all Nagios tests are passed
  • accounting data is properly published


If there are problems with a specific tool, open GGUS tickets to the relevant Support Units.

Wait at least two days after the switch to the 'Certified' status before opening a ticket, because the propagation of the new status to the operational tools or the publication of accounting data may take one or two days.

10 OC

Notify the Resource Centre Operations Manager that the Resource Centre is certified.

11 OC

The Operations Centre can broadcast that a new Resource Centre is now part of the EGI infrastructure.

This step is OPTIONAL.
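
The condition in step 8 (all preliminary tests passed for 3 consecutive calendar days) can be expressed as a small check. This is a sketch under stated assumptions: the mapping of dates to a single pass/fail value is a hypothetical summary of the test results, not a format produced by Nagios or ARGO.

```python
from datetime import date, timedelta


def ready_for_certification(daily_results, as_of, required_days=3):
    """Return True if all tests passed on each of the last `required_days` days.

    `daily_results` maps a calendar date to True when every preliminary
    test passed on that day (a hypothetical summary format).
    A day with no recorded result counts as not passed.
    """
    return all(
        daily_results.get(as_of - timedelta(days=offset), False)
        for offset in range(required_days)
    )


if __name__ == "__main__":
    # Hypothetical results for three consecutive days ending on `as_of`.
    results = {
        date(2020, 3, 4): True,
        date(2020, 3, 5): True,
        date(2020, 3, 6): True,
    }
    print(ready_for_certification(results, as_of=date(2020, 3, 6)))
```

Treating a missing day as a failure is a deliberate, conservative choice for this sketch: a gap in the results breaks the run of consecutive days.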

After the successful completion of these steps, the Resource Centre is considered "Certified".
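
The status changes described in this procedure can be summarised as a small lookup table. The sketch below lists only the transitions stated in the text (certification steps 3 and 8, and the Timelines section); the status flow diagram may allow further transitions that are not captured here. The state label "Closed" is an assumption based on the wording of the Timelines section.

```python
# Transitions explicitly described in this procedure. This is not a
# complete model of the status flow diagram.
DESCRIBED_TRANSITIONS = {
    ("Candidate", "Uncertified"),  # certification step 3
    ("Suspended", "Uncertified"),  # certification step 3 (re-certification)
    ("Uncertified", "Certified"),  # certification step 8
    ("Candidate", "Closed"),       # Timelines: limit exceeded (assumed label)
    ("Suspended", "Closed"),       # Timelines: limit exceeded (assumed label)
}


def transition_is_described(current: str, target: str) -> bool:
    """Return True if this procedure describes the given status change."""
    return (current, target) in DESCRIBED_TRANSITIONS


if __name__ == "__main__":
    # A Candidate Resource Centre is not certified directly in this procedure.
    print(transition_is_described("Candidate", "Certified"))
```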

Revision History

Version Authors Date Comments

Alessandro Paolini 2019-03-25 updated the link to the uncertified RCs report page: http://egi.ui.argo.grnet.gr/egi/CriticalUncert

Alessandro Paolini 2018-09-05 updated the link to the uncertified RCs report page
Vincenzo Spinoso 2018-04-05 Removed outdated link (and corresponding sentence) from Step 3 of the Resource Centre Certification procedure: "If the suspension cause is a security issue, then the EGI CSIRT needs to be contacted to verify that all requested repair operations were successfully applied by the Resource Centre Administrators to fix the issue that caused suspension. See instructions on how to monitor uncertified RCs. "

Alessandro Paolini 2017-09-20 Prerequisites and responsibilities section: the Resource Centre Operations Manager and the Operations Centre are requested to review on a yearly basis the information registered on GOC-DB

Alessandro Paolini 2017-03-13 RC certification part:

step 6: monitoring of the uncertified sites was enabled; mentioned the HOWTO for adding the proper information on GOC-DB


Alessandro Paolini 2017-02-24 RC registration part:

in step 3 and 4, added a substep for flagging the FedCloud sites and services.


Alessandro Paolini 2016-12-14 RC certification part:

in step 9, removed the reference to cloudmon.egi.eu because it was dismissed.


Alessandro Paolini 2016-10-28 RC registration part:

in step 4, specified that the production services must have the flags "production" and "monitored"; added also a link to the manual HOWTO20.


Alessandro Paolini 2016-10-26 RC certification part:
  • modified step 6: it is not yet possible to monitor the uncertified RCs with the new centralised monitoring framework
  • modified step 9: updated the ARGO/SAM servers link, deleted the reference to MyEGI

Alessandro Paolini 2016-06-08 Changed contact group -> Operations

Alessandro Paolini 2016-03-22 RC Registration steps: added step 6. RC Certification steps: modified step 7.

Malgorzata 18.03 RC Certification steps: Step 5 added part concerning QCG

M. Krakowian 2014-08-19 Change contact group -> Operations support

M Krakowian 2014-10-01 Merge Proc18 (CLoud site certification) and Proc09

M Krakowian 2014-12-05 Step 7 add reference to Security Resource Centre Certification Procedure