PROC19
Title | Introducing new stacks and middleware in EGI Production Infrastructure |
Document link | https://wiki.egi.eu/wiki/PROC19 |
Last modified | |
Policy Group Acronym | OMB |
Policy Group Name | Operations Management Board |
Contact Group | operations-support@mailman.egi.eu |
Document Status | |
Approved Date | |
Procedure Statement | The steps required to introduce a new stack (Cloud platform) or middleware (HTC Platform) into the EGI Production Infrastructure. |
Owner | Owner of procedure |
Under construction
Overview
To assure the production quality of the EGI Infrastructure, every stack (Cloud platform) or middleware (HTC Platform) supported by Production Resource Centres needs to fulfil certain requirements. The goal of this procedure is to ensure that every such stack or middleware in the EGI Infrastructure is fully supported by the operations tools.
Definitions
- cloud stack: software for creating, managing, and deploying infrastructure cloud services.
- grid middleware: software which allows the users to execute jobs in grid infrastructure.
Please refer to the EGI Glossary for the definitions of the terms used in this procedure.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Entities involved in the procedure
Please see PROC09#Entities_involved_in_the_procedure
Prerequisites and responsibilities
Please see PROC09#Prerequisites_and_responsibilities
Resource Center status Workflow
Please see PROC09#Resource_Center_status_Workflow
Resource Centre registration
Requirements
Please see PROC09#Requirements
Steps
The following steps are only applicable if the Resource Centre is not already registered in GOCDB.
- Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
- Actions tagged RP are the responsibility of the Resource Infrastructure Operations Manager.
- Actions tagged OC are the responsibility of the Operations Centre.
# | Responsible | Action |
---|---|---|
0 | RC |
Contact your Resource Infrastructure Operations Manager (contact information is available at EGI web site).
|
1 | RP |
Accept or reject registration request and communicate this result back to applicant.
|
2 | OC |
Include the Operations Centre ROD, CSIRT, or help-desk teams in the step if necessary. |
3 | OC |
|
4 | RC |
|
5 | OC |
Check in GOC DB that the Resource Centre's information is correct.
|
6 | OC |
Any other Operations Centre-specific requirements (e.g. join a certain VO and/or mailing list, etc.) |
7 | OC |
If all previous actions have been completed with success, notify the Resource Centre Operations Manager that the Registration is completed, and contact the Resource Infrastructure Operations Manager to notify that a new candidate Resource Centre exists and is ready to be certified. |
After the successful completion of all these steps, the registration phase is completed and the Resource Centre is ready for the start of the certification phase.
Resource Centre certification
Requirements
- The Resource Centre Certification procedure is only applicable to Resource Centres in the "Candidate" or "Suspended" state in GOC DB.
- A Resource Centre can successfully pass certification only if the conditions required by the Resource Centre OLA are met.
Steps
The following is a detailed description of the steps required for the transition from the "Candidate"/"Suspended" to the "Certified" state of the Resource Centre.
- Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
- Actions tagged RP are the responsibility of the Resource Infrastructure Operations Manager.
- Actions tagged OC are the responsibility of the Operations Centre.
- Actions tagged CSIRT are the responsibility of the Computer Security Incident Response Team.
# | Responsible | Action |
---|---|---|
0 | RP |
The Resource Infrastructure Operations Manager contacts the Resource Centre Operations Manager to request the subscription of the Resource Centre OLA. |
1 | RC |
The Resource Centre Operations Manager notifies the Resource Infrastructure Operations Manager that the Resource Centre OLA is accepted (if the Resource Centre has not already endorsed it, as may be the case for a suspended Resource Centre), and that the Resource Centre is ready to start certification. |
2 | RP |
The Resource Infrastructure Operations Manager contacts the Operations Centre asking to start the certification process. |
3 | OC |
If the Resource Centre is in the "Candidate" or "Suspended" state, then flag the Resource Centre as "Uncertified".
|
4 | OC |
Check:
|
5 | RC |
Fill in the security survey (placeholder, survey not available yet) and forward the required information to the CSIRT.
|
6 | CSIRT |
Checks that the Resource Centre passes the basic security assessment tests.
This step also applies to certified Resource Centres which introduce cloud resources for the first time. |
7 | OC |
If all preliminary tests are passed for 3 consecutive calendar days, declare an initial maintenance downtime and switch the Resource Centre status to 'Certified'.
|
8 | OC |
The downtime should not be closed until the Resource Centre
Wait at least two days after the switch to the 'Certified' status before opening the ticket, because the propagation of the new status to the operational tools or the publication of accounting data may take one or two days. |
9 | OC | Notify the Resource Centre Operations Manager that the Resource Centre is certified. |
10 | OC |
The Operations Centre can broadcast that a new Resource Centre is now part of the EGI infrastructure. This step is OPTIONAL. |
After the successful completion of these steps, the Resource Centre is considered as "Certified".
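The status transitions in the certification steps can be illustrated with a small sketch. This is not part of GOC DB or any EGI tool. The names `Status`, `start_certification`, `has_passing_streak` and `finish_certification` are hypothetical, and the test results are assumed to be available as a mapping from calendar day to pass/fail. Only steps 3 and 7 are modelled; the OLA acceptance, the security assessment and the maintenance downtime are left out.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Mapping


class Status(Enum):
    """Resource Centre states named in this procedure."""
    CANDIDATE = "Candidate"
    SUSPENDED = "Suspended"
    UNCERTIFIED = "Uncertified"
    CERTIFIED = "Certified"


# Step 7: tests must pass for 3 consecutive calendar days.
REQUIRED_PASSING_DAYS = 3


def start_certification(status: Status) -> Status:
    """Step 3: a Candidate or Suspended Resource Centre becomes Uncertified."""
    if status not in (Status.CANDIDATE, Status.SUSPENDED):
        raise ValueError(f"certification does not apply to state {status.value}")
    return Status.UNCERTIFIED


def has_passing_streak(results: Mapping[date, bool], today: date) -> bool:
    """True if tests passed on each of the last REQUIRED_PASSING_DAYS
    calendar days, counting back from and including `today`.
    A day with no recorded result counts as a failure (an assumption)."""
    return all(
        results.get(today - timedelta(days=offset), False)
        for offset in range(REQUIRED_PASSING_DAYS)
    )


def finish_certification(
    status: Status, results: Mapping[date, bool], today: date
) -> Status:
    """Step 7: switch to Certified once the passing streak is reached;
    otherwise the status is returned unchanged."""
    if status is Status.UNCERTIFIED and has_passing_streak(results, today):
        return Status.CERTIFIED
    return status
```

Treating a missing day as a failure is a conservative choice made for this sketch; the procedure itself does not say how gaps in the test results are handled.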
Revision History
Version | Authors | Date | Comments |
---|---|---|---|