The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations @ egi.eu.

Difference between revisions of "PROC19"

{{Template:Op menubar}} {{Template:Doc_menubar}} {{TOC_right}}
 
[[Category:Deprecated]]
{| style="border:1px solid black; background-color:lightgrey; color: black; padding:5px; font-size:140%; width: 90%; margin: auto;"
| style="padding-right: 15px; padding-left: 15px;" |  
|[[File:Alert.png]] This page is '''Deprecated'''; the content has been moved to https://confluence.egi.eu/display/EGIPP/PROC19+Integration+of+new+cloud+management+framework+or+middleware+stack+in+the+EGI+Infrastructure
|}
{{Ops_procedures
|Doc_title = Introducing new stacks and middleware in EGI Production Infrastructure
|Doc_link = [[PROC09|https://wiki.egi.eu/wiki/PROC19]]
|Version =  
|Policy_acronym = OMB
|Policy_name = Operations Management Board
|Contact_group = operations-support@mailman.egi.eu
|Doc_status =
|Approval_date =
|Procedure_statement = A procedure describing the steps to introduce a new stack (cloud platform) or middleware (HTC platform) into the EGI Production Infrastructure.
}}
 
<br>
 
<u>'''Under construction'''</u>
 
= Overview  =
 
To assure production quality of EGI infrastructure every new stac
 
= Definitions  =
 
*'''Cloud Resource Centre''' refers to a Resource Centre, as defined in the "[https://documents.egi.eu/document/31 Resource Centre OLA]", that additionally plans to provide production-quality cloud resources.
 
:''In this document, the term "'''site'''" is '''deprecated''', and '''Resource Centre''' has been used in its place.''
 
Please refer to the [[Glossary|EGI Glossary]] for the definitions of the terms used in this procedure.<br>
 
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
 
= Entities involved in the procedure  =
 
Please see [[PROC09#Entities_involved_in_the_procedure]]
 
= Prerequisites and responsibilities  =
 
Please see [[PROC09#Prerequisites_and_responsibilities]]
 
<br>
 
= Resource Center status Workflow  =
 
Please see [[PROC09#Resource_Center_status_Workflow]]
 
== Resource Centre registration  ==
 
=== Requirements  ===
 
Please see [[PROC09#Requirements]]
 
=== Steps  ===
 
The following steps are only applicable if '''the Resource Centre is not already registered in GOCDB'''. <br>
 
*Actions tagged '''RC''' are the responsibility of the Resource Centre Operations Manager.
*Actions tagged '''RP''' are the responsibility of the Resource Infrastructure Operations Manager.
*Actions tagged '''OC''' are the responsibility of the Operations Centre.
 
{| class="wikitable"
|-
! #
! Responsible
! Action
|- valign="top"
| 0
| RC
|
'''Contact your Resource Infrastructure Operations Manager''' (contact information is available at [http://www.egi.eu/community/resource-providers/ EGI web site]).
 
*Provide the required information according to the template available in the [[HOWTO01|Required information]] page.
 
|- valign="top"
| 1
| RP
|
'''Accept or reject the registration request''' and communicate the result back to the applicant.
 
*If the Resource Centre is accepted, notify the relevant Operations Centre, handle the Resource Centre information received, and put the Operations Centre in contact with the Resource Centre Operations Manager.
 
|- valign="top"
| 2
| OC
|
#'''Forward all documentation''':
#*[[HOWTO02|documents that must be read and accepted]]
#*documentation on how to install and configure the Resource Centre services
#Clarify any doubts or questions.
 
Include the Operations Centre ROD, CSIRT, or help-desk teams in this step if necessary.
 
|- valign="top"
| 3
| OC
|
#'''Add the Resource Centre to the [https://goc.egi.eu/ GOCDB]''' and flag it as "Candidate".
#*Only Regional Management level users (D') can add a site to the NGI and can update the certification status of the site, see [[GOCDB/Input System User Documentation#Roles]]
#Notify the Resource Centre Operations Manager that he/she should [[EGI Operations Start Guide#Joining_operations|Join operations]]
#*In the [https://goc.egi.eu/ GOCDB] he/she should request the 'Resource Centre Operations Manager' role. Approve it when done.
#Notify the Resource Centre Operations Manager that the person responsible for security should [[EGI Operations Start Guide#Joining_operations|Join operations]]
#*In the [https://goc.egi.eu/ GOCDB] he/she should request the 'Resource Centre Security Officer' role. Approve it when done.
 
|- valign="top"
| 4
| RC
|
#'''Complete any missing information for the Resource Centre's entry in the GOCDB'''
#*This includes the services that are to be integrated into the infrastructure. See the [[Fedcloud-tf:WorkGroups:Scenario5#GOCDB|instructions]].
#Notify the Operations Centre that the Resource Centre information update is concluded.
#Note: If the external RC is considering buying host certs, please make sure they source them from an EGI recognised authority. [http://www.eugridpma.org/members/worldmap/ The local national CA] (usually for free or at flat rate) is likely the best source of their SSL certificates.
 
|- valign="top"
| 5
| OC
|
'''Check in [http://goc.egi.eu/ GOCDB]''' that the Resource Centre's information is correct.
 
*Resource Centre (site) roles and any other additional information.
*Check that contacts receive email (if they are mailing lists, check that outside EGI members are allowed to post there). Site administrator MUST reply to the test email.<br>
*Check that the required services for a Resource Centre are properly registered.<br>
*Check domain names and forward and reverse DNS.
 
|- valign="top"
| 6
| OC
|
'''Any other Operations Centre-specific requirements''' (e.g. join a certain VO and/or mailing list, etc.)
 
|- valign="top"
| 7
| OC
|
If all previous actions have been completed with success, notify the Resource Centre Operations Manager that the Registration is completed, and contact the Resource Infrastructure Operations Manager to notify that a new candidate Resource Centre exists and is ready to be certified.
 
|}
After the successful completion of all these steps, the registration phase is completed and the Resource Centre is ready for the start of the certification phase.
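The "forward and reverse DNS" check in step 5 above can be partly automated. The following is a minimal sketch, not part of the official procedure: it assumes one IPv4 address per hostname, and the function name <code>check_forward_reverse</code> and the hostname in the usage note are hypothetical.

```python
import socket


def check_forward_reverse(hostname, resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name, consistent) for one service hostname.

    `resolve` and `reverse` default to the standard-library resolvers and
    can be replaced with stubs, so the check can be exercised offline.
    """
    ip = resolve(hostname)            # forward lookup: name -> IPv4 address
    reverse_name = reverse(ip)[0]     # reverse lookup: address -> primary name
    # Compare case-insensitively and ignore a trailing root dot.
    consistent = (reverse_name.rstrip(".").lower()
                  == hostname.rstrip(".").lower())
    return ip, reverse_name, consistent
```

Calling <code>check_forward_reverse("cloud.example.org")</code> (a placeholder name) performs real lookups and raises <code>OSError</code> if either lookup fails; the third element of the result is <code>False</code> when the reverse record does not point back to the registered name.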
== Resource Centre certification  ==
=== Requirements  ===
#The Resource Centre certification procedure is applicable only to '''Resource Centres in the "Candidate" or "Suspended"''' status in GOCDB.<br>
#A Resource Centre can successfully pass certification only if the conditions required by the [https://documents.egi.eu/document/31 Resource Centre OLA] are met.
=== Steps  ===
The following is a detailed description of the steps required for the transition from the "Candidate"/"Suspended" to the "Certified" state of the Resource Centre.
*Actions tagged '''RC''' are the responsibility of the Resource Centre Operations Manager.
*Actions tagged '''RP''' are the responsibility of the Resource Infrastructure Operations Manager.
*Actions tagged '''OC''' are the responsibility of the Operations Centre.
*Actions tagged '''CSIRT''' are the responsibility of the Computer Security Incident Response Team
{| class="wikitable"
|-
! #
! Responsible
! Action
|- valign="top"
| 0
| RP
|
The Resource Infrastructure Operations Manager contacts the Resource Centre Operations Manager to request '''the subscription of the [https://documents.egi.eu/public/ShowDocument?docid=31 Resource Centre OLA]'''.
|- valign="top"
| 1
| RC
|
The Resource Centre Operations Manager notifies the Resource Infrastructure Operations Manager that '''the Resource Centre OLA is accepted''' (unless the Resource Centre has already endorsed it, for example in the case of a suspended Resource Centre) and that the Resource Centre is ready to start certification.
|- valign="top"
| 2
| RP
|
'''The Resource Infrastructure Operations Manager contacts the Operations Centre asking to start the certification process.'''
|- valign="top"
| 3
| OC
|
If the Resource Centre is in the "Candidate" or "Suspended" state, then '''flag the Resource Centre as "Uncertified".'''
*If it was in the "Suspended" state then check that the reason for suspension has been cleared.
**If the suspension cause is a security issue, then the EGI CSIRT needs to be contacted to verify that all requested repair operations were successfully applied by the Resource Centre Administrators to fix the issue that caused suspension. See [[SAM#Monitoring_uncertified_sites|instructions]] on how to monitor uncertified RCs.
|- valign="top"
| 4
| OC
|
'''Check:'''
#'''GOCDB:''' All services are registered in GOCDB according to the requirements of the [https://documents.egi.eu/document/31 Resource Centre OLA] and are published, and the services published in GOCDB are valid.
#'''Information system''': Check that the eu.egi.cloud.information.bdii is working and publishing coherent values. '''* Propose to remove it *'''
#*Proposal to eliminate this step since the information system is not production level, yet ''(Peter S. 12 February 2014) '' <br>
#'''Accounting'''
#*The host certificate DN should be sent to APEL-ADMINS@stfc.ac.uk
#'''Monitoring and troubleshooting''' should be possible:
#*the [[OPS vo|OPS VO]] (monitoring) and the [[Dteam vo|DTEAM VO]] (troubleshooting) are configured and supported by the Resource Centre.
#'''OPS, Dteam and regional VOs''' are configured and supported as needed.
#'''Site is integrated in any regional tool as needed''' (for example, the regional accounting infrastructure if present).<br>
|- valign="top"
| 5
| RC
|
Fill in the ''security survey'' (placeholder, survey not available yet) and forward the required information to the CSIRT.
*The purpose of the survey is to assess that the technology used to provide cloud services fulfills the EGI security policies and procedures.
*'''This is an additional step, not yet approved by OMB. Please, ignore until the survey is available''' (Peter S. 12 Feb 2014)
|- valign="top"
| 6
| CSIRT
|
Checks that '''the Resource Centre passes the basic security assessment tests'''<br>
*The security assessment is performed by the EGI CSIRT.
*The site administrator should fill in the [https://documents.egi.eu/secure/ShowDocument?docid=2114 EGI Federated Cloud Security - Questionnaire for sites deploying cloud technology]
This step also applies to certified Resource Centres which introduce cloud resources for the first time.
|- valign="top"
| 7
| OC
|
'''If all preliminary tests are passed for 3 consecutive calendar days''', declare an initial maintenance downtime and switch the Resource Centre status to 'Certified'.
*This ensures that the Resource Centre will appear in NAGIOS and GSTAT. '''* Propose to remove it *'''
**Proposal to eliminate this step since the information system is not production level, yet ''(Peter S. 12 February 2014) '' <br>
*The target 'Infrastructure' value should be set to 'Production'.
|- valign="top"
| 8
| OC
|
'''The downtime''' should not be closed until the Resource Centre
*appears in all operational tools<br>
**[https://cloudmon.egi.eu/nagios/ Cloud NAGIOS] (NAGIOS)
***And all Nagios tests are passed
**GGUS - the Resource Centre appears in the "Notified Site" field - [https://ggus.eu/ws/ticket_search.php GGUS search]
**[https://grid-monitoring.cern.ch/myegi/ MyEGI]
*accounting data is properly published.<br>
<br> If there are problems with a specific tool, open GGUS tickets to the relevant Support Units.
Wait at least two days after the switch to the 'Certified' status before opening a ticket, because the propagation of the new status to the operational tools or the publication of accounting data may take one or two days.<br>
|- valign="top"
| 9
| OC
| '''Notify the Resource Centre Operations Manager that the Resource Centre is certified'''<br> <br>
|- valign="top"
| 10
| OC
|
'''The Operations Centre can broadcast''' that a new Resource Centre is now part of the EGI infrastructure.
This step is OPTIONAL.
|}
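The timing rules in steps 7 and 8 above (tests passed for 3 consecutive calendar days, and a wait of at least two days before opening tickets) can be expressed as simple date arithmetic. This is only an illustrative sketch: the input format (a mapping from date to pass/fail) and both function names are assumptions, not part of any EGI tool.

```python
from datetime import date, timedelta


def passed_consecutive_days(daily_results, days=3):
    """True if the latest `days` calendar days all have a passing result.

    `daily_results` maps a datetime.date to True (all preliminary tests
    passed that day) or False; a day with no entry counts as not passed.
    """
    if not daily_results:
        return False
    last = max(daily_results)  # most recent day with a recorded result
    return all(daily_results.get(last - timedelta(days=i), False)
               for i in range(days))


def earliest_ticket_date(certified_on, wait_days=2):
    """Earliest date to open a GGUS ticket about a tool that still does
    not show the Resource Centre after the switch to 'Certified'."""
    return certified_on + timedelta(days=wait_days)
```

A gap in the recorded days is treated as a failure, which is the conservative reading of "consecutive calendar days".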
<u>After the successful completion of these steps, the Resource Centre is considered as "Certified".</u>
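The status changes used by this certification procedure can be summarised as a small transition table. The sketch below covers only the transitions described in the steps above ("Candidate" or "Suspended" to "Uncertified", then "Uncertified" to "Certified"); GOCDB itself may allow other transitions, and the names <code>ALLOWED</code>, <code>next_status</code> and <code>certify</code> are hypothetical.

```python
# Transitions described in this procedure only (assumption: GOCDB may
# permit further transitions that are out of scope here).
ALLOWED = {
    ("Candidate", "Uncertified"),
    ("Suspended", "Uncertified"),
    ("Uncertified", "Certified"),
}


def next_status(current, target):
    """Return `target` if the transition is described in this procedure."""
    if (current, target) not in ALLOWED:
        raise ValueError(f"transition {current!r} -> {target!r} is not "
                         "part of the certification procedure")
    return target


def certify(start):
    """Walk a Resource Centre from its starting status to 'Certified'."""
    status = next_status(start, "Uncertified")   # step 3
    return next_status(status, "Certified")      # step 7
```

For example, <code>certify("Suspended")</code> passes through "Uncertified" before reaching "Certified", while a direct jump from "Candidate" to "Certified" is rejected.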
= Revision History  =
{| class="wikitable"
|-
! Version
! Authors
! Date
! Comments
|-
| <br>
|
|
| <br>
|}
[[Category:Operations_Procedures]]

Latest revision as of 16:19, 23 August 2022