The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations@egi.eu.

Difference between revisions of "PROC19"

From EGIWiki
Revision as of 12:55, 14 October 2014





This page is under construction.



Title: Integration of new cloud stack and grid middleware in EGI Production Infrastructure
Document link: https://wiki.egi.eu/wiki/PROC19
Last modified:
Policy Group Acronym: OMB
Policy Group Name: Operations Management Board
Contact Group: operations-support@mailman.egi.eu
Document Status: DRAFT
Approved Date:
Procedure Statement: A procedure for the steps to integrate new cloud stack (Cloud platform) or grid middleware (Grid Platform) in EGI Production Infrastructure.
Owner: Owner of procedure



Overview

To ensure the production quality of the EGI Infrastructure, every cloud stack (Cloud Platform) or middleware (Grid Platform) supported by production Resource Centres needs to fulfil certain requirements. The goal of this procedure is to ensure that each newly integrated stack or middleware complies with those requirements.

Definitions

  • cloud stack: software for creating, managing, and deploying infrastructure cloud services.
  • grid middleware: software which allows users to execute jobs in a grid infrastructure.


Please refer to the EGI Glossary for the definitions of the terms used in this procedure.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Entities involved in the procedure

  • Technology Product team leader (TPL): person representing and leading Technology Product team
  • EGI Operations (EGIOps)
  • Operations Centre (OC)
  • Resource Centre (RC)
  • Operations Management Board: EGI operations policy board

Steps

Sending a request

The request can be sent by:

  1. Operations Centre
  2. EGI Operations
  3. Technology Provider team leader

A Resource Centre can also request the integration of a new cloud stack or grid middleware. Such a request should first be approved by the Operations Centre the Resource Centre belongs to. In that case the OC is responsible for creating the ticket on behalf of the RC.


Step Action on Action
1 Applicant
Opens a GGUS ticket to Operations to start the process.
Subject: Request for integration of XXX to EGI Production Infrastructure

Dear Operations,

We would like to request setting XXX test as an operations test

Prerequisite data:
* name of nagios probe:
* name of service on which the test runs: 
* link to documentation page:
* motivation (which part of the infrastructure will be improved by making XXX test 
 or description of users' problems which will be avoided in future - provide list 
 of GGUS tickets if possible)

Best Regards
XXX
2 EGIOps Operations contacts the OMB to request the approval of the request.
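
The ticket in step 1 can be prepared with a small helper before it is pasted into GGUS. The following is a minimal sketch, not part of the procedure: the class `IntegrationRequest` and the functions `missing_fields`, `ticket_subject`, and `prerequisite_block` are hypothetical names chosen for illustration, and nothing here contacts GGUS. It only builds the subject line and the "Prerequisite data" block from the template above and reports which fields are still empty.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class IntegrationRequest:
    """Hypothetical container for the data requested by the ticket template."""
    product: str             # the "XXX" placeholder in the template
    nagios_probe: str        # name of nagios probe
    service_name: str        # name of service on which the test runs
    documentation_link: str  # link to documentation page
    motivation: str          # motivation for the request


def missing_fields(request: IntegrationRequest) -> List[str]:
    """Return the names of fields that are empty or contain only whitespace."""
    return [f.name for f in fields(request) if not getattr(request, f.name).strip()]


def ticket_subject(request: IntegrationRequest) -> str:
    """Build the subject line given in the template."""
    return f"Request for integration of {request.product} to EGI Production Infrastructure"


def prerequisite_block(request: IntegrationRequest) -> str:
    """Build the 'Prerequisite data' part of the ticket body."""
    lines = [
        "Prerequisite data:",
        f"* name of nagios probe: {request.nagios_probe}",
        f"* name of service on which the test runs: {request.service_name}",
        f"* link to documentation page: {request.documentation_link}",
        f"* motivation: {request.motivation}",
    ]
    return "\n".join(lines)
```

An applicant could call `missing_fields` first and open the ticket only when it returns an empty list, so that Operations does not have to ask for missing data.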

Request Validation

Step Substep Action on Action
1 1 RC

As soon as the problem is detected, notify your NGI operations centre by opening a GGUS ticket.

Please address the ticket to your Operations Centre support unit, which is responsible for validating the request. In the GGUS ticket you must mention:

  1. the starting and ending time of the problem (including day and hour in UTC)
  2. the Site, NGI/federation of NGIs affected by the problem
  3. the VO affected by the problem (must be the OPS VO)
  4. a description of the problem
2 TPL
The NGI operations centre validates the request.
3 OC

If the request is deemed valid, a GGUS ticket is sent to the Service Level Management (SLM) Support Unit.

The SLM support team will take care of discussing all requests received with the SAM team.

4 SLM SU
  1. Validate the reported problems.
  2. Discuss the reported problems with the SAM Support Unit if needed.
  3. Notify the SAM SU about the requests received: a new parent ticket is submitted to SAM, with the validated requests as its child tickets.
5 SLM SU
  1. The SAM Support Unit is responsible for checking the requests and for regenerating the results. For the accepted requests, all Nagios metric results for any site and service are set to unknown status from the beginning of the hour reported in the starting time to one hour after the ending time. This covers results that could have arrived late. The availability and reliability of other sites will not be affected, as unknown periods are not considered in the computation.
  2. New monthly availability statistics will be recomputed for that particular period, Site, NGI/federation of NGIs.
  3. A new report will be made available 10 days after the first publication of the report.
  4. After publication of the new report, all child GGUS tickets will be closed.
6 SLM SU Close the parent ticket.
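
The time window described in step 5 can be worked out mechanically from the starting and ending times given in the ticket. The sketch below is an illustration only and is not taken from SAM: the function name `unknown_window` is hypothetical, and it reads the rule literally, rounding the start down to the beginning of its hour and adding exactly one hour to the reported end. Times are expected in UTC, as the ticket requires.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def unknown_window(start: datetime, end: datetime) -> Tuple[datetime, datetime]:
    """Return the (start, end) of the period set to unknown status.

    Assumption: the window runs from the beginning of the hour of the
    reported start time to one hour after the reported end time.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware (UTC expected)")
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)
    if end_utc < start_utc:
        raise ValueError("end must not be earlier than start")
    # Round the start down to the beginning of its hour.
    window_start = start_utc.replace(minute=0, second=0, microsecond=0)
    # Extend the end by one hour to cover results that arrive late.
    window_end = end_utc + timedelta(hours=1)
    return window_start, window_end
```

For a problem reported from 12:55 to 14:10 UTC on a given day, this reading gives a window from 12:00 to 15:10 UTC on that day.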




Integration steps

Integration covers the following areas:

  1. Management - integration with Grid Configuration Database [GOCDB] (http://goc.egi.eu/)
    • e.g. creation of a service type
  2. Monitoring - integration with Service Availability Monitoring [SAM]
    • development of monitoring probes
    • integration of probes into SAM release
    • integration of probes with Ava/Rel profile
  3. Accounting - integration with EGI Accounting Infrastructure
    • enabling all the accounting data to be collected in one place for a unified view
  4. Support - integration with EGI Helpdesk [GGUS] (http://ggus.eu)
    • creation of support unit and using it to provide support
  5. Dashboard - integration with Operations Portal (http://operations-portal.egi.eu/)
    • integration of probes into Operations Tests pool
  6. Documentation - integration with existing operations procedures
    • integration with Resource Centre OLA
    • integration with Resource Centre registration and certification procedure
      • creation of instructions on how to manually test the middleware
  7. Information System - integration with EGI Information system
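
The seven areas above can be tracked as a simple checklist while an integration is in progress. This is a minimal sketch under that assumption; the constant `INTEGRATION_AREAS` and the function `pending_areas` are hypothetical names, and the procedure itself does not prescribe any tooling.

```python
from typing import Iterable, List

# The seven integration areas, in the order listed in this procedure.
INTEGRATION_AREAS = (
    "Management",
    "Monitoring",
    "Accounting",
    "Support",
    "Dashboard",
    "Documentation",
    "Information System",
)


def pending_areas(completed: Iterable[str]) -> List[str]:
    """Return the areas not yet completed, in procedure order."""
    done = set(completed)
    unknown = done - set(INTEGRATION_AREAS)
    if unknown:
        # Reject names that are not one of the seven areas (e.g. a typo).
        raise ValueError(f"unknown integration areas: {sorted(unknown)}")
    return [area for area in INTEGRATION_AREAS if area not in done]
```

An empty list from `pending_areas` means every area has been covered.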
# Responsible Action
0 RC

Contact your Resource Infrastructure Operations Manager (contact information is available at EGI web site).

  • Provide the required information according to the template available in the Required information page.
1 RP

Accept or reject registration request and communicate this result back to applicant.

  • If the Resource Centre is accepted, notify the relevant Operations Centre, handle the Resource Centre information received, and put the Operations Centre in contact with the Resource Centre Operations Manager.
2 OC
  1. Forward all documentation:
  2. Clarify any doubts or questions.

Include the Operations Centre ROD, CSIRT, or help-desk teams in the step if necessary.

3 OC
  1. Add the Resource Centre to the GOCDB and flag it as "Candidate".
  2. Notify the Resource Centre Operations Manager that he/she should join operations.
    • In the GOCDB he/she should request the 'Resource Centre Operations Manager' role. Approve it when done.
  3. Notify the Resource Centre Operations Manager that the person responsible for security should join operations.
    • In the GOCDB he/she should request the 'Resource Centre Security Officer' role. Approve it when done.
4 RC
  1. Complete any missing information for the Resource Centre's entry in the GOCDB
    • This includes the services that are to be integrated into the infrastructure. See the instructions.
  2. Notify the Operations Centre that the Resource Centre information update is concluded.
  3. Note: If the external RC is considering buying host certs, please make sure they source them from an EGI recognised authority. The local national CA (usually for free or at flat rate) is likely the best source of their SSL certificates.
5 OC

Check in the GOCDB that the Resource Centre's information is correct.

  • Resource Centre (site) roles and any other additional information.
  • Check that contacts receive email (if they are mailing lists, check that outside EGI members are allowed to post there). Site administrator MUST reply to the test email.
  • Check that the required services for a Resource Centre are properly registered.
  • Check domain names and forward and reverse DNS.
6 OC

Any other Operations Centre-specific requirements (e.g. join a certain VO and/or mailing list, etc.)

7 OC

If all previous actions have been completed with success, notify the Resource Centre Operations Manager that the Registration is completed, and contact the Resource Infrastructure Operations Manager to notify that a new candidate Resource Centre exists and is ready to be certified.

After the successful completion of all these steps, the registration phase is completed and the Resource Centre is ready for the start of the certification phase.
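
The forward and reverse DNS check in step 5 can be scripted. The sketch below is one possible way to do it, not a prescribed tool: the function name `check_forward_reverse` is hypothetical, it covers IPv4 only, and it relies on the standard-library calls `socket.gethostbyname` and `socket.gethostbyaddr`. The resolver functions are passed as parameters so that the check can also be exercised without network access.

```python
import socket
from typing import Callable, Optional, Tuple


def _normalise(name: str) -> str:
    """Lower-case a host name and drop a trailing dot for comparison."""
    return name.lower().rstrip(".")


def check_forward_reverse(
    hostname: str,
    resolve: Callable = socket.gethostbyname,
    reverse: Callable = socket.gethostbyaddr,
) -> Tuple[Optional[str], bool]:
    """Return (address, consistent) for a host name.

    'consistent' is True when the reverse lookup of the resolved address
    returns the original host name, either as primary name or as an alias.
    """
    try:
        address = resolve(hostname)  # forward lookup (IPv4)
    except OSError:
        return None, False
    try:
        primary, aliases, _addresses = reverse(address)  # reverse lookup
    except OSError:
        return address, False
    names = {_normalise(primary)} | {_normalise(alias) for alias in aliases}
    return address, _normalise(hostname) in names
```

Calling `check_forward_reverse` with only a host name uses the system resolver; a result whose second element is `False` indicates a host whose DNS entries should be corrected before certification starts.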

Revision History

Version Authors Date Comments