
Difference between revisions of "PROC02 Operations Centre creation"

From EGIWiki
(42 intermediate revisions by 5 users not shown)
Line 1: Line 1:
{{Template:Op menubar}}
{{Template:Op menubar}} {{Template:Doc_menubar}} {{TOC_right}}  
{{Template:Doc_menubar}}
[[Category:Operations Procedures]]
{{TOC_right}}  


{{Ops_procedures
{{Ops_procedures
|Doc_title = Operations Centre creation  
|Doc_title = Operations Centre creation  
|Doc_link = [[PROC02|https://wiki.egi.eu/wiki/PROC02]]
|Doc_link = [[PROC02|https://wiki.egi.eu/wiki/PROC02]]
|Version = 2.07 - 15.11.2011
|Version = 19 August 2014
|Policy_acronym = OMB/COD
|Policy_acronym = OMB
|Policy_name = Operations Management Board/Central Operator on Duty
|Policy_name = Operations Management Board
|Contact_group = manager-central-operator-on-duty@mailman.egi.eu  
|Contact_group = operations@egi.eu
|Doc_status = Approved
|Doc_status = Approved
|Approval_date = 26.10.2010  
|Approval_date = 26.10.2010  
|Procedure_statement = The purpose of this document is to clearly describe the actions and the relative steps to be undertaken for integrating an Operations Centre into the EGI operational structure.
|Procedure_statement = The purpose of this document is to clearly describe the actions and the relative steps to be undertaken for integrating an Operations Centre into the EGI operational structure.
}}
|Owner = Matthew Viljoen
}}  


= Overview  =
= Overview  =
Line 22: Line 20:
= Definitions  =
= Definitions  =


'''The Integration Process Coordinator''' (IPC) is the entity responsible for integrating a new Operations Centre within EGI. The IPC can be the EGEE parent ROC of the Operations Centre (if still operational), or COD.
'''The Integration Process Coordinator''' (IPC) is the entity responsible for integrating a new Operations Centre within EGI.  


Please refer to the [[Glossary|EGI Glossary]] for the definitions of the terms used in this procedure.
Please refer to the [[Glossary|EGI Glossary]] for the definitions of the terms used in this procedure.  


The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.  


= Validation  =
= Validation  =
Line 32: Line 30:
First step for validating an Operations Centre is to be politically approved as formed from an official partner of the EGI infrastructure.  
First step for validating an Operations Centre is to be politically approved as formed from an official partner of the EGI infrastructure.  


After this step, the Operations Centre needs to be technically validated. The Central Operator on Duty team - in charge of EGI oversight - is responsible for performing this validation. However, if the Operations Centre is a part of an ex EGEE-ROC structure which is still operating, and which is willing to perform the validation, then the ROC can perform the validation itself.
After this step, the Operations Centre needs to be technically validated. The Operations team is responsible for performing this validation. <br>


== Political Validation  ==
== Political Validation  ==


CASE 1. If an Operations Centre is already represented within the EGI Council and is ready to move from an EGEE ROC to an operational Operations Centre, we recommend that the Operations Centre political representative within the EGI Council notifies the EGI Chief Operations Officer that the respective Operations Centre is entering its validation cycle. At this point, technical validation can start.  
CASE 1. If the organization running an Operations Centre is already represented within the EGI Council, the respective council member notifies the Director (director@egi.eu) and the Service Delivery team of the EGI Foundation (operations@mailman.egi.eu) that the respective Operations Centre is entering its validation cycle. At this point, the technical validation can start.  


CASE 2. If an Operations Centre is not represented within the EGI Council, and it is willing to be represented there, the Operations Centre needs to submit a request for admission to the Council. After the Operations Centre has been accepted by the Council, CASE 1 applies.
CASE 2. If the organization running an Operations Centre is not represented within the EGI Council, director@egi.eu has to be contacted to formalize the collaboration (e.g. through an EGI membership or MoU). When the type of collaboration is agreed, CASE 1 applies.  
 
CASE 3. If a new Operations Centre is not represented within the EGI Council and is not interested in being part of it, but would still like to be a consumer of the EGI Global Services, then an MoU must be established with EGI. Once an MoU is in place technical validation can start.  


== Technical Validation  ==
== Technical Validation  ==


#The Operations Centre Operations manager(s) sends an email to the Chief Operations Officer (COO) that the Council was informed about the creation of a new Operations Centre, and (if applicable) also that the Council has approved it. The Operations Centre Operations manager(s) should also indicate the IPC responsible for the validation in the email.
#The EGI Foundation Director informs operations@mailman.egi.eu about the agreed partnership. The IPC is appointed by the Service Delivery Lead. The Operations Centre Operations manager(s) is notified about the IPC contact.  


#The Chief Operations Officer opens a [[GGUS]] ticket to the IPC to start the validation process.
#The EGI Service Delivery team opens a [[GGUS]] ticket to the IPC to start the validation process.


= Start of the integration  =
= Start of the integration  =


*The integration of a new Operations Centre starts when the COO opens an Operations Centre validation ticket to the IPC (via GGUS).
*The integration of a new Operations Centre starts when the Service Delivery Team opens an Operations Centre validation ticket to the IPC (via GGUS).


*Once the COO ticket is filed, the IPC can start the validation process. In order to trigger the actions described in this document the IPC creates a set of new child tickets that are assigned to the individual partners that are responsible for the various validation steps. Thereby, the integration process should be as transparent as possible to all parties involved. The required actions are described below.
*Once the ticket is filed, the IPC can start the validation process. In order to trigger the actions described in this document the IPC creates a set of new child tickets that are assigned to the individual partners that are responsible for the various validation steps. Thereby, the integration process should be as transparent as possible to all parties involved. The required actions are described below.


<br> An example/template for the Operations Centre creation ticket is provided here:  
<br> An example/template for the Operations Centre creation ticket is provided here:  
Line 77: Line 73:
Before opening an '''Operations Centre creation''' GGUS ticket, the Operations Centre should:  
Before opening an '''Operations Centre creation''' GGUS ticket, the Operations Centre should:  


#Make sure your NGI is able to fulfill RP OLA https://documents.egi.eu/secure/ShowDocument?docid=463  
#Make sure your operational organization is able to fulfil Resource Provider (RP) OLA https://documents.egi.eu/secure/ShowDocument?docid=463  
#Decide about the Operations Centre name. Name for European Operations Centres should start with "NGI_"  
#Decide about the Operations Centre name. In case of national European Operations Centres, the name should start with "NGI_"  
#Decide whether to use the Operations Centre's own help desk system or use GGUS directly. If the Operations Centre wants to set up their own system they need to provide an interface for interaction to GGUS with the local ticketing system and follow the recommendations available at https://ggus.eu/pages/ggus-docs/interfaces/docu_ggus_interfaces.php.  
#Decide whether to use the Operations Centre's own help desk system or use GGUS directly. If the Operations Centre wants to set up their own system they need to provide an interface for interaction to GGUS with the local ticketing system and follow the recommendations available at https://ggus.eu/pages/ggus-docs/interfaces/docu_ggus_interfaces.php.  
#Set the following contact points:  
#Set the following contact points:  
Line 87: Line 83:
##ROD team mailing list (including people responsible for monitoring and supporting the Operations Centre infrastructure)  
##ROD team mailing list (including people responsible for monitoring and supporting the Operations Centre infrastructure)  
##Mailing list for GGUS tickets IF GGUS is used directly for the helpdesk system  
##Mailing list for GGUS tickets IF GGUS is used directly for the helpdesk system  
#All certified Operations Centre sites need to be under Nagios monitoring. The Nagios monitoring infrastructure can be directly operated by the Operations Centre (see https://tomtools.cern.ch/confluence/display/SAMDOC/SAM-Nagios+Administrator+Guide and [[PROC05_Validation_of_a_Operations_Centre_Nagios]]), or alternative by a third party Operations Centre.  
#All certified Operations Centre sites need to be monitored by [https://wiki.egi.eu/wiki/ARGO ARGO monitoring infrastructure] operated centrally by EGI.
#Fill the FAQ document for the Operations Centre. The template is provided by the GGUS team: [[GGUS:FAQ Responsible Units|GGUS:FAQ_Responsible_Units]]  
#Fill the FAQ document for the Operations Centre. The template is provided by the GGUS team: [[GGUS:FAQ Responsible Units|GGUS:FAQ_Responsible_Units]]  
#Staff in the Operations Centre that should be granted a management role (manager, deputy manager, security officer) should first register a user account in the GOCDB. The user registration procedure is described in the GOCDB user documentation at https://twiki.cern.ch/twiki/bin/view/EMI/Glite-APELInstallation, section 4.1.1  
#Staff in the Operations Centre that should be granted a management role (manager, deputy manager, security officer) should first register a user account in the GOCDB. The user registration procedure is described in the GOCDB user documentation at https://twiki.cern.ch/twiki/bin/view/EMI/Glite-APELInstallation, section 4.1.1  
Line 93: Line 89:
#People who are responsible for operations should be subscribed to following mailing lists (unless differently specified):<br>  
#People who are responsible for operations should be subscribed to following mailing lists (unless differently specified):<br>  
##Operations Centre manager:'''<br>''' noc-managers [at] mailman.egi.eu - registration through https://mailman.egi.eu/mailman/listinfo'''<br>'''  
##Operations Centre manager:'''<br>''' noc-managers [at] mailman.egi.eu - registration through https://mailman.egi.eu/mailman/listinfo'''<br>'''  
##ROD team:'''<br>'''All-operator-on-duty [at] mailman.egi.eu - list which integrate all NGI ROD mailing list. NGI ROD mailing list will be add as a result of ROD certification procedure'''<br>'''
##ROD team:'''<br>'''All-operator-on-duty [at] mailman.egi.eu - list which integrate all Operations Centres ROD mailing list. <br>Mailing list is populated automatically from GOCDB. New Operations Centres should make sure to record accurate information in GOCDB.  
##Regional Dashboard administrator:'''<br>'''ops-portal [at] mailman.egi.eu - registration through https://mailman.egi.eu/mailman/listinfo<br>Additionally Regional Dashboard administrator need to ask dashboard team (cic-information [at] in2p3.fr) to declare him in GOC DB.<br>
##Regional Nagios administrator:'''<br>'''tool-admins [at] mailman.egi.eu - registration through https://mailman.egi.eu/mailman/listinfo'''<br>'''
##Regional Helpdesk administrator:'''<br>'''ggus-if-devs [at] cern.ch - mailing list designed for coordination of changes in the interface<br>To register please send a request (through for example GGUS system) to GGUS support staff.  
##Regional Helpdesk administrator:'''<br>'''ggus-if-devs [at] cern.ch - mailing list designed for coordination of changes in the interface<br>To register please send a request (through for example GGUS system) to GGUS support staff.  
##Security staff:'''<br>'''ngi-security-contacts [at] mailman.egi.eu, NGI security officers subscribe to this mailing list<br>site-security-contacts [at] mailman.egi.eu, Site (only certified site) security officers subscribe to this mailing list<br><br>Both mailing lists are populated automatically from GOCDB. New Operations Centres should make sure to record accurate information in GOCDB.  
##Security staff:'''<br>'''ngi-security-contacts [at] mailman.egi.eu, NGI or OC security officers subscribe to this mailing list<br>site-security-contacts [at] mailman.egi.eu, Site (only certified site) security officers subscribe to this mailing list<br><br>Both mailing lists are populated automatically from GOCDB. New Operations Centres should make sure to record accurate information in GOCDB.  
##(Recommended) Site administrators:'''<br> '''<span class="gI">LCG-ROLLOUT [at] jiscmail.ac.uk</span> - list gathers all site admins and is designed for technical discussions - membership not mandatory but useful'''<br> '''Subscription is possible through https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=LCG-ROLLOUT&amp;A=1'''<br>'''
##(Recommended) Site administrators:'''<br> '''<span class="gI">LCG-ROLLOUT [at] jiscmail.ac.uk</span> - list gathers all site admins and is designed for technical discussions - membership not mandatory but useful'''<br> '''Subscription is possible through https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=LCG-ROLLOUT&amp;A=1'''<br>'''


= Steps  =
= Steps  =


{| width="100%"  
{| width="100%"
|-
|-
| width="70%" style="vertical-align: top;" |  
| width="70%" style="vertical-align: top;" |  
Line 112: Line 106:
<br> Validation steps:  
<br> Validation steps:  


{| class="wikitable"
{| class="wikitable"
|-
|-
! Step  
! Step  
Line 125: Line 119:
| IPC  
| IPC  
| Verification of the validity of the request (were all needed data provided?)  
| Verification of the validity of the request (were all needed data provided?)  
|  
| <br>
| To check that all needed information is provided, so that the process can proceed quickly
| To check that all needed information is provided, so that the process can proceed quickly
|-
|-
Line 132: Line 126:
| IPC  
| IPC  
| Create child tickets in the order given as follows:  
| Create child tickets in the order given as follows:  
|  
| <br>
| <br>
| <br>
|-
|-
Line 146: Line 140:
*Operations Centre security officer:
*Operations Centre security officer:


| To register new&nbsp; NGI in official EGI database
| To register new&nbsp; OC in official EGI database
|-
|-
| <br>  
| <br>  
Line 153: Line 147:
| '''Enter the new Operations Centre in the''' '''Operations dashboard''' (also add ROD mailing list)  
| '''Enter the new Operations Centre in the''' '''Operations dashboard''' (also add ROD mailing list)  
| ROD email list  
| ROD email list  
| To allow new NGI to use operations dashboard which is <span lang="en" id="result_box" class="short_text"><span class="hps">required to perform grid oversight activity</span></span>
| To allow new OC to use operations dashboard which is <span lang="en" id="result_box" class="short_text"><span class="hps">required to perform grid oversight activity</span></span>
|-
|-
| <br>  
| <br>  
Line 162: Line 156:


|  
|  
NGI should fill in <span class="solution"><nowiki>https://wiki.egi.eu/wiki/GGUS:NGI_XXX_FAQ</nowiki>
OC should fill in <span class="solution"><nowiki>https://wiki.egi.eu/wiki/GGUS:NGI_XXX_FAQ</nowiki> or <nowiki>https://wiki.egi.eu/wiki/GGUS:XXX_FAQ</nowiki></span>  
</span>  


Instruction how to create GGUS wiki page [[GGUS:FAQ_Responsible_Units ]]
Instruction how to create GGUS wiki page [[GGUS:FAQ Responsible Units]]  


| To make it possible to create tickets for the new NGI in the official EGI&nbsp;ticketing system
| To make it possible to create tickets for the new OC in the official EGI&nbsp;ticketing system
|-
|-
| <br>  
| <br>  
| 4  
| 4  
| COD
| Operations
| '''Certification of new ROD team'''  
| '''Certification of new ROD team'''  
[[Grid operations oversight/WI01|Procedure_to_handle_new_ROD_certification_GGUS_tickets]]  
[[Grid operations oversight/WI01|Procedure_to_handle_new_ROD_certification_GGUS_tickets]]  
Line 192: Line 185:
https://documents.egi.eu/secure/ShowDocument?docid=463  
https://documents.egi.eu/secure/ShowDocument?docid=463  


|  
| <br>
| <br>
| <br>
|-
|-
Line 219: Line 212:


|  
|  
To make it possible for operations staff from the new NGI to register in the Dteam VO.  
To make it possible for operations staff from the new OC to register in the Dteam VO.  


This VO gather all operations staff and allow them to access the grid.  
This VO gather all operations staff and allow them to access the grid.  
Line 227: Line 220:
| <br>  
| <br>  
| The newly created Operations Centre  
| The newly created Operations Centre  
|  
| '''Final confirmation that the new Operations Centre can start the operations'''  
''(This step only applies to Operations Centres running an Operations Centre Nagios instance)''
 
'''Include the NAGIOS host in the GOCDB '''as a 'ngi.SAM' service.
 
|
| To validate Nagios instance it has to be first &nbsp; registered in GOC&nbsp;DB&nbsp;
|-
| 6
| <br>  
| <br>  
| ''<br>''
| The confirmation means that OC will provide all services required in OLA
The newly created Operations Centre / IPC
 
|
'''Configure the Operations Cente in the Nagios instance'''
 
|
| <br>
|-
|-
| 7<br>
| rowspan="2" | 6
| <br>
| Nagios team
|
''(This step only applies to Operations Centres running an Operations Centre Nagios instance)''
 
'''Include the Operations Centre level Nagios in the central ops-monitor Nagios instance. '''
 
|
| <br>
|-
| 8
| <br>
| The newly created Operations Centre
| '''Final confirmation that the new Operations Centre can start the operations'''
|
| The confirmation means that NGI will provide all services required in OLA
|-
| rowspan="2" | 9
| rowspan="2" | <br>  
| rowspan="2" | <br>  
| GOCDB  
| GOCDB  
|  
|  
*''[If the Operations Centre was part of a ROC]''
*''[If the Operations Centre was part of a OC]''


'''GOCDB transfers related sites''' from the source ROC to the new Operations Centre structure.<br>  
'''GOCDB transfers related sites''' from the source ROC to the new Operations Centre structure.<br>  


| <br>  
| <br>  
| The sites moving across to the new NGI in the ticket indicated in the ticket.
| The sites moving across to the new OC in the ticket indicated in the ticket.
|-
|-
| The newly created Operations Centre  
| The newly created Operations Centre  
|  
|  
*''[If the Operations Centre wasn't part of a ROC]''
*''[If the Operations Centre wasn't part of a OC]''


'''Newly created Operations Centre should insert at least one sites'''  
'''Newly created Operations Centre should insert at least one sites'''  
Line 284: Line 244:
| Nagios system need at least one site to be validated.
| Nagios system need at least one site to be validated.
|-
|-
| 10
| 7
| <br>  
| <br>  
| The newly created Operations Centre  
| The newly created Operations Centre  
|  
|  
''[If the Operations Centre was part of a ROC]'' '''Transfer all open operational tickets to the new Operations Centre in GGUS'''.  
''[If the Operations Centre was part of a OC]'' '''Transfer all open operational tickets to the new Operations Centre in GGUS'''.  


|  
| <br>
| To <span lang="en" class="short_text" id="result_box"><span class="hps">ensure that none of the GGUS tickets were forgotten during the process.</span></span>
| To <span lang="en" class="short_text" id="result_box"><span class="hps">ensure that none of the GGUS tickets were forgotten during the process.</span></span>
|-
|-
| 11
| 8
| <br>  
| <br>  
| IPC
| The newly created Operations Centre  
|
| '''Check that all the sites are visible in ARGO and that alarms show up in Operations portal'''  
''[This step only applies to Operations Centres running an Operations Centre Nagios instance]''
 
'''Validation process of the new Operations Centre Nagios''', as described at :
 
[[PROC05_Validation_of_a_Operations_Centre_Nagios]].
 
|
| <br>
|-
| 12
| <br>
| Nagios team
| '''Validation that sites/Operations Centre shown up correctly in Central DBs'''  
|
| <br>
|-
| 13
| <br>  
| <br>  
| ''<br>''
Nagios team
|
''[If the Operations Centre was part of a ROC]''
''(This step only applies to Operations Centres running an Operations Centre Nagios instance)''
'''Migrating alerts from ROC to Operations Centre Nagios instance. '''
|
| <br>
| <br>
|-
|-
| 14<br>  
| 9<br>  
| OPTIONAL  
| OPTIONAL  
| The newly created Operations Centre  
| The newly created Operations Centre  
| '''All sites should configured GIIS''' according to the instructions at:  
| '''All sites should configured GIIS''' according to the instructions at:  
[[MAN01|MAN1_How_to_publish_Site_Information]]
[[MAN01|MAN1_How_to_publish_Site_Information]]  
 
change the old information from:
 
  GlueSiteOtherInfo: EGEE_ROC=XXX
GlueSiteOtherInfo: GRID=EGEE
 
to:


   GlueSiteOtherInfo: EGI_NGI=&lt;Operations Centre name&gt;
   GlueSiteOtherInfo: EGI_NGI=&lt;Operations Centre name&gt;
Line 346: Line 271:
This step can be performed at any time from this point.  
This step can be performed at any time from this point.  


|  
| <br>
| To confirm that all sites publish proper data in information system about new NGI.
| To confirm that all sites publish proper data in information system about new NGI.
|-
|-
| 15<br>  
| 10<br>  
| <br>  
| <br>  
| The newly created Operations Centre  
| The newly created Operations Centre  
|  
|  
''[If the Operations Centre was part of a ROC]''  
''[If the Operations Centre was part of a OC]''  


'''Inform managers of regional VOs to change the VO scope of their VOs''' from ROC (regional) to the relevant Operations Centre (national). This action require only confirmation from Operations Centre manager that information was passed.  
'''Inform managers of regional VOs to change the VO scope of their VOs''' to the relevant Operations Centre (national). This action require only confirmation from Operations Centre manager that information was passed.  


Information which should be pass to VO managers: "The Vo scope can be changes by creating a ticket to CIC portal SU in GGUS."  
Information which should be pass to VO managers: "The Vo scope can be changes by creating a ticket to Operations portal SU in GGUS."  


|  
| <br>
| To spread among Vo informati
| To spread among Vo information
|-
|-
| 16
| 11
| <br>  
| <br>  
| The newly created Operations Centre  
| The newly created Operations Centre  
| '''NGI_XX_SERVICES service group'''  
| '''NGI_XX_SERVICES service group''' / '''XX_SERVICES service group'''
The newly created Operations Centre should create in GOC DB NGI_XX_SERVICES service group and attached services listed on page: [[NGI_services_in_GOCDB]]
The newly created Operations Centre should create in GOC DB NGI_XX_SERVICES (if national entity) or XX_SERVICES (if international entity) service group and attached services listed on page: [[NGI services in GOCDB]]  


operations @ egi.eu should be informed by mail about newly created service group and asked to updated the wiki page table.
| <br>
|  
| <br>
|-
| 12
| <br>
| The newly created Operations Centre
| '''Create a GGUS ticket to "Operations"''' SU&nbsp;with information about newly created service group and official OC Top-BDII, SAM, Argus instance. Ask to add OC manager to OC managers mailing list.&nbsp;
| <br>
|  
|  
''&nbsp;''
|-
|-
| 17
| 13
| <br>  
| <br>  
| IPC  
| IPC  
| '''Final checks by the IPC. '''  
| '''Final checks by the IPC. '''  
|  
| <br>
|  
|  
''Were all steps taken and finished properly?&nbsp;''  
''Were all steps taken and finished properly?&nbsp;''  


|-
|-
| 18
| 14
| <br>  
| <br>  
| The newly created Operations Centre
| Operations <br>
| '''Final checks '''should be verified and then the information that the Operations Centre is ready is broadcast by Operations Centre officials.  
|  
''(This broadcast should be sent to VO managers and NOC/ROC managers)''  
'''Final checks '''should be verified.
 
The information that the Operations Centre is ready should be sent in Monthly broadcast and announced during OMB&nbsp;by EGI&nbsp;Operations team.  
 
''(This broadcast should be sent to VO managers ('''except Ops and Dteam VO''') and NOC/ROC managers)''  


See the template below for an indication of the message content.  
See the template below for an indication of the message content.  
Line 400: Line 337:
Best regards,
Best regards,
</pre>  
</pre>  
|  
| <br>
| <br>
| <br>
|}
|}
Line 407: Line 344:


{| class="wikitable"
{| class="wikitable"
|-  
|-
! Version  
! Version  
! Authors  
! Authors  
! Date  
! Date  
! Comments
! Comments
|-
| 2.12
| Tiziana Ferrari
| 2020-07-01
| Update of steps involving the council and of terminology related to roles within the Service Delivery Team.
|-
| 2.11
| Alessandro Paolini
| 2016-12-16
| The monitoring is operated centrally, no more need of regional nagios servers. Procedure modified accordingly.
|-
| 2.10
| Alessandro Paolini
| 2016-06-08
| made distinction between NGI (national entity) and Operations Centre (international entity, it could include also more NGIs). To check how to modify the step 14. To  check how to modify the related wiki in step 16 (https://wiki.egi.eu/wiki/NGI_services_in_GOCDB).
|-
| 2.09
| Alessandro Paolini
| 2016-06-07
| "EGI Operations Support" was decommissioned, changed all the references to "Operations"
|-
| 2.08
| Malgorzata Krakowian
| 2014-10-06
| step 17 - providing information to Operations about NGI core services
|-
|-
| 2.07<br>  
| 2.07<br>  
Line 457: Line 419:
| 2010-10-26  
| 2010-10-26  
| Approved by OMB
| Approved by OMB
|-
|
| M. Krakowian
| 19 August 2014
| Change contact group -&gt; Operations support
|}
|}
[[Category:Operations_Procedures]]

Revision as of 08:05, 1 July 2020




Title Operations Centre creation
Document link https://wiki.egi.eu/wiki/PROC02
Last modified 19 August 2014
Policy Group Acronym OMB
Policy Group Name Operations Management Board
Contact Group operations@egi.eu
Document Status Approved
Approved Date 26.10.2010
Procedure Statement The purpose of this document is to clearly describe the actions and the relative steps to be undertaken for integrating an Operations Centre into the EGI operational structure.
Owner Matthew Viljoen


Overview

The purpose of this document is to clearly describe actions and relative steps to be undertaken for integrating an Operations Centre into the EGI operational structure.

Definitions

The Integration Process Coordinator (IPC) is the entity responsible for integrating a new Operations Centre within EGI.

Please refer to the EGI Glossary for the definitions of the terms used in this procedure.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Validation

The first step in validating an Operations Centre is political approval: the Operations Centre must be formed by an official partner of the EGI infrastructure.

After this step, the Operations Centre needs to be technically validated. The Operations team is responsible for performing this validation.

Political Validation

CASE 1. If the organization running an Operations Centre is already represented within the EGI Council, the respective council member notifies the Director (director@egi.eu) and the Service Delivery team of the EGI Foundation (operations@mailman.egi.eu) that the respective Operations Centre is entering its validation cycle. At this point, the technical validation can start.

CASE 2. If the organization running an Operations Centre is not represented within the EGI Council, director@egi.eu has to be contacted to formalize the collaboration (e.g. through an EGI membership or MoU). When the type of collaboration is agreed, CASE 1 applies.

Technical Validation

  1. The EGI Foundation Director informs operations@mailman.egi.eu about the agreed partnership. The IPC is appointed by the Service Delivery Lead. The Operations Centre Operations manager(s) is notified about the IPC contact.
  2. The EGI Service Delivery team opens a GGUS ticket to the IPC to start the validation process.

Start of the integration

  • The integration of a new Operations Centre starts when the Service Delivery Team opens an Operations Centre validation ticket to the IPC (via GGUS).
  • Once the ticket is filed, the IPC can start the validation process. In order to trigger the actions described in this document the IPC creates a set of new child tickets that are assigned to the individual partners that are responsible for the various validation steps. Thereby, the integration process should be as transparent as possible to all parties involved. The required actions are described below.


An example/template for the Operations Centre creation ticket is provided here:

Subject: Creation of <Operations Centre name>


Required information for the creation of the Operations Centre:
Management mailing list : management@xxx.org
Operations Centre Operations manager contact data : Person Surname (email) +phone contact,
Deputy: Person Surname (email) +phone contact,

Operations Centre security officer contact data : Person Surname (email) +phone contact,
Operations Centre security mailing list : abuse@xxx.org

ROD team mailing list : ngi-support@xxx.org
Operations Centre nagios monitoring system details : https://mon-ngi.xxx.org/nagios
Mailing list for GGUS tickets if using GGUS directly : xxxticket@xxx.org

The FAQ document for the Operations Centre provided by the GGUS team as described in [[GGUS:FAQ_Responsible_Units]] 
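
As an illustration only, the ticket template above could be filled in programmatically. The sketch below is a minimal Python example; the names `Contact` and `render_creation_ticket` are hypothetical helpers invented here and are not part of GGUS or any EGI tool. Only the field labels are taken from the template.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Hypothetical holder for a person's contact data (name, e-mail, phone)."""
    name: str
    email: str
    phone: str

    def render(self) -> str:
        # Mirrors the template's "Person Surname (email) +phone contact" layout.
        return f"{self.name} ({self.email}) {self.phone}"


def render_creation_ticket(centre_name, management_list, manager, deputy,
                           security_officer, security_list, rod_list,
                           monitoring_url=None, ggus_list=None):
    """Return (subject, body) following the creation ticket template.

    monitoring_url and ggus_list are optional: their lines are emitted
    only when a value is supplied.
    """
    lines = [
        "Required information for the creation of the Operations Centre:",
        f"Management mailing list : {management_list}",
        f"Operations Centre Operations manager contact data : {manager.render()},",
        f"Deputy: {deputy.render()},",
        "",
        f"Operations Centre security officer contact data : {security_officer.render()},",
        f"Operations Centre security mailing list : {security_list}",
        "",
        f"ROD team mailing list : {rod_list}",
    ]
    if monitoring_url is not None:
        lines.append(
            f"Operations Centre nagios monitoring system details : {monitoring_url}")
    if ggus_list is not None:
        lines.append(
            f"Mailing list for GGUS tickets if using GGUS directly : {ggus_list}")
    return f"Creation of {centre_name}", "\n".join(lines)
```

The resulting subject and body would still be pasted into GGUS by hand; the sketch does not talk to GGUS.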

Pre-requisites

Before opening an Operations Centre creation GGUS ticket, the Operations Centre should:

  1. Make sure your operational organization is able to fulfil the Resource Provider (RP) OLA: https://documents.egi.eu/secure/ShowDocument?docid=463
  2. Decide about the Operations Centre name. In case of national European Operations Centres, the name should start with "NGI_"
  3. Decide whether to use the Operations Centre's own help desk system or use GGUS directly. If the Operations Centre wants to set up its own system, it needs to provide an interface between the local ticketing system and GGUS, and follow the recommendations available at https://ggus.eu/pages/ggus-docs/interfaces/docu_ggus_interfaces.php.
  4. Set the following contact points:
    1. Management mailing list
    2. Operations Centre operations manager contact data
    3. Operations Centre security officer contact data
    4. Operations Centre security mailing list
    5. ROD team mailing list (including people responsible for monitoring and supporting the Operations Centre infrastructure)
    6. Mailing list for GGUS tickets IF GGUS is used directly for the helpdesk system
  5. All certified Operations Centre sites need to be monitored by the ARGO monitoring infrastructure operated centrally by EGI.
  6. Fill in the FAQ document for the Operations Centre. The template is provided by the GGUS team: GGUS:FAQ_Responsible_Units
  7. Staff in the Operations Centre that should be granted a management role (manager, deputy manager, security officer) should first register a user account in the GOCDB. The user registration procedure is described in the GOCDB user documentation at https://twiki.cern.ch/twiki/bin/view/EMI/Glite-APELInstallation, section 4.1.1
  8. Staff in Operations Centre is familiar with Operational Procedures
  9. People who are responsible for operations should be subscribed to following mailing lists (unless differently specified):
    1. Operations Centre manager:
      noc-managers [at] mailman.egi.eu - registration through https://mailman.egi.eu/mailman/listinfo
    2. ROD team:
      All-operator-on-duty [at] mailman.egi.eu - list that aggregates all Operations Centre ROD mailing lists.
      The mailing list is populated automatically from GOCDB. New Operations Centres should make sure to record accurate information in GOCDB.
    3. Regional Helpdesk administrator:
      ggus-if-devs [at] cern.ch - mailing list designed for coordination of changes in the interface
      To register, please send a request (for example through the GGUS system) to the GGUS support staff.
    4. Security staff:
      ngi-security-contacts [at] mailman.egi.eu - NGI or OC security officers subscribe to this mailing list
      site-security-contacts [at] mailman.egi.eu - site security officers (certified sites only) subscribe to this mailing list

      Both mailing lists are populated automatically from GOCDB. New Operations Centres should make sure to record accurate information in GOCDB.
    5. (Recommended) Site administrators:
      LCG-ROLLOUT [at] jiscmail.ac.uk - list gathers all site admins and is designed for technical discussions - membership not mandatory but useful
      Subscription is possible through https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=LCG-ROLLOUT&A=1
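
The name and contact-point prerequisites above can be checked mechanically before the ticket is opened. The following is a minimal sketch in Python, not part of the official procedure: the `CreationRequest` class, the `missing_prerequisites` function and all key names are hypothetical and do not correspond to any GOCDB or GGUS interface.

```python
from dataclasses import dataclass, field

# Contact points listed in prerequisite 4. The key names are illustrative
# assumptions, not identifiers used by GOCDB or GGUS.
REQUIRED_CONTACTS = (
    "management_mailing_list",
    "operations_manager",
    "security_officer",
    "security_mailing_list",
    "rod_mailing_list",
)


@dataclass
class CreationRequest:
    """Hypothetical container for the data gathered before opening the ticket."""
    name: str
    national: bool              # True for a national European Operations Centre
    uses_ggus_directly: bool    # False if a local helpdesk is interfaced to GGUS
    contacts: dict = field(default_factory=dict)


def missing_prerequisites(req: CreationRequest) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if req.national and not req.name.startswith("NGI_"):
        problems.append('national Operations Centre name should start with "NGI_"')
    required = list(REQUIRED_CONTACTS)
    if req.uses_ggus_directly:
        # The GGUS ticket mailing list is needed only when GGUS is the helpdesk.
        required.append("ggus_ticket_mailing_list")
    for key in required:
        if not req.contacts.get(key, "").strip():
            problems.append(f"missing contact point: {key}")
    return problems
```

Such a check covers only the items that can be verified from the submitted data; OLA acceptance, staff training and mailing list subscriptions still need to be confirmed by people.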

Steps

Some steps of the process can be done in parallel as they are independent, so all steps grouped within the same task number can be performed concurrently (several different child tickets will be created in order to speed up the process). The general idea is that these tickets must be closed before being able to move on to the next step.
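
The gating rule described above (child tickets within one step may proceed concurrently, but all of them must be closed before the next step starts) can be sketched as follows. This is an illustrative model only: `ChildTicket`, `Step` and `current_step` are hypothetical names, not GGUS objects, and the sketch does not model steps that are explicitly marked as non-blocking.

```python
from dataclasses import dataclass, field


@dataclass
class ChildTicket:
    """Hypothetical stand-in for a GGUS child ticket."""
    assignee: str          # e.g. "GOCDB", "Operations Portal", "GGUS"
    closed: bool = False


@dataclass
class Step:
    number: int
    children: list = field(default_factory=list)

    def done(self) -> bool:
        # A step is complete only when every child ticket in it is closed.
        # A step without child tickets counts as complete in this sketch.
        return all(ticket.closed for ticket in self.children)


def current_step(steps):
    """Return the lowest-numbered step that still has open child tickets, or None."""
    for step in sorted(steps, key=lambda s: s.number):
        if not step.done():
            return step
    return None
```

For example, with two open child tickets in step 2, `current_step` keeps returning step 2 until both are closed, regardless of the order in which they are closed.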

Figure (Operations Centre creation.jpg): Creation procedure for a new Operations Centre if the Operations Centre wasn't part of a ROC


Validation steps:

Step Substep Action on Action Required data Notes
1
IPC Verification of the validity of the request (were all needed data provided?)
To check that all needed information is provided, so that the process runs quickly
2
IPC Create child tickets in the order given as follows:


1 GOCDB Creation of a new Operations Centre entry in GOCDB (with no site attached).
  • Operations Centre name: <Operations Centre name>
  • Operations Centre management mailing list: foo@bar.org
  • Operations Centre security mailing list:
  • Operations Centre Operations manager:
  • Operations Centre security officer:
To register the new OC in the official EGI database

2 Operations Portal Enter the new Operations Centre in the Operations dashboard (also add ROD mailing list) ROD email list To allow the new OC to use the operations dashboard, which is required to perform grid oversight activity

3 GGUS Create a new support unit in GGUS


OC should fill in https://wiki.egi.eu/wiki/GGUS:NGI_XXX_FAQ or https://wiki.egi.eu/wiki/GGUS:XXX_FAQ

Instructions on how to create the GGUS wiki page: GGUS:FAQ Responsible Units

To make it possible to create tickets for the new OC in the official EGI ticketing system

4 Operations Certification of new ROD team

Procedure_to_handle_new_ROD_certification_GGUS_tickets

Include in the GGUS ticket for :

  • Country
  • Operations Centre acronym
  • ROD email list
To verify if the future ROD team members are properly trained to perform their duties
3
The newly created Operations Centre

Confirmation that the Operations Centre has read, accepts and agrees to fulfil the RP OLA

https://documents.egi.eu/secure/ShowDocument?docid=463



4
Dteam VO manager Create a branch/group in the Dteam VO for the Operations Centre and assign the DNs of the people who will be the dteam VO representative for the Operations Centre, the Operations Centre Group owner and the Operations Centre Group manager.

Responsibilities and terminology

VO representative: A person that can approve or deny dteam VO membership requests. This person is selected by the applicant during the registration phase.

Group owner: A person that can approve or deny Group membership requests. In addition, this person can create subgroups within the Group.

Group manager: A person that can approve or deny Group membership requests.

These responsibilities can be assigned to the same person(s).

How to assign the child ticket: assign the ticket to "VO support" after the selection of "dteam" in the concerned VO field

This step does not block the process and can be done in parallel. It must be finished before closing the parent ticket.

The following data about each responsible person:
  • Name and Surname
  • DN
  • email

To make it possible for operations staff from the new OC to register in the Dteam VO.

This VO gathers all operations staff and allows them to access the grid.

5
The newly created Operations Centre Final confirmation that the new Operations Centre can start the operations
The confirmation means that the OC will provide all services required in the OLA
6
GOCDB
  • [If the Operations Centre was part of an OC]

GOCDB transfers related sites from the source ROC to the new Operations Centre structure.


The sites moving across to the new OC should be indicated in the ticket.
The newly created Operations Centre
  • [If the Operations Centre wasn't part of an OC]

The newly created Operations Centre should insert at least one site.


The Nagios system needs at least one site to be validated.
7
The newly created Operations Centre

[If the Operations Centre was part of an OC] Transfer all open operational tickets to the new Operations Centre in GGUS.


To ensure that none of the GGUS tickets is forgotten during the process.
8
The newly created Operations Centre Check that all the sites are visible in ARGO and that alarms show up in the Operations Portal

9
OPTIONAL The newly created Operations Centre All sites should configure GIIS according to the instructions at:

MAN1_How_to_publish_Site_Information

 GlueSiteOtherInfo: EGI_NGI=<Operations Centre name>
GlueSiteOtherInfo: GRID=EGI

This step can be performed at any time from this point.


To confirm that all sites publish correct data about the new NGI in the information system.
10

The newly created Operations Centre

[If the Operations Centre was part of an OC]

Inform managers of regional VOs to change the VO scope of their VOs to the relevant Operations Centre (national). This action requires only confirmation from the Operations Centre manager that the information was passed on.

Information which should be passed to VO managers: "The VO scope can be changed by creating a ticket to the Operations Portal SU in GGUS."


To spread the information among VOs
11
The newly created Operations Centre NGI_XX_SERVICES service group / XX_SERVICES service group

The newly created Operations Centre should create in GOCDB the NGI_XX_SERVICES (if national entity) or XX_SERVICES (if international entity) service group and attach the services listed on the page: NGI services in GOCDB



12
The newly created Operations Centre Create a GGUS ticket to the "Operations" SU with information about the newly created service group and the official OC Top-BDII, SAM and Argus instances. Ask to add the OC manager to the OC managers mailing list.

 

13
IPC Final checks by the IPC.

Were all steps taken and finished properly? 

14
Operations

Final checks should be verified.

The information that the Operations Centre is ready should be sent in the monthly broadcast and announced during the OMB by the EGI Operations team.

(This broadcast should be sent to VO managers (except Ops and Dteam VO) and NOC/ROC managers)

See the template below for an indication of the message content.

Subject: <Operations Centre name> is operational

Dear All,

We would like to announce that <Operations Centre name> is now fully operational 
and that we have finished its integration procedure. All necessary operational 
teams and tools are established in our Operations Centre and we are ready for production. 
This Operations Centre is visible in all operational tools as <Operations Centre name> 
and is responsible for all <COUNTRY> sites.

Best regards,
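
When the announcement is prepared repeatedly, the two placeholders in the template above can be filled in programmatically. The following is a minimal sketch using Python's standard `string.Template`; the placeholder names `$name` and `$country` and the function `render_announcement` are illustrative choices, not part of the procedure.

```python
from string import Template

# Text follows the announcement template above; only the placeholder
# syntax ($name, $country) is an assumption of this sketch.
ANNOUNCEMENT = Template("""\
Subject: $name is operational

Dear All,

We would like to announce that $name is now fully operational
and that we have finished its integration procedure. All necessary operational
teams and tools are established in our Operations Centre and we are ready for production.
This Operations Centre is visible in all operational tools as $name
and is responsible for all $country sites.

Best regards,
""")


def render_announcement(name: str, country: str) -> str:
    """Fill in the Operations Centre name and country placeholders."""
    # substitute() raises KeyError if a placeholder is left without a value.
    return ANNOUNCEMENT.substitute(name=name, country=country)
```

The signature still has to be added by the sender before the broadcast is submitted.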


Revision history

Version Authors Date Comments
2.12 Tiziana Ferrari 2020-07-01 Update of steps involving the council and of terminology related to roles within the Service Delivery Team.
2.11 Alessandro Paolini 2016-12-16 The monitoring is operated centrally, no more need of regional nagios servers. Procedure modified accordingly.
2.10 Alessandro Paolini 2016-06-08 made distinction between NGI (national entity) and Operations Centre (international entity, it could include also more NGIs). To check how to modify the step 14. To check how to modify the related wiki in step 16 (https://wiki.egi.eu/wiki/NGI_services_in_GOCDB).
2.09 Alessandro Paolini 2016-06-07 "EGI Operations Support" was decommissioned, changed all the references to "Operations"
2.08 Malgorzata Krakowian 2014-10-06 step 17 - providing information to Operations about NGI core services
2.07 Malgorzata Krakowian 2011-11-15 Reordered to have all Nagios related steps as close as possible. Added new column with explanation why the step should be taken.
2.06 Malgorzata Krakowian 2011-11-02 Step for RP OLA acceptance added; New point to prerequisites about RP OLA added
2.05 Malgorzata Krakowian 2011-09-28 Cleaning; GGUS require wiki page not a faq document etc.
2.04 Malgorzata Krakowian 2011-04-01 Step concerning SAMAP tool was removed due to tool decommission
2.03 Gonçalo Borges 2011-03-31 Operations Centre creation process visualization
2.02 T. Ferrari 2011-03-17 Assignment of procedure number, update of title "Operations Centre creation coordination procedure" to "Operations Centre Creation", small editorial improvements
2.02 Małgorzata Krakowian 2011-01-27 Name change from NGI to Operations Centre
2.01 Dimitris Zilaskos 2010-12-06 Clarification concerning dteam VO branch creation step
2.00 Małgorzata Krakowian, M. Radecki 2010-10-26 Approved by OMB
M. Krakowian 19 August 2014 Change contact group -> Operations support