The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other supports; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team at operations@egi.eu.

Instructions for centrally-provided services Service Providers

Revision as of 14:55, 15 December 2014


This page is under construction.



Initial activities


The team behind every newly developed operations tool needs to contact EGI Operations (operations@egi.eu) and provide the following data:

  1. Name of the tool
  2. Support email
  3. Name and contact details of Tool Team leader
  4. Description of the tool - purpose

With the help of the EGI Operations team, the following steps need to be performed:

  1. Create a URL in the egi.eu domain
  2. Register the tool in GOC DB under the EGI.eu NGI
  3. Create a category in the "Requirements" queue in the EGI RT tracker, and an RT dashboard, to receive and handle service requests
  4. Create a GGUS Support Unit to receive and handle incidents (define the level of quality of support - default: Medium)
  5. Negotiate and sign an OLA with EGI.eu
  6. Create a wiki entry for the tool with the relevant information
    • Tool name
    • Short description of the tool
    • URL
    • Contact
    • GGUS Support Unit
    • Link to the related OLA
    • Change management:
      • Link to the EGI RT dashboard for the tool
      • Other: internal bug/task tracking facilities
      • (if one exists) Link to the OTAG team
    • Release and Deployment management:
      • Release schedule
      • Release notes
      • Roadmap
      • URL of test instance
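
The fields of the wiki entry listed above could be captured as a simple record, for instance to check that an entry is complete before it is published. The following is only a minimal sketch in Python: the class names, field names and the `missing_fields` helper are hypothetical and merely mirror the list above; they are not part of any EGI tooling.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import List, Optional


@dataclass
class ChangeManagementInfo:
    rt_dashboard_url: str                       # link to the EGI RT dashboard for the tool
    internal_tracker_url: Optional[str] = None  # other: internal bug/task tracking facilities
    otag_url: Optional[str] = None              # only if an OTAG team exists


@dataclass
class ReleaseInfo:
    release_schedule: str
    release_notes: str
    roadmap: str
    test_instance_url: str


@dataclass
class ToolWikiEntry:
    tool_name: str
    short_description: str
    url: str
    contact: str
    ggus_support_unit: str
    ola_link: str
    change_management: ChangeManagementInfo
    release_and_deployment: ReleaseInfo


def missing_fields(entry) -> List[str]:
    """Return the names of string fields that are empty (hypothetical check).

    Optional fields set to None are treated as "not applicable", not as missing.
    """
    missing = []
    for f in fields(entry):
        value = getattr(entry, f.name)
        if is_dataclass(value):
            # Recurse into nested sections and prefix their field names.
            missing += [f"{f.name}.{name}" for name in missing_fields(value)]
        elif value == "":
            missing.append(f.name)
    return missing
```

Calling `missing_fields(entry)` on a draft entry returns the fields still to be filled in, with nested sections reported as `section.field`.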


Ongoing activities

Change management

Goal: To ensure changes are planned, approved, implemented and reviewed in a controlled manner to avoid adverse impact of changes to services or the customers receiving services.

[Figure: CM.png]

Record

  1. The EGI recording system is EGI RT.
  2. EGI users are instructed to submit change requests in EGI RT.
  3. A team can use an internal tracker, but keeping EGI RT and the internal tracker consistent is the responsibility of the development team.
  4. If the tool has customers other than EGI, the development team is responsible for informing EGI about change requests submitted by those customers.

Classify

  1. All change requests should be classified in a consistent manner. Classification in EGI RT is based on the tool the request relates to.
  2. Tool providers should define a list of standard change requests. A standard change is recurrent and well known, has been proceduralized to follow a pre-defined, relatively risk-free path, and is the accepted response to a specific requirement or set of circumstances, with authority effectively given in advance of implementation. A standard change request does not need approval to be implemented.
  3. An emergency change for operational tools is a highest-priority change related to a security vulnerability. It can be implemented without approval but will be subject to post-review.
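
The classification rules above can be illustrated with a short sketch. It assumes, as a simplification, that any request flagged as a security vulnerability is treated as an emergency change and that standard changes are recognised by matching the request summary against a list defined by the tool provider. All names (`ChangeType`, `ChangeRequest`, `classify`, `needs_prior_approval`) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Set


class ChangeType(Enum):
    STANDARD = "standard"    # pre-approved, no approval needed
    EMERGENCY = "emergency"  # implemented without approval, reviewed afterwards
    NORMAL = "normal"        # must be assessed and approved first


@dataclass
class ChangeRequest:
    tool: str                             # classification in EGI RT is based on the tool
    summary: str
    security_vulnerability: bool = False  # assumption: a flag set by the submitter


def classify(request: ChangeRequest, standard_changes: Set[str]) -> ChangeType:
    """Classify a change request (simplified sketch of the rules above)."""
    if request.security_vulnerability:
        return ChangeType.EMERGENCY
    if request.summary in standard_changes:
        return ChangeType.STANDARD
    return ChangeType.NORMAL


def needs_prior_approval(change_type: ChangeType) -> bool:
    """Only changes that are neither standard nor emergency need approval first."""
    return change_type is ChangeType.NORMAL
```

In practice the list of standard changes would be maintained by each tool provider; the exact matching rule used here is only a placeholder.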

Assess and Approve

  1. Every change that is not a standard change or an emergency change should be assessed (prioritized) and approved both internally, among the PTs, and with the OMB.
  2. Where needed, a dedicated Operations Tool Advisory Group (OTAG) can be set up. The OTAG mandate is to help developers with requirement prioritization and the release process of operational tools. OTAGs provide forums to discuss tool evolution that meets the expressed needs of the EGI community. Each OTAG has representation from all end-user groups, depending on the tool.

Implement, release and deploy

This phase takes place in the Release and Deployment process (see below).

Review

  1. Every release is subject to a post-implementation review. Within one week, customers provide feedback by answering the following questions:
    • Were release notes and documentation sufficient?
    • Is the tool working as expected?
  2. For changes of high impact and high risk, the steps required to reverse an unsuccessful change or remedy any negative effects shall be defined.

Release and deployment management

[Figure: RnDM.png]

Plan Release


Build Release

Test Release


Document

The development team is responsible for creating and maintaining the documentation, instructions and manuals related to the tool, in collaboration with the EGI Operations team.


Inform

Deploy Release

Review Release

  • Each tool will be released as a standalone package.
  • The release schedule and roadmap will be discussed and agreed both internally, among the PTs and with OMB.
  • Testing and documenting the released packages are responsibilities of the PTs.


To ease the exchange of information and to discuss test plans, every PT will apply the following release procedure:


[Figure: Ops tool release process.png]

 

  • (T0) When the development of a new release is completed, a first announcement is sent to all the PTs and to all the actors that should be involved in the release testing (OTAG teams). The announcement has to contain:
    • release notes
    • documentation links
    • detailed test plan
    • indication of the expected release date
    • all the information needed by the conformance criteria set by the SA2 activity for the software providers.

The expected release date and the kind of testing will depend on each specific release and on its importance.

  • If a test fails, a report will be produced and the release sent back to development to restart the cycle. Tests will include a documentation review and, if needed, a documentation update. The test phase can be performed internally to the PT if no other tools or services are affected.
  • (T1) When all the tests have passed and no more development/testing cycles are needed, the testing phase is concluded and a second release announcement is made to the consumers of the new release during an OMB meeting. This second announcement will contain the actual release date (TR), the release notes, the documentation links and a document describing the details of the testing phase. The release can result in an immediate installation on the production instances for the centralized tools, or it will follow a deployment process with a possible initial testing phase on a selected number of production instances according to the SA1 needs (StagedRollout; for further details refer to the project milestone MS402).
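
The release procedure above could be supported by a small check that the first (T0) announcement is complete and by a rule for what follows the test phase. The sketch below is illustrative only: the dictionary keys, the step names and both helper functions are hypothetical and are not taken from any existing EGI tool.

```python
from typing import Dict, List

# Items the first (T0) announcement has to contain, as listed above.
# The key names are hypothetical.
REQUIRED_T0_ITEMS = (
    "release_notes",
    "documentation_links",
    "detailed_test_plan",
    "expected_release_date",
    "conformance_information",
)


def missing_t0_items(announcement: Dict[str, object]) -> List[str]:
    """Return the required items that are absent or empty in the announcement."""
    return [item for item in REQUIRED_T0_ITEMS if not announcement.get(item)]


def next_step(test_results: List[bool]) -> str:
    """Decide what follows the test phase.

    Any failed test sends the release back to development; only when all
    tests pass is the second (T1) announcement made.
    """
    if not test_results:
        raise ValueError("no tests were run")
    if all(test_results):
        return "second_announcement"
    return "back_to_development"
```

A PT could run `missing_t0_items` on a draft announcement before sending it, and use `next_step` to record the outcome of each development/testing cycle.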

Each tool will maintain its own documentation on the web, and the activity management will maintain links to these web pages on the project wiki to provide a single access point to the documentation of all tools. The activity management will also supervise the need for, and the editing of, cross-tool documentation to support the integrated usage of multiple components.




Test phase and monitoring the software quality

Testing of the released software is a responsibility of the product teams. However, the manpower available for each OTPT does not allow for fully independent testing. In almost all PTs the developers also act as testers, although, when possible, the same person does not perform both activities on the same piece of code. To mitigate this situation, it will be important for a PT to discuss the testing plan within the whole JRA1 activity and agree on contributions to testing from other PTs. Moreover, for major or particularly important releases, contributions can also be found outside the activity, mainly inside the operations community.

The quality of the operational tools releases will be constantly monitored through a set of metrics that will include, for example, the number of bugs found in certification and in production, as well as the time needed by each PT to fix critical and non-critical bugs. These metrics are currently under definition.
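
Since the metrics are still under definition, the following is only a sketch of how the two examples mentioned above (bug counts per phase and time to fix) might be computed. The `Bug` record, its fields and the use of a mean in days are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Bug:
    found_in: str                     # assumed values: "certification" or "production"
    critical: bool
    opened: datetime
    fixed: Optional[datetime] = None  # None while the bug is still open


def bug_counts(bugs: List[Bug]) -> Dict[str, int]:
    """Number of bugs found in each phase."""
    return dict(Counter(bug.found_in for bug in bugs))


def mean_days_to_fix(bugs: List[Bug], critical: bool) -> Optional[float]:
    """Mean time to fix, in days, for fixed bugs of the given criticality.

    Returns None when no matching bug has been fixed yet.
    """
    durations = [
        (bug.fixed - bug.opened).total_seconds() / 86400
        for bug in bugs
        if bug.critical == critical and bug.fixed is not None
    ]
    return mean(durations) if durations else None
```

Other aggregations (median, percentiles, per-release breakdowns) may turn out to be more appropriate once the metrics are agreed.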

If the monitoring activity shows that the software quality of some operational tools needs to be improved, this will be reported to the project management and actions will be taken to strengthen the testing phase, i.e. to increase the number of external and independent testers, to be found both internally within the partners developing the tools and externally within the community.

Incident and Problem management

Goal: To restore normal/agreed service operations within the agreed time after the occurrence of an incident, and to investigate the root causes of (recurring) incidents in order to avoid future recurrence by resolving the underlying problem, or to ensure workarounds are available.

Incident and requests management

Support should be provided:

  • via the GGUS portal, which is the single point of contact for infrastructure users to access the EGI Service Desk.
  • (for incidents) according to the declared level of quality of support (default: Medium)
  • Support is provided in English
  • Support is available
    • Monday and Friday
    • 8 h per day
  • All incidents and requests should be assigned to the proper Support Unit and prioritized according to a suitable scheme (see FAQ_GGUS-Ticket-Priority).

Problems management

Planned maintenance windows or interruptions

Planned maintenance should be

  • declared in GOC DB in a timely manner, i.e. 24 hours before
  • typically up to 24 hours in duration; a longer duration needs to be justified
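
The two rules above can be expressed as a simple local check on a planned maintenance window. This sketch does not talk to GOC DB; it assumes that "24 hours before" means at least 24 hours of notice, and the function name, parameters and messages are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List

MIN_NOTICE = timedelta(hours=24)            # declared 24 hours before the start
TYPICAL_MAX_DURATION = timedelta(hours=24)  # longer windows need a justification


def check_planned_maintenance(
    declared_at: datetime,
    start: datetime,
    end: datetime,
    justification: str = "",
) -> List[str]:
    """Return a list of issues with a planned maintenance window (empty if none)."""
    issues = []
    if end <= start:
        issues.append("end must be after start")
    if start - declared_at < MIN_NOTICE:
        issues.append("declared less than 24 hours before start")
    if end - start > TYPICAL_MAX_DURATION and not justification:
        issues.append("duration over 24 hours needs a justification")
    return issues
```

An empty list means the window satisfies both rules; otherwise each string names a rule that is not met.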

Unscheduled interruptions should be managed according to MAN04.

Security incidents

All security incidents should be registered according to EGI_CSIRT:Incident_reporting.

Monitoring

The development team is responsible for the development, maintenance and support of the tool's monitoring probes.

Availability and reliability thresholds should be defined with the EGI Operations team, depending on the criticality of the service.

Information Security management

The following rules for information security and data protection apply:
•    The Provider must define and abide by an information security and data protection policy related to the service being provided.
•    This must meet all requirements of any relevant EGI policies or procedures and also must be compliant with the relevant national legislation.

Customer relationship management

Development teams need to interact with each other within EGI activities and, primarily, with the OMB, the main customer of the operational tools.

Interactions between development teams are ensured through periodic phone conferences and face-to-face meetings. A dedicated mailing list is available as well.

Interactions with customers are ensured through periodic OMB meetings and (where needed) a dedicated Operations Tool Advisory Group (OTAG). The OTAG mandate is to help developers with requirement prioritization and the release process of operational tools. OTAGs provide forums to discuss tool evolution that meets the expressed needs of the EGI community. Each OTAG has representation from all end-user groups, depending on the tool.