The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can get in touch with the EGI SDIS team using operations @ egi.eu.

Difference between revisions of "2019-bidding/monitoring"

{{Template:Op menubar}}{{Core_services_menubar}} {{TOC_right}}
{{Template:Deprecated}}
'''Go back to the [[EGI Core Activities Bidding#PHASE_II_May_2016-December_2017|EGI Core Activities Bidding page]].'''
 
'''To add: requirements on software licenses and FitSM training and certification.'''
 
'''To clarify: the effort.'''
 
= Service name: Monitoring (ARGO) =
 
== Introduction ==
 
Monitoring services archive the infrastructure monitoring results of the services and provide access to them. These data are accessible at many levels (Resource Centres, Operations Centres and EGI.eu), and they are used for the generation of service level reports, for the central monitoring of EGI.eu operational tools, and for other central monitoring needs.
In some cases, infrastructure operations require ad hoc monitoring to support specific operational activities, for example checking UserDN publishing in accounting records and the software versions of deployed middleware.
 
== Technical description ==
Monitoring (ARGO) is a distributed system supporting EGI/NGI operations. It provides remote monitoring of services, visualization of the service status, interfacing with the Operations Portal, and generation of availability and reliability reports. The central monitoring services are needed to aggregate all EGI metric results and to give access to the data at an EGI-wide scope through the central ARGO user interface. These results are exposed through the central ARGO web service and its programmatic interface (JSON supported). On top of that, the ARGO Reporting System generates monthly availability reports about sites and operational tools for use by the service owners. In addition to the central services described above, the activity also provides:
*Monitoring of EGI.eu technical services: a centralised, high-availability installation currently runs in production to monitor the performance of EGI.eu operations tools and user community support tools.
*Maintenance of existing operations probes and deployment of new ones as required to support operations activities, as requested by EGI Operations coordination
*A notification service to inform Service Providers of possible errors or problems.
*Requirements gathering
 
== Coordination ==
 
The activity will have to coordinate with:
* EGI Operations, for the support of operational activities with monitoring data, and for the planning of new releases and updates of the monitoring system
* The service developers, to support them in the development of probes for their services
* The other operational tools where interaction is necessary (for example the messaging network and GOCDB)
 
== Operations ==
*Daily running of the system
**Monitor Services (Sites, NGIs, Service_Groups)
**Availability/Reliability computation engine
**User interface to browse the data
*Provisioning of a high availability configuration
**At least two ARGO Monitoring boxes for the monitoring of the services, deployed in different locations
*The monitoring infrastructure must allow new probes to be tested without affecting the production monitoring
*Creating an Availability and Continuity Plan and implementing countermeasures to mitigate the risks defined in the related risk assessment
*Documentation
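 
The text does not define how the availability/reliability computation engine derives its figures. The following is a minimal sketch, assuming the commonly used definitions in which availability is uptime divided by the known time in the period, and reliability additionally excludes scheduled downtime. The class name, field names and function names are illustrative assumptions, not part of ARGO.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeriodTotals:
    """Hours spent in each state during a reporting period (hypothetical structure)."""
    up_hours: float
    down_hours: float        # unscheduled downtime
    scheduled_hours: float   # scheduled (declared) downtime
    unknown_hours: float     # no monitoring result available


def availability(t: PeriodTotals) -> Optional[float]:
    """Uptime as a fraction of the known time (unknown time is excluded)."""
    known = t.up_hours + t.down_hours + t.scheduled_hours
    return t.up_hours / known if known > 0 else None


def reliability(t: PeriodTotals) -> Optional[float]:
    """Uptime as a fraction of known time, also excluding scheduled downtime."""
    known_unscheduled = t.up_hours + t.down_hours
    return t.up_hours / known_unscheduled if known_unscheduled > 0 else None


# Example: a 30-day month (720 hours) with 10 h unscheduled and 10 h scheduled downtime.
month = PeriodTotals(up_hours=700, down_hours=10, scheduled_hours=10, unknown_hours=0)
print(f"availability = {availability(month):.4f}")
print(f"reliability  = {reliability(month):.4f}")
```

Under these assumed definitions, scheduled downtime lowers availability but not reliability, which is why the two figures are reported separately.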
 
== Software as a service ==
In the bid, please also provide information about the possibility of offering the service to external consumers as Software as a Service (SaaS). If provisioning the activity as SaaS implies additional effort or other costs, please report these costs separately, not as part of the overall budget of the bid.
 
== Maintenance ==
This activity includes:
*bug fixing
*maintenance of probes to test the functionality of the service
*integration (configuration and packaging) of new probes into ARGO
*coordination of software maintenance activities with other technology providers of the operational tools that are part of the EGI Core Infrastructure, or of remote systems deployed by integrated and peer infrastructures that interoperate with the central EGI components of the system (on a best-effort basis for interoperability with peer infrastructure providers)
*producing the monthly reports on the performance of the resource centres, NGI central services and EGI central tools
*requirements gathering
*documentation
 
== Support ==
Support through the EGI helpdesk about the functionality of the service and the monitoring data gathered.
 
'''Support hours''': eight hours a day, Monday to Friday, excluding public holidays of the hosting organization.
 
== Service level targets ==
 
* Monitoring probe submission engines must be available at least 99% of the time, measured on a monthly basis
* User interfaces to browse monitoring results must be available at least 95% of the time, measured on a monthly basis
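 
To make the targets above concrete, they can be translated into a monthly downtime budget. This is a small arithmetic sketch, assuming a 30-day month by default; the function name is illustrative.

```python
def allowed_downtime_hours(target_percent: float, days_in_month: int = 30) -> float:
    """Hours of unavailability permitted in one month at the given availability target."""
    total_hours = days_in_month * 24
    return total_hours * (100.0 - target_percent) / 100.0


# Targets taken from the list above.
probe_engine_budget = allowed_downtime_hours(99.0)
user_interface_budget = allowed_downtime_hours(95.0)

print(f"Probe submission engines: {probe_engine_budget:.1f} h/month")
print(f"User interfaces:          {user_interface_budget:.1f} h/month")
```

For a 30-day month this gives a budget of 7.2 hours for the probe submission engines and 36 hours for the user interfaces.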
 
== Effort ==
Bids planning an effort of (39?) person-months per year would allow these services and activities to be addressed appropriately.

Latest revision as of 16:55, 20 November 2019

This article is deprecated and should no longer be used, but is still available for reference.