2016-bidding/monitoring

From EGIWiki
Revision as of 12:52, 13 March 2018 by Kkoum (talk | contribs)






  • Service name: Monitoring (ARGO)

Introduction

Monitoring services archive and provide access to the infrastructure monitoring results of the services. These data are accessible at many levels (Resource Centres, Operations Centres and EGI.eu) and are used for the generation of service level reports, for the central monitoring of EGI.eu operational tools, and for other central monitoring needs. In some cases, infrastructure operations require ad-hoc monitoring to support specific operational activities, for example checking UserDN publishing in accounting records or the software versions of deployed middleware.

Given the critical nature of the activity, the bid must contain an availability and continuity plan for the service.

Technical description

Monitoring (ARGO) is a centralized and modular system supporting EGI/NGI operations. It provides remote monitoring of services, computation of the monitoring data, visualization of the service status, dashboard interfacing, a notification system and generation of availability and reliability reports. The monitoring services ensure the aggregation of all EGI metric results and access to the data at an EGI-wide scope through the central ARGO user interface. These results are exposed through the central ARGO web service and its programmatic interface (XML & JSON supported). On top of that, the ARGO Reporting System generates monthly availability reports about sites and operational tools for use by the service owners. In addition to the central services described above, the activity also provides:

  • Monitoring probes submission engines: a distributed, highly available central installation is required to submit and run the monitoring probes for the availability computation profiles and for the other profiles required by EGI operations. The deployment must be able to support the size of the infrastructure.
  • Development of Nagios probes:
    • Maintenance of existing operations probes
    • Development of new probes as required to support operations activities
  • Requirements gathering


Coordination

The activity will have to coordinate with:

  • EGI Operations, for the support of the operational activities with monitoring data, and for the planning of new releases and updates of the monitoring system
  • The service developers, to support them in the development of probes for their services
  • The other operational tools where interaction is necessary (for example the messaging network and GOCDB)


Operations

  • Daily running of the system
    • Monitoring probes submission engines
    • Availability/Reliability computation engine
    • User interface to browse the data
  • Provisioning of a high availability configuration
    • At least two distributed, redundant instances of the monitoring engines (Nagios boxes) for the monitoring of the services
    • Multiple consumers of monitoring data
  • The monitoring infrastructure must allow new probes to be tested without affecting production monitoring
  • Requirements gathering
  • Documentation
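As an illustration of what the availability/reliability computation engine listed above produces, the following is a minimal sketch in Python. It assumes one common convention, which is not stated on this page: availability is the fraction of known time in which a service was up, and reliability additionally excludes scheduled downtime from the denominator. The class and function names are hypothetical and are not part of ARGO.

```python
from dataclasses import dataclass


@dataclass
class PeriodHours:
    """Hours spent in each status during a reporting period (hypothetical model)."""
    up: float
    down: float
    scheduled_downtime: float
    unknown: float

    @property
    def total(self) -> float:
        return self.up + self.down + self.scheduled_downtime + self.unknown


def availability(p: PeriodHours):
    """Uptime divided by known time (total minus unknown); None if nothing is known."""
    known = p.total - p.unknown
    return p.up / known if known > 0 else None


def reliability(p: PeriodHours):
    """Uptime divided by known time excluding scheduled downtime; None if undefined."""
    denom = p.total - p.unknown - p.scheduled_downtime
    return p.up / denom if denom > 0 else None


if __name__ == "__main__":
    # Made-up figures for a 31-day month (744 hours), for illustration only.
    month = PeriodHours(up=700, down=20, scheduled_downtime=20, unknown=4)
    print(f"availability = {availability(month):.4f}")
    print(f"reliability  = {reliability(month):.4f}")
```

Under this assumed convention, reliability is never lower than availability, because scheduled downtime is removed from the denominator only.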

Software as a service

In the bid, please also provide information about the possibility of offering the service to external consumers as Software as a Service (SaaS). If provisioning the activity as SaaS implies additional effort or other costs, please report these costs separately, not as part of the overall budget of the bid.

Maintenance

Support

Support is provided through the EGI helpdesk for questions about the functionality of the service and the monitoring data gathered.

Support hours: eight hours a day, Monday to Friday, excluding public holidays of the hosting organization.

Service level targets

  • Monitoring probes submission engines must be available at least 99% of the time, measured on a monthly basis
  • User interfaces to browse monitoring results must be available at least 95% of the time, measured on a monthly basis
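To make the targets above concrete, the following Python sketch converts a monthly availability target into a downtime budget in hours and checks a measured downtime against it. It assumes availability is measured over all calendar hours of the month, with no exclusions for scheduled maintenance. The page does not specify this, and the dictionary keys and function names are illustrative only.

```python
import calendar

# Targets from the list above; the key names are illustrative.
TARGETS = {
    "probe_submission_engines": 0.99,
    "user_interfaces": 0.95,
}


def hours_in_month(year: int, month: int) -> int:
    """Number of calendar hours in the given month."""
    return calendar.monthrange(year, month)[1] * 24


def allowed_downtime_hours(target: float, year: int, month: int) -> float:
    """Maximum downtime (in hours) that still meets the monthly target."""
    return (1.0 - target) * hours_in_month(year, month)


def meets_target(downtime_hours: float, target: float, year: int, month: int) -> bool:
    """True if the measured downtime keeps availability at or above the target."""
    total = hours_in_month(year, month)
    return (total - downtime_hours) / total >= target


if __name__ == "__main__":
    for name, target in TARGETS.items():
        budget = allowed_downtime_hours(target, 2018, 3)
        print(f"{name}: up to {budget:.2f} h of downtime in March 2018")
```

For a 31-day month (744 hours), the 99% target leaves a budget of about 7.4 hours of downtime and the 95% target about 37.2 hours.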

Effort

Bids planning an effort of between 24 and 30 person-months per year would allow these services and activities to be addressed appropriately.