
EGI Core activities:2015-bidding Monitoring




  • Service name: Monitoring services


Central systems are needed for accessing and archiving infrastructure monitoring results of the services provided at many levels (Resource Centres, Operations Centres and EGI.EU), for the generation of service level reports, and for the central monitoring of operational tools and other central monitoring needs.

In some cases, infrastructure operations require monitoring activities to be conducted centrally to support the monitoring of specific services and capabilities, such as UserDN publishing in accounting records, GLUE information validation, and the software versions of deployed middleware.

Technical description

Monitoring (SAM) is a distributed system supporting EGI/NGI operations. It provides remote monitoring of services, visualization of service status, dashboard interfacing, a notification system, and the generation of availability and reliability reports. The central monitoring services are needed to aggregate all EGI metric results and to give access to the data at an EGI-wide scope through the central ARGO user interface. These results are exposed through the central ARGO web service and its programmatic interface (XML and JSON are supported). On top of that, the ARGO Reporting System generates monthly availability reports about sites and operational tools for use by the service owners. In addition to the central services described above, the activity also provides:

  • Monitoring of technical services: a centralised SAM installation is currently running in production to monitor the performance of operations tools and user community support tools.
  • A central Nagios service is provided to support specific operations activities such as UserDN publishing in accounting records, GLUE information validation and monitoring of deployed software versions. New monitoring needs will emerge depending on the operations technical activities, and the central Nagios service will be configured to address them. The Nagios infrastructure needs to be scaled accordingly.
  • When the EGI monitoring infrastructure moves to a fully central deployment, the Monitoring service will include a high availability deployment of Nagios services to monitor the entire EGI Federation (more than 5000 services). The deployment must support the size of the infrastructure.
  • Development of Nagios probes:
    • Maintenance of existing operations probes
    • Development of new probes as required to support operations activities
    • Requirements gathering
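
To illustrate what a Nagios probe of the kind mentioned above looks like, here is a minimal sketch in Python. It relies only on the standard Nagios plugin convention (exit codes 0 to 3 and a one-line status message with optional performance data after a "|"). The TCP connection check, the function names and the warning and critical thresholds are assumptions chosen for illustration; they are not taken from any existing EGI probe.

```python
import socket
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(elapsed_s, warn_s=1.0, crit_s=5.0):
    """Map a measured response time onto a Nagios status code.

    The thresholds are illustrative defaults, not EGI policy.
    """
    if elapsed_s >= crit_s:
        return CRITICAL
    if elapsed_s >= warn_s:
        return WARNING
    return OK


def format_output(status, elapsed_s):
    """Build the one-line plugin output: status text, then performance data."""
    return f"{LABELS[status]} - response time {elapsed_s:.3f}s | time={elapsed_s:.3f}s"


def run_probe(host, port, timeout_s=10.0):
    """Time a TCP connection to host:port and return (status, message)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            elapsed = time.monotonic() - start
    except OSError as exc:  # covers refused connections, timeouts, DNS errors
        return CRITICAL, f"CRITICAL - cannot connect to {host}:{port}: {exc}"
    status = classify(elapsed)
    return status, format_output(status, elapsed)
```

A wrapper script would call run_probe with the host and port of the service under test, print the returned message and exit with the returned status code, which is how Nagios reads the result.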


This activity is responsible for the coordination of the system operations and upgrade activities with those partners that are in charge of operating other systems that depend on it.


  • Daily running of the system
  • Provisioning of a high availability configuration
    • A minimum of three Nagios instances for the monitoring of the services. The Nagios instances must not all be deployed at the same site.
    • Multiple consumers of monitoring data
  • A test infrastructure to verify interoperability and the impact of software upgrades on depending systems
  • Deployment in production of the releases of the monitoring system (ARGO) produced in EGI-Engage
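
The high availability requirement above (at least three Nagios instances, not all at the same site) can be checked mechanically against a planned deployment. The following is a minimal sketch; the NagiosInstance type, the check_ha_layout function and the hostnames and site names in the usage note are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NagiosInstance:
    hostname: str  # hypothetical host name of the Nagios box
    site: str      # site hosting the box


def check_ha_layout(instances, min_instances=3):
    """Return a list of violations of the stated deployment constraints.

    An empty list means the layout has at least `min_instances` boxes
    and that they are spread over more than one site.
    """
    problems = []
    if len(instances) < min_instances:
        problems.append(
            f"only {len(instances)} instance(s), at least {min_instances} required"
        )
    sites = {instance.site for instance in instances}
    if instances and len(sites) < 2:
        problems.append("all instances are deployed at the same site")
    return problems
```

For example, check_ha_layout([NagiosInstance("mon1.example.org", "SITE-A"), NagiosInstance("mon2.example.org", "SITE-A"), NagiosInstance("mon3.example.org", "SITE-B")]) returns an empty list, while three instances that all sit at SITE-A yield the same-site violation.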


This activity includes:

  • bug fixing, proactive maintenance, improvement of the system
  • maintenance of probes to test the functionality of the service
  • integration (configuration and packaging) of new probes into SAM
  • coordination of software maintenance activities with other technology providers that provide software for the EGI Core Infrastructure or remote systems deployed by integrated and peer infrastructures that interoperate with the central EGI components of the system
  • production of the monthly reports on the performance of the Resource Centres, NGI central services and EGI central tools
  • requirements gathering
  • documentation
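
The monthly performance reports mentioned above are built on availability and reliability figures. This page does not define how those figures are computed, so the sketch below assumes a commonly used definition: availability is uptime divided by the time for which the status is known, and reliability additionally excludes scheduled downtime from the denominator. The function names and the hour-based inputs are hypothetical.

```python
def availability(up_h, unknown_h, total_h):
    """Fraction of known time the service was up (assumed definition)."""
    known_h = total_h - unknown_h
    if known_h <= 0:
        return None  # nothing is known about the period
    return up_h / known_h


def reliability(up_h, scheduled_down_h, unknown_h, total_h):
    """Like availability, but scheduled downtime is not counted against the service."""
    expected_h = total_h - scheduled_down_h - unknown_h
    if expected_h <= 0:
        return None  # the service was never expected to be up
    return up_h / expected_h
```

As a worked case, take a 720-hour month with 684 hours up, 24 hours of scheduled downtime and 12 hours in unknown state: availability is 684 / 708 and reliability is 684 / 684, so the scheduled downtime lowers availability but not reliability.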


Support through the EGI helpdesk about the functionality of the service and the monitoring data gathered.

Support hours: eight hours a day, Monday to Friday, excluding public holidays of the hosting organization.

Service level targets


Bids planning a total effort between 20 and 24 Person Months/year would allow these services and activities to be addressed appropriately.