SAM
Revision as of 17:16, 11 February 2011 by Psolagna (talk | contribs) (→Service Availability Monitoring)
Service Availability Monitoring
The Service Availability Monitoring (SAM) system monitors the resources within the production infrastructure. SAM monitoring data is used to calculate the availability and reliability of grid sites. SAM includes the following components:
- probes: a test execution framework (based on the open source monitoring framework Nagios) and the Nagios Configuration Generator (NCG)
- the Aggregated Topology Provider (ATP), the Metrics Description Database (MDDB), and the Metrics Results Database (MRDB)
- the message bus to publish results and a programmatic interface
- the visualization portal (MyEGI).
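As an illustration of how monitoring results could feed an availability and reliability calculation, here is a minimal sketch in Python. It is not SAM's actual implementation. The `StatusHours` structure, the function names, and the sample figures are hypothetical. The formulas are an assumption based on a commonly used convention: time with unknown status is excluded from both figures, and scheduled downtime is additionally excluded from reliability.

```python
from dataclasses import dataclass


@dataclass
class StatusHours:
    """Hours a site spent in each status over a reporting period.

    Hypothetical structure for illustration; not a SAM data model.
    """
    up: float
    down: float                # unscheduled downtime
    scheduled_downtime: float
    unknown: float             # no usable test result


def availability(h: StatusHours) -> float:
    """Assumed definition: uptime divided by all time with a known status."""
    known = h.up + h.down + h.scheduled_downtime
    return h.up / known if known > 0 else 0.0


def reliability(h: StatusHours) -> float:
    """Assumed definition: uptime divided by known time outside scheduled downtime."""
    expected = h.up + h.down
    return h.up / expected if expected > 0 else 0.0


# Made-up example figures for one reporting period.
month = StatusHours(up=600.0, down=100.0, scheduled_downtime=50.0, unknown=20.0)
print(f"availability = {availability(month):.3f}")
print(f"reliability  = {reliability(month):.3f}")
```

Under these assumed definitions, reliability is never lower than availability, because scheduled downtime counts against availability only.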
Main links:
- SAM Instances
- NEW! Grid probes from org.SAM package
- EMI Nagios probes
Documentation
Installation instructions
- Installation instructions (new Confluence page)
- Nagios and NCG YAIM-based installation instructions (old page with YAIM variable definitions)
- SAM/Nagios Reference Card for site managers
- SAM Administrators FAQ
- Setting NAGIOS to Monitor Uncertified Sites
Tests list
Tools information pages:
- MyEGI
- Nagios Configuration Generator (NCG)
- Aggregated Topology Provider (ATP)
- JIRA SAM project tracking system
Procedures
- Validate ROC or NGI Nagios Procedures
- Procedure for adding new probes to SAM release
- Procedure for setting a Nagios test as an Availability test
Resources