Difference between revisions of "SAM"
Revision as of 11:58, 24 January 2013
The Service Availability Monitoring (SAM) system monitors the resources within the production infrastructure. SAM monitoring data is used to calculate the availability and reliability of grid sites. SAM includes the following components:
- probes: a test execution framework (based on the open source monitoring framework Nagios) and the Nagios Configuration Generator (NCG)
- the Aggregated Topology Provider (ATP), the Metrics Description Database (MDDB), and the Metrics Results Database (MRDB)
- the message bus to publish results and a programmatic interface
- the visualization portal (MyEGI).
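As an illustration of how availability and reliability figures can be derived from monitoring data, the following is a minimal sketch. It assumes the commonly used definitions in which availability is the fraction of known time a site was up, and reliability additionally excludes scheduled downtime from the denominator. The `StatusHours` type, its field names, and the function names are hypothetical and are not part of SAM.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusHours:
    """Hours a site spent in each state over a reporting period (hypothetical type)."""
    up: float
    down: float
    scheduled_downtime: float
    unknown: float

    @property
    def total(self) -> float:
        return self.up + self.down + self.scheduled_downtime + self.unknown


def availability(h: StatusHours) -> Optional[float]:
    """Assumed definition: up time divided by the time with a known status."""
    known = h.total - h.unknown
    return h.up / known if known > 0 else None


def reliability(h: StatusHours) -> Optional[float]:
    """Assumed definition: as availability, but scheduled downtime is also excluded."""
    denominator = h.total - h.unknown - h.scheduled_downtime
    return h.up / denominator if denominator > 0 else None


if __name__ == "__main__":
    # A 30-day month (720 h) with 72 h of scheduled downtime and 48 h of failures.
    month = StatusHours(up=600.0, down=48.0, scheduled_downtime=72.0, unknown=0.0)
    print(f"availability = {availability(month):.3f}")  # 0.833
    print(f"reliability  = {reliability(month):.3f}")   # 0.926
```

Both functions return `None` when the denominator is zero, so a period with only unknown status is reported as undefined rather than as 0%.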
SAM tool instances
Documentation
Introduction
SAM
- SAM Tests terminology and types
- SAM Project home page and SAM milestones
SAM profiles
RESOURCE CENTRE AVAILABILITY/RELIABILITY COMPUTATION
- ROC_CRITICAL - the profile used for the Availability/Reliability computation of EGI Resource Centres (OPS VO). It has replaced WLCG_CREAM_LCGCE_CRITICAL since 1 January 2012.
FOR GENERATION OF ALARMS IN THE OPERATIONS DASHBOARD IN CASE OF FAILURE
ALL METRICS THAT NCG CAN USE TO CONFIGURE A SAM NGI
- ROC - all the possible metrics that NCG can use to configure NGI Nagios.
NOTE WELL: starting from SAMUpdate-17, removing a metric from the ROC profile immediately removes that metric from all NGI Nagios instances, i.e. the corresponding tests are no longer executed.
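The effect described in the note can be pictured with a small sketch. It assumes, purely for illustration, that each NGI Nagios instance runs the subset of its own selected metrics that is still present in the ROC profile. The function name and the metric names are hypothetical placeholders, and this is not how NCG is actually implemented.

```python
from __future__ import annotations


def metrics_to_execute(roc_profile: set[str], ngi_selection: set[str]) -> set[str]:
    """Return the metrics an NGI instance would still run (assumed model).

    Only metrics that remain in the ROC profile are kept, so removing a
    metric from the profile drops it from every NGI instance at once.
    """
    return ngi_selection & roc_profile


if __name__ == "__main__":
    # Placeholder metric names, not real SAM metrics.
    roc = {"example.metric-A", "example.metric-B", "example.metric-C"}
    ngi = {"example.metric-A", "example.metric-C"}

    print(sorted(metrics_to_execute(roc, ngi)))

    # Removing a metric from the ROC profile removes it for the NGI as well.
    roc.discard("example.metric-C")
    print(sorted(metrics_to_execute(roc, ngi)))
```

The practical consequence is that a metric should be removed from the ROC profile only once no NGI needs the corresponding test any longer.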
MyEGI
NCG
ATP
User guides
Administrator guides
- SAM Release Notes
- SAM (including configuration via YAIM)
- SAM/Nagios Reference Card for site managers
- VO SAM
- Monitoring uncertified sites:
- Setting NAGIOS to Monitor Uncertified Sites
- IMPORTANT. EGI.eu provides catch-all WMS and BDII services for the monitoring of uncertified sites. The service is open for use, and your NGI can easily apply here.
Probes
Developers guides
Support
FAQs and Troubleshooting guides
Check this
EGI.eu central tools and NGI SAM
OTHERS
- GLEXEC - gLExec tests
WLCG
- WLCG_CREAM_CRITICAL
- WLCG_CREAM_LCGCE_CRITICAL profile used for WLCG Availability/Reliability computation
- WLCG_CRITICAL
- WLCG_CRITICAL_TEST
OSG
- Validate ROC or NGI Nagios Procedures: PROC05
- Setting a Nagios test status to OPERATIONS: PROC06
- Adding new probes to SAM: PROC07
- Management of the EGI OPS Availability and Reliability Profile: PROC08
SAM/Nagios Support in GGUS
Resources
- SAM milestones
- EMI Nagios and status (ARC, dCache, gLite, UNICORE)
- Andrade, P.; Babik, M.; Bhatt, K.; et al. Service Availability Monitoring Framework Based On Commodity Software. CHEP12, March 2012 (poster)