ARGO
Revision as of 06:39, 13 July 2016
Tool name | ARGO |
Tool Category and description | Service Monitoring for Availability and Reliability |
Tool url | https://argoeu.github.io |
Email | argo-ggus-support@grnet.gr |
GGUS Support unit | ARGO/SAM EGI Support |
GOC DB entry | https://goc.egi.eu/portal/index.php?Page_Type=Site&id=641 |
Requirements tracking - EGI tracker | https://rt.egi.eu/rt/Dashboards/5544/SAM-Requirements |
Issue tracking - Developers tracker | https://github.com/ARGOeu/ARGO/issues |
Release schedule | https://github.com/ARGOeu/ARGO/milestones |
Release notes | TBD |
Roadmap | TBD |
Related OLA | https://documents.egi.eu/public/ShowDocument?docid=2170 |
Test instance url | http://cclavoisier04.in2p3.fr:8080/lavoisier |
Documentation | https://argoeu.github.io/overview/ |
License | Apache 2 |
Provider | GRNET, SRCE, CNRS |
Source code | https://github.com/ARGOeu/ |
Change, Release and Deployment
This section provides the detailed agreement on requirements gathering, release and deployment of the tool, which extends the Instructions for Operations Tools teams.
Documentation
POEM Profiles
Profiles for RC monitoring
- ARGO_MON (https://poem.egi.eu/poem/admin/poem/profile/11/)
  - Tests for monitoring all EGI services.
  - ROC Tests description
- ARGO_MON_CRITICAL (https://poem.egi.eu/poem/admin/poem/profile/13/)
  - The profile used to compute Availability/Reliability of EGI Resource Centres (OPS VO).
  - This profile contains a subset of the ARGO_MON tests.
- ARGO_MON_OPERATORS (https://poem.egi.eu/poem/admin/poem/profile/12/)
  - Subset of ROC tests that are Operations tests: metrics that can generate an alarm on the operations dashboard when they fail.
  - This profile contains a subset of the ARGO_MON tests.
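The subset relationship between these profiles can be illustrated with a minimal Python sketch. Only the profile names come from this page; the metric names are hypothetical placeholders and do not reflect the actual contents of the POEM profiles.

```python
# Hypothetical metric names, for illustration only. The real profile
# contents are defined in POEM and are not reproduced here.
ARGO_MON = frozenset({
    "example.CE-JobSubmit",
    "example.SRM-Put",
    "example.SRM-Get",
    "example.CertValidity",
    "example.Version",
})
ARGO_MON_CRITICAL = frozenset({"example.CE-JobSubmit", "example.SRM-Put"})
ARGO_MON_OPERATORS = frozenset({"example.CE-JobSubmit", "example.CertValidity"})


def is_subprofile(sub: frozenset, parent: frozenset) -> bool:
    """Return True when every metric in `sub` is also in `parent`."""
    return sub <= parent


def metrics_missing_from_parent(sub: frozenset, parent: frozenset) -> list:
    """List metrics that would break the subset relationship."""
    return sorted(sub - parent)


for name, profile in (("ARGO_MON_CRITICAL", ARGO_MON_CRITICAL),
                      ("ARGO_MON_OPERATORS", ARGO_MON_OPERATORS)):
    print(name, "is a subset of ARGO_MON:", is_subprofile(profile, ARGO_MON))
```

A check like this could be run whenever a profile is edited, so that a metric added to a derived profile but missing from ARGO_MON is noticed early.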
Profile for Cloud RC monitoring
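- CLOUD-MON (https://poem.egi.eu/poem/admin/poem/profile/1/)
  - Tests for monitoring EGI FedCloud resources from cloudmon.egi.eu
  - CLOUD_MONITOR Tests description (https://wiki.egi.eu/wiki/Cloud_SAM_tests)
- CLOUD-MON_CRITICAL (https://poem.egi.eu/poem/admin/poem/profile/2/)
  - Tests for calculating A/R of EGI FedCloud resources from cloudmon.egi.eu
  - CLOUD_MONITOR Tests description (https://wiki.egi.eu/wiki/Cloud_SAM_tests)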
Profiles for Operations Tools monitoring
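- OPS_MONITOR (https://poem.egi.eu/poem/admin/poem/profile/4/)
  - Tests for monitoring all EGI.eu Central Operational Tools from opsmon.egi.eu, including NGI SAM
  - OPS_MONITOR Tests description (https://wiki.egi.eu/wiki/OPS-MONITOR_profile_SAM_tests)
- OPS_MONITOR_CRITICAL (https://poem.egi.eu/poem/admin/poem/profile/5/)
  - Subset of OPS_MONITOR tests used for A/R calculation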
Others
- MW_MONITOR (https://midmon.egi.eu/poem/admin/poem/profile/1/) - Tests for monitoring all EGI services for special purposes (MW upgrades) from midmon.egi.eu
  - Deployed: on Central instance (midmon.egi.eu)
  - Tests: 15
  - MW_MONITOR Tests description
- SEC_MONITOR - Security tests for monitoring all EGI services from secmon.egi.eu
  - Deployed: on Central instance (secmon.egi.eu)
  - Tests: 14
  - SEC_MONITOR Tests description
SAM tests
Tests on NGI/ROC SAM instances are the ones that the framework includes in the SAM configuration. In addition, SAM admins can add their own probes to these instances.
The SAM team proposes the addition of new probes:
- The addition of probes is part of a SAM release and therefore part of the staged rollout.
- It was agreed that, prior to a release, the new list of probes will be briefly presented at the OMB meeting.
- Probes that test internal components of SAM are not presented at the OMB.
List of tests:
- List of tests included in the SAM release: NGI profile SAM tests.
- List of MW related tests: MW SAM tests.
- List of operational tools tests: OPS-MONITOR profile SAM tests.
- List of cloud tests: Cloud SAM tests.
Operations tests
Tests on the Operations Portal are the ones used for raising alarms for ROD and Operations teams. The Operations Portal does not execute these tests itself; it receives alarms from NGI/ROC SAM instances. It holds the list of probes used for alarms, and results from all other probes are filtered out.
The procedure for adding a new probe is described in PROC06.
The list of tests can be found in Operations SAM tests.
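The filtering described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Operations Portal implementation: the `MetricResult` type, the metric names, the site names, and the choice of `CRITICAL` as the only alarm-raising status are all hypothetical.

```python
from dataclasses import dataclass
from typing import AbstractSet, Iterable, List


@dataclass(frozen=True)
class MetricResult:
    """One probe result as it might arrive from an NGI/ROC SAM instance (hypothetical shape)."""
    site: str
    metric: str
    status: str  # assumed Nagios-style status: OK, WARNING, CRITICAL, UNKNOWN


# Hypothetical list of probes that are allowed to raise alarms.
OPERATIONS_METRICS = frozenset({"example.CE-JobSubmit", "example.CertValidity"})


def select_alarms(
    results: Iterable[MetricResult],
    operations_metrics: AbstractSet[str],
    alarm_statuses: AbstractSet[str] = frozenset({"CRITICAL"}),
) -> List[MetricResult]:
    """Keep only failing results whose metric is on the operations list."""
    return [
        r for r in results
        if r.metric in operations_metrics and r.status in alarm_statuses
    ]


incoming = [
    MetricResult("SITE-A", "example.CE-JobSubmit", "CRITICAL"),  # kept
    MetricResult("SITE-A", "example.Version", "CRITICAL"),       # not on the list
    MetricResult("SITE-B", "example.CertValidity", "OK"),        # not failing
]
for alarm in select_alarms(incoming, OPERATIONS_METRICS):
    print(alarm.site, alarm.metric, alarm.status)
```

Keeping the probe list as data rather than code mirrors the text above: adding a probe to the alarm list changes what is filtered without changing how results are received.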
Availability tests
The set of tests used for calculating the availability and reliability (A/R) of sites and services. The A/R calculation is related to the OLA. As with the Operations Portal, the availability calculation component receives results from NGI/ROC SAM instances.
TSA1.8 proposes changes to the availability calculation (that is, which probe results count towards it), and the OMB approves them.
The list of tests can be found in Availability SAM tests.
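As a rough illustration of how A/R figures can be derived from monitoring results, the sketch below uses a commonly cited formulation: availability is the uptime divided by the period with known status, and reliability additionally excludes scheduled downtime from the denominator. These formulas and all parameter names are assumptions made for this example; the authoritative definitions are those in the OLA.

```python
from typing import Optional


def availability(up_hours: float, unknown_hours: float,
                 total_hours: float) -> Optional[float]:
    """Fraction of the known period during which the service was up."""
    known = total_hours - unknown_hours
    if known <= 0:
        return None  # nothing is known about the period
    return up_hours / known


def reliability(up_hours: float, unknown_hours: float,
                scheduled_downtime_hours: float,
                total_hours: float) -> Optional[float]:
    """Like availability, but scheduled downtime is not counted against the site."""
    expected = total_hours - unknown_hours - scheduled_downtime_hours
    if expected <= 0:
        return None  # the whole period was unknown or scheduled downtime
    return up_hours / expected


# Example: a 30-day month (720 h), 684 h up, 24 h scheduled downtime,
# 12 h unscheduled downtime, no unknown periods.
a = availability(up_hours=684, unknown_hours=0, total_hours=720)
r = reliability(up_hours=684, unknown_hours=0,
                scheduled_downtime_hours=24, total_hours=720)
print(f"availability={a:.4f} reliability={r:.4f}")
```

In this example reliability is higher than availability because the 24 hours of scheduled downtime are removed from the denominator.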