SAM Nagios probes refactoring TF
Revision as of 15:28, 23 July 2014
Mandate
- assess the support status of various Nagios probes available
- recommend removal or replacement of unsupported probes from the SAM Nagios framework
- improve documentation:
- availability of references to the individual Nagios probe/test descriptions in a central place
- update known documentation web pages with proper references (avoid broken links)
- improve developer guides
- require a change to SAM to remove hardcoded documentation URLs in the metric configuration
Reference: OMB, June 26, 2014 - Re-factoring SAM probes
Tasks
Documentation
Generic improvements
- Eliminate Broken Links
- collect all documentation links in a central place
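The two items above could be supported by a small script. The following is only a minimal sketch under stated assumptions: the `DOC_LINKS` registry, its `example.org` URLs, and the function names `http_status` and `find_broken_links` are hypothetical and are not part of SAM or of any existing EGI tool. It keeps the documentation links in one central mapping and reports those that do not answer with a 2xx HTTP status.

```python
import urllib.error
import urllib.request

# Hypothetical central registry: metric name -> documentation URL.
# The URLs are placeholders, not real documentation locations.
DOC_LINKS = {
    "org.bdii.GLUE2-Validate": "https://example.org/docs/org.bdii.GLUE2-Validate",
    "org.gstat.SanityCheck": "https://example.org/docs/org.gstat.SanityCheck",
}


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def find_broken_links(links, fetch_status=http_status):
    """Return {metric: url} for every link without a 2xx response."""
    broken = {}
    for metric, url in sorted(links.items()):
        status = fetch_status(url)
        if status is None or not 200 <= status < 300:
            broken[metric] = url
    return broken
```

Calling `find_broken_links(DOC_LINKS)` performs the real HTTP checks. The `fetch_status` parameter exists so the check can be exercised without network access, for example by passing a dictionary lookup in place of `http_status`.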
Developers Documentation
- (add link)
- collect requirements and suggestions from developers
Probes
- Identified issues:
  - NGI SAM Nagioses have the documentation URL hardcoded in the metric configuration
    - changing the URLs requires a SAM update
    - Solution: plan the change
  - SAM CE Nagios framework is unsupported
    - used for WN* tests
    - Solution: replace them
  - SRM tests (add reference) cause false alarms on new DPM versions (add link to GGUS ticket)
    - Solution: ??
  - org.gstat.SanityCheck is no longer maintained
    - checks a small subset of BDII GLUE 1.2 data
    - (add reference to profiles where it is enabled)
    - Solution: replace with org.bdii.GLUE2-Validate, which performs thorough validation of GLUE 2 data
  - LFC decommissioning, with implications for the tests that depend on it
    - org.sam.WN-Rep* (CREAM-CE)
    - org.sam.WN-Rep* (CREAM-CE)
    - Solutions:
      - remove all LFC-dependent tests, or find a replacement
      - deploy a dedicated LFC and reconfigure all NGI SAM Nagioses
  - SAM requires unsupported software
    - UMD 2 middleware:
      - Solution: migration to UMD-3 planned in September; follow up
    - CentOS/SL5
      - Solution: migration to CentOS/SL6 planned within EGI InSPIRE JRA2 activity
People
- Cristina Aiftimiei
- Emir Imamagic
- Peter Solagna