EGI-InSPIRE:SA1.7-QR7
1. Task Meetings
2. Main Achievements
Grid Oversight
ROD teams newsletter
This quarter we published a ROD teams newsletter in November, December and January. The rationale behind the newsletter is described in the SA1.7-QR4 report.
ROD performance index
For background information, see SA1.7-QR6, section RP OLA and ROD metrics. Since October we have asked every NGI with more than 10 items in the COD dashboard during one month to explain, through GGUS, the reason for this result and how it plans to improve the situation. The good news is that we have seen a continuous decline in the number of items in the COD dashboard.
Non-OK Alarms Followup
For background information, see SA1.7-QR6, section Non-OK Alarms Followup. We have continued this activity in QR7.
Availability followup
- A Nagios probe is under development that will raise an alarm when a site's availability and/or reliability falls below the 70%/75% threshold. The COD has provided input, which was added to the RT ticket: https://rt.egi.eu/rt/Ticket/Display.html?id=289. We have organised a phone conference on the requirements that this probe should fulfil. We have made a new proposal in this area and hope to reach an agreement with all parties involved so that this issue can make progress.
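
As an illustration of the threshold check described above, here is a minimal sketch in Python. It is not the probe under development: the function name `check_site`, the choice of a CRITICAL state for a breach, and the message format are all assumptions made for this example. Only the 70%/75% thresholds come from this report, and the exit codes follow the standard Nagios plugin convention (0 = OK, 2 = CRITICAL).

```python
# Thresholds taken from the report: availability 70%, reliability 75%.
AVAILABILITY_THRESHOLD = 70.0
RELIABILITY_THRESHOLD = 75.0

# Standard Nagios plugin exit codes.
NAGIOS_OK = 0
NAGIOS_CRITICAL = 2


def check_site(availability, reliability):
    """Return (exit_code, message) for one site's monthly figures.

    Hypothetical helper: the real probe may map a breach to a
    different state or use a different message format.
    """
    problems = []
    if availability < AVAILABILITY_THRESHOLD:
        problems.append(
            "availability %.1f%% < %.0f%%" % (availability, AVAILABILITY_THRESHOLD)
        )
    if reliability < RELIABILITY_THRESHOLD:
        problems.append(
            "reliability %.1f%% < %.0f%%" % (reliability, RELIABILITY_THRESHOLD)
        )
    if problems:
        return NAGIOS_CRITICAL, "CRITICAL - " + "; ".join(problems)
    return NAGIOS_OK, "OK - availability and reliability above thresholds"


if __name__ == "__main__":
    code, message = check_site(availability=65.0, reliability=80.0)
    print(message)
```

A real plugin would end by calling `sys.exit(code)` so that Nagios picks up the state. The sketch returns the code instead, which keeps the check easy to test.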
Unknown Followup
- Recently, we discovered that the availability and reliability metrics contained a substantial number of UNKNOWN test results, for individual sites but also for all sites in an entire NGI. Since UNKNOWN test results are not taken into account in the availability/reliability metrics, this clouds those metrics. This issue is currently under investigation. More information on this topic may be found at: https://wiki.egi.eu/wiki/Grid_operations_oversight/Unknown_issue
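
To show why a large share of UNKNOWN results clouds the metrics, the following sketch uses a simplified model. It assumes availability is the fraction of OK results among the results with a known status. The function names and sample counts are hypothetical, and the actual computation, which also deals with scheduled downtime, may differ.

```python
def availability(ok, critical, unknown):
    """Availability in percent, ignoring UNKNOWN results.

    Simplified assumption: only OK and CRITICAL results count.
    Returns None when no result has a known status.
    """
    known = ok + critical
    if known == 0:
        return None
    return 100.0 * ok / known


def unknown_fraction(ok, critical, unknown):
    """Share of all test results that are UNKNOWN (0.0 to 1.0)."""
    total = ok + critical + unknown
    if total == 0:
        return 0.0
    return unknown / total


if __name__ == "__main__":
    # Hypothetical site: 10 OK, 0 CRITICAL and 90 UNKNOWN results.
    print(availability(10, 0, 90))      # 100.0
    print(unknown_fraction(10, 0, 90))  # 0.9
```

In this model a site whose results are 90% UNKNOWN still reports 100% availability. Reporting the UNKNOWN fraction alongside the metric is one way to make this visible.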
TPM
Network Support
3. Issues and Mitigation
Issue Description | Mitigation Description |
---|---|