Difference between revisions of "EGI-InSPIRE:SA1.7-QR8"

Revision as of 16:11, 3 May 2012

1. Task Meetings

Date (dd-mm-yyyy) | Indico agenda URL | Title | Outcome
23-02-2012 | https://www.egi.eu/indico/conferenceDisplay.py?confId=827 | COD | https://www.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=827
22-03-2012 | https://www.egi.eu/indico/conferenceDisplay.py?confId=963 | COD | https://www.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=963
17-04-2012 | https://www.egi.eu/indico/conferenceDisplay.py?confId=1016 | COD | https://www.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=1016

2. Main Achievements

Grid Oversight

ROD teams newsletter

This quarter we published ROD teams newsletters in February and April. The rationale behind the newsletter is described in the SA1.7-QR4 report.

ROD performance index

For background information on this, have a look at SA1.7-QR6, section RP OLA and ROD metrics. Since October we have asked every NGI with more than 10 items in the COD dashboard during one month to explain, through GGUS, the reason for this result and how it plans to improve the situation. We are continuing to collect and investigate these metrics, and to correlate them with other metrics to see whether any conclusions can be drawn.

Non-OK Alarms Followup

For background information on this, have a look at SA1.7-QR6, section Non-OK Alarms Followup. We have continued this activity in Q8.

Availability followup

See SA1.7-QR6 for more background information. There has been a phone conference with JRA1 (https://www.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=716) at which the availability probe was discussed. The planned probe will meet the following specifications:

  • The probe only measures availability
  • The probe computes the availability over the past 30 days
  • The probe returns a WARNING when: 70% <= availability <= 75%
  • The probe returns a CRITICAL when: availability < 70%
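
The thresholds above can be sketched as a small classification function. This is only an illustration, not the JRA1 probe itself: the function name, the Nagios-style numeric status codes, and the OK result for availability above 75% are assumptions that the specification does not state. The WARNING band is read as 70% to 75% inclusive.

```python
from enum import IntEnum


class Status(IntEnum):
    # Assumption: Nagios-style numeric codes; the spec only names WARNING and CRITICAL.
    OK = 0
    WARNING = 1
    CRITICAL = 2


def classify_availability(availability_pct: float) -> Status:
    """Map a 30-day availability percentage to a probe status (hypothetical helper)."""
    if availability_pct < 70.0:
        return Status.CRITICAL
    if availability_pct <= 75.0:
        # 70% <= availability <= 75%
        return Status.WARNING
    # Assumption: anything above 75% is reported as OK.
    return Status.OK
```

For example, `classify_availability(72.5)` yields `Status.WARNING`, and `classify_availability(69.9)` yields `Status.CRITICAL`.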

We are waiting for this probe to be available for testing.

Apart from this, in Q8 we continued the traditional followup by means of GGUS tickets.

Unknown Followup

See SA1.7-QR6 for more background information. In Q8 we have continued this activity.

Followup NGI Core Services availability

We have issued GGUS tickets to NGIs that do not meet the 99% availability requirement. This activity started in February.

TPM

TBD

Network Support

TBD

3. Issues and Mitigation

Issue Description | Mitigation Description
Grid Oversight: NGIs unresponsive to followup tickets on NGI core services | We will discuss with the COO a procedure for dealing with this

4. Plans for the next period

Grid Oversight

The plan for the next period is to proceed with the current activities and to come up with a proposal to include test resources in the infrastructure.

TPM

Network Support