Difference between revisions of "EGI-InSPIRE:SA1.7-QR13"

Latest revision as of 19:12, 6 January 2015


1. Task Meetings

{|
! Date (dd/mm/yyyy)
! URL Indico Agenda
! Title
! Outcome
|-
| 06/06/2013
| https://indico.egi.eu/indico/conferenceDisplay.py?confId=1715
| COD phone conf
| https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=1715
|}

2. Main Achievements

Grid Oversight

Followup upgrades of unsupported software

COD is involved in retiring the EMI-1 middleware. The majority of services have already been upgraded. COD is currently overseeing the retirement of the StoRM service (7 instances). The EMI-1 dCache service, which has extended support until 31.08.2013, will be withdrawn in the near future.

ROD performance index

For background information, see SA1.7-QR6, section RP OLA and ROD metrics. Since October 2011 we have been asking, through GGUS, every NGI with more than 10 items in the COD dashboard during one month to explain the reason for this result and how it plans to improve the situation. We are continuing to collect and investigate these metrics, and to correlate them with other metrics to see whether any conclusions can be drawn.

Availability followup

See SA1.7-QR6 for more background information. COD has issued GGUS tickets to sites whose availability has been below 70% for more than three consecutive months and which are therefore eligible for suspension.
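
The eligibility rule above can be sketched as follows. This is only an illustration: the function names, the data layout and the sample site names are hypothetical, and the 70% and three-month figures are taken from the text rather than from any operational tool. "More than three consecutive months" is read literally, and only the streak ending in the most recent month is counted.

```python
from typing import Dict, List

THRESHOLD = 70.0  # availability percentage, taken from the text
MONTHS = 3        # "more than three consecutive months", read literally


def trailing_low_streak(monthly: List[float], threshold: float = THRESHOLD) -> int:
    """Count the most recent consecutive months below the threshold.

    `monthly` is ordered oldest first (an assumption of this sketch).
    """
    streak = 0
    for value in reversed(monthly):
        if value >= threshold:
            break
        streak += 1
    return streak


def eligible_sites(history: Dict[str, List[float]], months: int = MONTHS) -> List[str]:
    """Return the sites whose current low-availability streak exceeds `months`."""
    return sorted(
        site for site, monthly in history.items()
        if trailing_low_streak(monthly) > months
    )


# Hypothetical sample data: monthly availability per site, oldest first.
history = {
    "SITE-A": [65.0, 60.2, 68.9, 55.0],  # four months in a row below 70%
    "SITE-B": [65.0, 72.0, 61.0, 66.0],  # streak broken in the second month
}
print(eligible_sites(history))
```

With the sample data only SITE-A is reported, because SITE-B recovered above the threshold three months ago.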

Unknown Followup

See SA1.7-QR6 for more background information. In Q13 we continued this activity. In addition, we started discussions with the SAM Nagios team about a Nagios probe that will raise alarms on the operations dashboard when the percentage of unknown results is higher than a certain threshold. These discussions are nearly completed.
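
A probe of the kind under discussion could look roughly like the sketch below. It rests on stated assumptions: the threshold value, the function names and the input format are hypothetical and do not describe the actual SAM probe. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention.

```python
from typing import Sequence, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_unknown(statuses: Sequence[str], threshold: float) -> Tuple[int, str]:
    """Raise an alarm when the share of UNKNOWN results exceeds `threshold`.

    `statuses` is a list of result strings such as "OK" or "UNKNOWN";
    `threshold` is a percentage. Both are assumptions of this sketch.
    """
    if not statuses:
        # Without any results the probe itself cannot decide.
        return UNKNOWN, "UNKNOWN - no results to evaluate"
    pct = 100.0 * sum(s == "UNKNOWN" for s in statuses) / len(statuses)
    if pct > threshold:
        return CRITICAL, f"CRITICAL - {pct:.1f}% of results are UNKNOWN"
    return OK, f"OK - {pct:.1f}% of results are UNKNOWN"


# Hypothetical sample: 3 of 10 results are UNKNOWN, threshold set to 20%.
code, message = check_unknown(["OK"] * 7 + ["UNKNOWN"] * 3, threshold=20.0)
print(code, message)
```

A real plugin would end with `sys.exit(code)` so that Nagios picks up the status; the sketch returns the code instead so it can be tested in isolation.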

Followup NGI Core Services availability

We issue GGUS tickets to NGIs that do not meet the 99% availability requirement. This activity started in February 2012; at first we only submitted GGUS tickets informing NGIs of their low top-level BDII availability. The activity continued this quarter.

Nagios Probe working group

The SAM/Nagios probe WG finished dealing with the EMI-1 probe integration issues. A final issue, dropping the org.arc.AUTH and org.arc.SW-VERSION metrics, was raised and agreed with the NGIs. The next step is to select the probes that should be turned into operations probes; this will be done once SAM Update 22 has been deployed by the majority of NGIs.

The WG is currently dealing with the following new requests:

* revising the SE probes (alarms for VOs when available free space is low)
* probe development requests from VOs
* the new MPI probes requesting too many resources

Software support

The activity ran smoothly, following the established procedures. The number of tickets handled in this period is lower than in the preceding quarters (125 vs. 173 and 192), an expected drop during the main vacation period. The ratio of solved tickets is 24%, which remains within the usual range (20-30%).

This is the first reporting period after the end of the EMI and IGE projects. However, thanks to thorough preparation and the establishment of support relationships with the individual product teams, no particular issues were encountered.

In July a proposal for the 2nd level software support EGI core task was submitted to the bid for the post-InSPIRE period. Of the current project partners contributing to this task, CESNET, STFC and JUELICH participate formally in the bid, while INFN and NDGF promise informal best-effort support.

3. Issues and Mitigation

Issue Description Mitigation Description
Grid Oversight: None

4. Plans for the next period

Grid Oversight

Review of certification procedures etc

We are developing a procedure to incorporate test resources into the EGI infrastructure, reviewing the certification procedures and identifying possible changes to the operational tools. The discussion is now finished and we will make further progress on this.

Further plans

We will continue the activities that we are already carrying out. Further, we will proceed with the plan outlined in https://indico.egi.eu/indico/getFile.py/access?contribId=4&resId=0&materialId=slides&confId=1100 and start the plans described in https://documents.egi.eu/public/ShowDocument?docid=1529. More specifically, this quarter we stopped the top-BDII and RPI follow-ups as part of these plans. We will monitor the impact of this.