EGI-InSPIRE:SA1.7-QR10

Revision as of 17:38, 31 October 2012 by Reale (talk | contribs)


1. Task Meetings

Date | Indico agenda URL | Title | Outcome
August 16th 2012 | https://indico.egi.eu/indico/conferenceDisplay.py?confId=1144 | COD meeting | https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=1144
September 5th 2012 | https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=1144 | COD meeting | https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=1147
September 21st 2012 | https://indico.egi.eu/indico/contributionDisplay.py?sessionId=56&contribId=242&confId=1019 | ROD session at EGI TF12 | https://indico.egi.eu/indico/materialDisplay.py?contribId=242&sessionId=56&materialId=slides&confId=1019
October 3rd 2012 | https://indico.egi.eu/indico/conferenceDisplay.py?confId=1171 | COD meeting | https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=1171
October 23rd 2012 | https://indico.egi.eu/indico/conferenceDisplay.py?confId=1200 | COOCOD meeting |

Network Support

EGI TF 2012 F2F meeting 18 September 2012 https://indico.egi.eu/indico/sessionDisplay.py?sessionId=37&tab=contribs&confId=1019

F2F meeting between the HINTS Team and PerfSONAR MDM Team in Erlangen, September 12, 2012


2. Main Achievements

Grid Oversight

Followup upgrades of unsupported software

A large number of sites were still running glite-3.1 and glite-3.2 software, which is no longer supported. This quarter a campaign was started to get these sites to upgrade the services that run this software. COD has issued GGUS tickets to these sites and is following up on them.

ROD teams newsletter

We published a ROD teams newsletter in October. The rationale behind the newsletter is described in the SA1.7-QR4 report.

ROD performance index

For background information, see SA1.7-QR6, section RP OLA and ROD metrics. Since October 2011, every NGI with more than 10 items in the COD dashboard during one month has been asked through GGUS to explain the reason for this result and how it plans to improve the situation. We are continuing to collect and investigate these metrics, and to correlate them with other metrics to see whether any conclusions can be drawn. The number of issues in the COD dashboard appears to be going down.

Availability followup

See SA1.7-QR6 for more background information. A probe measuring the availability and reliability of a site has been supplied to the ops portal developers and is now deployed. The algorithm of this probe has been incorporated into the ops portal, which now generates alarms when a site's availability and reliability fall below 70% and 75% respectively. As a consequence, COD will stop issuing monthly GGUS tickets to these sites as of November 1st 2012.
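The threshold rule described here can be illustrated with a minimal sketch. It is not the probe or the ops portal code: the `SiteMetrics` structure, the `alarms_for` function and the site names are hypothetical, and the sketch assumes that 70% applies to availability, 75% to reliability, and that each metric is checked independently.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the report; the mapping of 70% to availability
# and 75% to reliability is an assumption of this sketch.
AVAILABILITY_THRESHOLD = 70.0
RELIABILITY_THRESHOLD = 75.0


@dataclass
class SiteMetrics:
    """Monthly figures for one site (hypothetical structure, percentages 0-100)."""
    name: str
    availability: float
    reliability: float


def alarms_for(site: SiteMetrics) -> List[str]:
    """Return one alarm message per metric that is strictly below its threshold."""
    alarms = []
    if site.availability < AVAILABILITY_THRESHOLD:
        alarms.append(
            f"{site.name}: availability {site.availability:.1f}% "
            f"is below {AVAILABILITY_THRESHOLD:.0f}%"
        )
    if site.reliability < RELIABILITY_THRESHOLD:
        alarms.append(
            f"{site.name}: reliability {site.reliability:.1f}% "
            f"is below {RELIABILITY_THRESHOLD:.0f}%"
        )
    return alarms


if __name__ == "__main__":
    # Made-up example sites, for illustration only.
    for site in (SiteMetrics("SITE-A", 95.0, 97.0), SiteMetrics("SITE-B", 65.0, 80.0)):
        for alarm in alarms_for(site):
            print(alarm)
```

A site sitting exactly on a threshold raises no alarm in this sketch; whether the deployed algorithm treats the boundary the same way is not stated in the report.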

Unknown Followup

See SA1.7-QR6 for more background information. In Q10 we have continued this activity.

Followup NGI Core Services availability

We have issued GGUS tickets to NGIs that do not meet the 99% availability requirement. This activity started in February 2012. At first, we only submitted GGUS tickets informing NGIs of their low top-level BDII availability.

OMB

We are busy developing a procedure to incorporate test resources into the EGI infrastructure and to identify possible changes to the operational tools.

EGI TF12

We organised a session for ROD teams at EGI TF12 in Prague, which had 26 participants. In addition, COD gave two presentations in the Future of Ops session at EGI TF12.

COD F2F meeting

We have organised a COD face-to-face meeting. Topics for this meeting will be:

* activities for the remainder of EGI InSPIRE
* pilot resource allocation
* how to raise availability and reliability, and whether we can raise them further
* reporting: is the tooling for this sufficient, and what can be improved?

Network Support

Carried out preliminary tests of CREAM CE and DPM over IPv6 in 4 different network configurations. Set up workload component services in the IPv6 testbed. Started structuring a global IPv6 testbed for EGI. Restructured the whole IPv6 wiki and made it more usable.

HINTS was further consolidated. Discussions with the pS-MDM team on a possible integration of probes are still ongoing.

3. Issues and Mitigation

Issue Description | Mitigation Description
Grid Oversight: Unresponsiveness of NGI_ZA during the NGI certification process | We will propose to close the GGUS ticket and roll back all activities that have been carried out so far in this field.
Grid Oversight: Unresponsiveness of some NGIs observed during followup activities | So far NGIs seem to respond well to personal emails. In these emails, NGIs are asked to make a habit of checking GGUS a few times a day.

Network Support

Need to clarify the relationship with the CERN site and further integrate sites into the global testbed. Need to investigate why the LRMS does not work over IPv6.


4. Plans for the next period

Network Support

Further extend the global IPv6 testbed to include new sites and services. Continue reporting outcomes at https://wiki.egi.eu/wiki/IPv6TestReports. Finalize the discussion on HINTS-pS-MDM integration.