EGI-InSPIRE:SA1.7-QR8


1. Task Meetings

Date (dd-mm-yyyy) | Indico agenda URL | Title | Outcome
23-02-2012 | https://www.egi.eu/indico/conferenceDisplay.py?confId=827 | COD | https://www.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=827
22-03-2012 | https://www.egi.eu/indico/conferenceDisplay.py?confId=963 | COD | https://www.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=963
17-04-2012 | https://www.egi.eu/indico/conferenceDisplay.py?confId=1016 | COD | https://www.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=1016

2. Main Achievements

Grid Oversight

ROD teams newsletter

This quarter we published ROD teams newsletters in February and April. The rationale behind the newsletter is described in the SA1.7-QR4 report.

ROD performance index

For background information on this, have a look at SA1.7-QR6, section RP OLA and ROD metrics. Since October we have been asking, through GGUS, every NGI with more than 10 items in the COD dashboard during one month to explain the reason for this result and how it plans to improve the situation. We are continuing to collect and investigate these metrics, and to correlate them with other metrics to see whether any conclusions can be drawn.
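The selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the input mapping and the NGI names are hypothetical, the counts are made up, and "above 10 items" is read as strictly greater than 10.

```python
# Threshold taken from the text: more than 10 items in the COD dashboard
# during one month triggers a GGUS request for explanation.
COD_ITEM_THRESHOLD = 10


def ngis_needing_explanation(monthly_items, threshold=COD_ITEM_THRESHOLD):
    """Return the NGIs whose monthly item count exceeds the threshold.

    monthly_items maps an NGI name to its number of COD dashboard items
    for one month (hypothetical input format). The result is ordered by
    item count, highest first, with ties broken alphabetically.
    """
    flagged = [(ngi, count) for ngi, count in monthly_items.items()
               if count > threshold]
    flagged.sort(key=lambda pair: (-pair[1], pair[0]))
    return [ngi for ngi, _ in flagged]


# Made-up example counts; not data from this report.
example_counts = {"NGI_A": 3, "NGI_B": 14, "NGI_C": 10, "NGI_D": 27}
print(ngis_needing_explanation(example_counts))
```

With this reading, an NGI with exactly 10 items is not contacted.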

Non-OK Alarms Followup

For background information on this, have a look at SA1.7-QR6, section Non-OK Alarms Followup. We have continued this activity in Q8.

Availability followup

See SA1.7-QR6 for more background information. There has been a phone conference with JRA1 (https://www.egi.eu/indico/getFile.py/access?resId=0&materialId=minutes&confId=716) in which the availability probe was discussed. There will be a probe that meets the following specs:

We are waiting for this probe to be available for testing.

Apart from this, in Q8 we continued the availability followup in the traditional way, by means of GGUS tickets.

Unknown Followup

See SA1.7-QR6 for more background information. In Q8 we have continued this activity.

Followup NGI Core Services availability

We have issued GGUS tickets to NGIs that do not meet the 99% availability requirement. This activity started in February. At first the GGUS tickets only informed the NGIs of their low top-level BDII availability. In the last month we have also pointed them to documentation on how to set up a reliable top-level BDII service. We hope this helps to reduce the number of NGIs receiving this kind of ticket.
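As a rough illustration of the check behind these tickets, the sketch below selects the NGIs whose top-level BDII availability falls below the 99% requirement. The function name, the input format and the figures are assumptions made for the example; the report does not describe how the selection is actually performed.

```python
# Requirement taken from the text: 99% availability for NGI core services.
REQUIRED_AVAILABILITY = 99.0  # percent


def ngis_below_requirement(availability, required=REQUIRED_AVAILABILITY):
    """Return, in alphabetical order, the NGIs below the required availability.

    availability maps an NGI name to its monthly top-level BDII
    availability in percent (hypothetical input format).
    """
    return sorted(ngi for ngi, value in availability.items()
                  if value < required)


# Made-up example figures; not data from this report.
example_availability = {"NGI_A": 99.7, "NGI_B": 97.2, "NGI_C": 99.0, "NGI_D": 85.0}
print(ngis_below_requirement(example_availability))
```

An NGI at exactly 99% is treated here as meeting the requirement.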

TPM

Two items of information that should be taken into account in the TPM's daily work:

  1. I would like to inform you that the Turkish NGI has agreed to provide temporary operational support to Azerbaijan for the coming 12 months. This means that basic operational problems and tickets originating from site managers in Azerbaijan have to be addressed by NGI_TR. Most of these tickets in GGUS are submitted by Parvin Aliyeva (the site manager has a CERN e-mail account). The site manager was instructed to contact NGI_TR to arrange the details of the operational support that NGI_TR will provide. For the moment I'm aware of a single site that is being configured.
  2. EGI has asked NGIs to configure their Nagios instances to probe the glexec capabilities of the CEs that accept pilot jobs. One of the steps for the Nagios administrators is to request the "/pilot" role for the ops VO. Over the next couple of weeks or so, if a user asks in a GGUS ticket for the "/pilot" role (a VO role) without specifying a VO, the ticket most likely has to be assigned to "VOsupport, ops".

New support units were added in the recent past:

Network Support 

3. Issues and Mitigation

Issue Description | Mitigation Description
Grid Oversight: NGIs unresponsive to NGI core services followup tickets | We will discuss with the COO a procedure for dealing with this
Grid Oversight: NGI creation procedure getting stuck on NGI unresponsiveness | We will discuss with the COO a procedure for dealing with this
Network Support: UNICORE middleware testing in IPv6 not assigned so far. |

4. Plans for the next period

Grid Oversight

The plan for the next period is to proceed with the current activities and to come up with a proposal to include test resources in the infrastructure.

TPM

Network Support
