Difference between revisions of "EGI-InSPIRE:Ibergrid-QR8"



===2.1. Progress Summary===
# LCMAPS message report sent to the SA1 coordinators and recorded in RT ticket: https://rt.egi.eu/rt/Ticket/Display.html?id=1983
# The Operations/Platform Deployment Survey was distributed among the Ibergrid sites; the results were collected and a summary was published in the EGI wiki at https://wiki.egi.eu/wiki/Operations/Platform_Deployment_Survey.
# Ran an internal survey of the Ibergrid sites to evaluate the quality of the services and tools at the disposal of site administrators. The survey results were analysed and discussed in the weekly Ibergrid Operations meeting.
# Added a third primary DNS server, hosted at the RedIRIS site, to the TopBDII HA mechanism. This server, elsuper.rediris.es, was put into production in March 2012.
# New operations Wiki: http://ibergrid.lip.pt.
# Produced a report on the different TopBDIIs used by the Ibergrid sites in their services.
# Analysis of the 1st EGI review report.
# Closed the GGUS tickets that had been opened to the Ibergrid sites to configure VOMS redundancy for the Ibergrid VOs (GGUS tickets #76214 to #76232).
# Pushed sites to implement VOMS redundancy for the Ibergrid VOs.
# Opened GGUS tickets #81617 to #81625 asking the Ibergrid sites to correctly support the Ibergrid macro VOs and the VOMS redundancy for these VOs.
# Implementation of the TopBDII HA proposal in NGI_IBERGRID. The process is being tracked via GGUS ticket: https://ggus.eu/ws/ticket_info.php?ticket=74883
# Working on the RT-GGUS integration and on the user support shifts, with GGUS, Security, and General Support queues in the Ibergrid RT ticket system. Created two new email addresses: helpdesk@ibergrid.eu, for submitting tickets to the Ibergrid RT ticket system, and ibergrid-support@listas.cesga.es, a mailing list for communication between the user support shift teams.
# Decommissioned the SWE helpdesk; GGUS no longer works with it. Now, when a ticket is opened against a site in the Ibergrid NGI, a notification is sent to the ibergrid-tickets@listas.cesga.es mailing list, and an additional notification is sent to the site's administrators via the site support email declared in GOCDB.


===2.2. Main Achievements===
# Two new sites joined the infrastructure: ARAGRID-CIENCIAS and RC-GISELA-CETA.
# Enforcement of the TopBDII HA mechanism at local sites.
# A dedicated WMS was installed to support the Ibergrid VOs.
# 100% A/R in February, March and April 2012 for the TopBDII service (after the implementation of the TopBDII HA mechanism).
# A new site joined the Ibergrid infrastructure: CETA-GRID.


===2.3. Issues and mitigation===
!scope="col"| Issue Description
!scope="col"| Mitigation Description
|-
|The UOGRID site is still suspended.
|This site entered suspended status after one month of downtime during QR5.
|-  
|-  
|There were several issues related to the Ibergrid Regional Nagios:
* An external issue with the connection between RedIRIS and the Galicia centers affected the Ibergrid Regional Nagios. The incident took place on 08/09/2011 and lasted from 01:00 to 13:00 CEST. GGUS ticket #74146 was opened to request that this period be excluded from the A/R metrics.
* The request on the VOMS OPS server for the digital certificate used to submit the Ibergrid Nagios tests expired on the night of Friday 30 March, and it was restarted on the morning of Saturday 31 March. GGUS ticket #80793 was opened to request that this period be excluded from the A/R metrics, but the re-computation was rejected because the impact was very low (less than 1.5%).
* An external issue with the top-bdii configured in the R-Nagios affected the Ibergrid Regional Nagios. The incident took place on 19/09/2011 and lasted from 22:00 CEST to 12:00 AM CEST (20/09/2011). GGUS ticket #74652 was opened to request that this period be excluded from the A/R metrics.
* An external issue with the connection from the Spanish and Portuguese NRENs to the GÉANT network affected the Ibergrid Regional Nagios. The incident lasted from 04/12/2012 at 17:00 CEST to 04/23/2012 at 14:00 CEST. GGUS ticket #81227 was opened to request that this period be excluded from the A/R metrics.
* The Ibergrid Regional Nagios had to be reinstalled because of an issue with MySQL. GGUS ticket #75701 was opened to request that the affected period be excluded from the A/R metrics.
|  
|  
|}
|}

Revision as of 12:06, 2 May 2012


Quarterly Report Number: QR8
NGI Name: Ibergrid
Partner Name: LIP & CSIC
Author: Álvaro Simón García, Esteban Freire García (CSIC)


1. MEETINGS AND DISSEMINATION

1.1. CONFERENCES/WORKSHOPS ORGANISED

Date Location Title Participants Outcome (Short report & Indico URL)


1.2. OTHER CONFERENCES/WORKSHOPS ATTENDED

Date Location Title Participants Outcome (Short report & Indico URL)
28-Jan to 1-Feb LBL Berkeley (USA) LHCOPN and LHCONE joint meeting 1
  • ifae/pic: Follow-up of the activities of the LHC networking group. The architecture of the new dedicated LHCONE network, extended to Tier2s, is currently being decided.

https://indico.cern.ch/conferenceDisplay.py?confId=160533

13-14 February Brussels Cloud Scape IV 2
15th February EVO meeting EGI Security Threat Risk Assessment meeting 2
23rd March EVO meeting EGI Security Threat Risk Assessment meeting 2
  • RedIRIS: The main aim of this meeting was to finalize the list of threats as far as possible and decide how to proceed.
26-30 March Garching, Munich EGI Community Forum 2012 ~ 19
10-13 April Bern EuroSys 2012 1
17-19 April DESY, Zeuthen (Germany) 6th International dCache Workshop 3
20th April EVO meeting EGI Security Assessment group meeting 2
  • RedIRIS: The main aim of this meeting was to discuss the threats on which opinions about the risk diverged, and the plans for the final report.
23-24 April Bologna EGI-CSIRT Face to Face meeting 1
  • RedIRIS: Review of the actions and task forces in which EGI-CSIRT is involved. In addition, two hands-on sessions were given during the meeting: one on forensic analysis and one on handling incidents using RTIR.

https://www.egi.eu/indico/conferenceTimeTable.py?confId=812

23-27 April Prague HEPIX Spring 2012 Workshop 1
  • ifae/pic: Contribution “PIC Site Report” and attendance at the whole HEPIX workshop.

https://indico.cern.ch/contributionDisplay.py?contribId=16&sessionId=1&confId=160737

25 April CERN Many-core architectures for LHCb 1
  • ifae/pic: Follow-up of the evolution of the LHC experiment software framework to make efficient use of many-core processor architectures.

https://indico.cern.ch/conferenceDisplay.py?confId=184092


1.3. PUBLICATIONS

Publication title; Journal / Proceedings title; Journal references (volume number, issue, pages from - to); Authors (1., 2., 3., et al.?)
DRI: Data and Job Management on the Grid; EGI Community Forum 2012 (26-30 March 2012), poster; http://cf2012.egi.eu/exhibition/posters_and_demos.html; José Miguel Franco Valiente, César Suárez Ortega, Manuel Rubio Del Solar, Jorge Sevilla Cedillo
Web Interface for Generic Grid Jobs; Computing and Informatics; Vol. 31, 2012, No. 1, pp. 173-186; Antònia Tugores, Pere Colet


2. ACTIVITY REPORT

2.1. Progress Summary

  1. The Operations/Platform Deployment Survey was distributed among the Ibergrid sites; the results were collected and a summary was published in the EGI wiki at https://wiki.egi.eu/wiki/Operations/Platform_Deployment_Survey.
  2. Added a third primary DNS server, hosted at the RedIRIS site, to the TopBDII HA mechanism. This server, elsuper.rediris.es, was put into production in March 2012.
  3. Produced a report on the different TopBDIIs used by the Ibergrid sites in their services.
  4. Closed the GGUS tickets that had been opened to the Ibergrid sites to configure VOMS redundancy for the Ibergrid VOs (GGUS tickets #76214 to #76232).
  5. Opened GGUS tickets #81617 to #81625 asking the Ibergrid sites to correctly support the Ibergrid macro VOs and the VOMS redundancy for these VOs.
  6. Working on the RT-GGUS integration and on the user support shifts, with GGUS, Security, and General Support queues in the Ibergrid RT ticket system. Created two new email addresses: helpdesk@ibergrid.eu, for submitting tickets to the Ibergrid RT ticket system, and ibergrid-support@listas.cesga.es, a mailing list for communication between the user support shift teams.
  7. Decommissioned the SWE helpdesk; GGUS no longer works with it. Now, when a ticket is opened against a site in the Ibergrid NGI, a notification is sent to the ibergrid-tickets@listas.cesga.es mailing list, and an additional notification is sent to the site's administrators via the site support email declared in GOCDB.

2.2. Main Achievements

  1. Enforcement of the TopBDII HA mechanism at local sites.
  2. 100% A/R in February, March and April 2012 for the TopBDII service (after the implementation of the TopBDII HA mechanism).
  3. A new site joined the Ibergrid infrastructure: CETA-GRID.

2.3. Issues and mitigation

Issue Description Mitigation Description
There were several issues related to the Ibergrid Regional Nagios:
  • The request on the VOMS OPS server for the digital certificate used to submit the Ibergrid Nagios tests expired on the night of Friday 30 March, and it was restarted on the morning of Saturday 31 March. GGUS ticket #80793 was opened to request that this period be excluded from the A/R metrics, but the re-computation was rejected because the impact was very low (less than 1.5%).
  • An external issue with the connection from the Spanish and Portuguese NRENs to the GÉANT network affected the Ibergrid Regional Nagios. The incident lasted from 04/12/2012 at 17:00 CEST to 04/23/2012 at 14:00 CEST. GGUS ticket #81227 was opened to request that this period be excluded from the A/R metrics.