Difference between revisions of "EGI-InSPIRE:Italy-QR4"


Revision as of 11:00, 29 April 2011


Quarterly Report Number NGI Name Partner Name Author
QR4 IGI GARR M. Reale
QR4 IGI INFN L. Gaido, P. Veronesi


1. MEETINGS AND DISSEMINATION

1.1. CONFERENCES/WORKSHOPS ORGANISED

Date Location Title Participants Outcome (Short report & Indico URL)

1.2. OTHER CONFERENCES/WORKSHOPS ATTENDED

Date Location Title Participants Outcome (Short report & Indico URL)
February 7-8, 2011 Poznan GN3 SA2 T3 Kick-off Meeting GARR: M. Reale, N. Ciurleo Formalized participation of EGI Net Sup in the PerfSONAR User Panel. Report (protected by GN3 wiki password): [1] URL: [2]
April 7, 2011 VCONF Network Support update GARR: F. Galeazzi, M. Reale Defined long term strategy for Network Support coordination for EGI. Short Report: [3]
April 11-15, 2011 Vilnius EGI UF GARR (F. Galeazzi, A. Pagano, M. Reale), INFN (M. Bencivenni, R. Brunetti, D. Cesini, A. Cristofori, E. Fattibene, L. Gaido) (1 demo, 1 poster, 1 talk, 2 booths);[4]



1.3. PUBLICATIONS

Publication title Journal / Proceedings title Journal references (volume number, issue, pages from - to) Authors (1., 2., 3., et al.?)

2. ACTIVITY REPORT

2.1. Progress Summary

2.1.1. GARR

In Quarter Four, IGI GARR staff coordinated the Network Support task for EGI, in line with IGI's commitments in the EGI-Inspire DoW. In particular, a videoconference among the partners contributing to the Net Sup task force was held on April 7, 2011, to review the overall status of the tools that volunteering NGIs are developing further to improve their usability and reliability. FranceGrilles and UREC reported on the status of HINTS, GARR and UREC summarized the current status of NetJobs, and Red-IRIS summarized the status of the PerfSONAR e2e Monitor live CD. All tools appear to be at a satisfactory level of development and deployment. It was decided to complete the development of the tools, test them extensively on a distributed set of EGI sites, and present them at a stage mature enough for deployment at the next EGI User Forum in Lyon, as suggested by the SA1 activity of EGI-Inspire during the face-to-face meeting in Amsterdam on January 24, 2011 (Network Supported F2F OMB).

In Q4 the IGI GARR staff has been deploying the NetJobs server in Rome and transferring the DB from the development server hosted at UREC in Paris. The front end [5] currently connects to a DB located at GARR. A platform for further development of both the front end and the back-end DB, based on NetBeans and pgAdmin (for PostgreSQL), has also been set up to enable further refinement of the tool in the coming months. Currently 8 sites in France and Italy belong to the NetJobs testbed, and the intention is to extend this set in the coming weeks.

A session (informal discussion) on the further tasks that Network Support should be involved in, besides the three tools currently being provided, was organized at the EGI User Forum in Vilnius on Monday, April 11, 2011. Only a few participants attended the meeting (FranceGrilles, Switch/Swing, METALAB/CESNET, GARR, KIT/D-Grid (GGUS)). Support for IPv6 was identified as one of the tasks requiring more effort. Although requirements for the middleware normally come from the UCB and the user community in general, there was consensus to start a discussion on this item with both the EGI and EMI communities.

In Q4 GARR also attended a PerfSONAR development meeting in Poznan, Poland (February 7-8, 2011) and liaised with the NRENs/GEANT community about the PerfSONAR tools for multi-domain network monitoring and their possible application to EGI. A permanent communication channel with GEANT / DANTE (SA2, NA4) has been established around PerfSONAR and its possible application to EGI. In particular, IGI/GARR staff joined the PerfSONAR User Panel in view of their role within the EGI-Inspire project and EGI.

Finally, in Q4 the FAQ section of the GGUS Network Support Unit was updated to describe the newly established procedures for network support in EGI, as discussed at the end of January in Amsterdam. A first installation of the HINTS server has been completed and is currently reachable at GARR at [6]; it will be tested further in the coming months, together with the deployment of probes at various pilot sites.


2.1.2. INFN

  • The Italian NGI (IGI) has been set up and the transition from the ROC has been completed.
  • An NGI_IT representative is actively involved in the new EGI task force that follows up the regionalization of the operations tools. During the early activities of this task force, NGI_IT is providing use case scenarios and new requirements to enhance the regionalization of the tools under consideration.
  • Availability/Reliability
    • Good overall results in the league table for this quarter: Feb (95%/97%), Mar (96%/97%), Apr (NA); a couple of issues have been identified at site level (see below).
  • Site changes
    • GRISU-COMETA-UNICT-DIIT and GRISU-ENEA-GRID were suspended in April for low availability and a very long downtime period.
    • The new site INFN-BOLOGNA-T3 was certified in April.
  • Core Services
    • The Alice WMS cluster has been decommissioned, and the WMS instances previously used exclusively by the Alice VO are now assigned to the multi-VO pool. This consolidated the pool, which will also be used by superbvo.org, a VO that has recently joined the Grid and started to use the WMS service. High availability and reliability of the WMS clusters are ensured by Nagios checks and by load balancing based on WMSMonitor [7].
    • The Italian LFC service supports several VOs (argo, ams02.cern.ch, babar, bio, cdf, compchem, comput-er.it, superbvo.org, eticsproject.eu, enea, glast.org, gridit, inaf, infngrid, ingv, libi, pacs.infn.it, pamela, planck, theophys, tps.infn.it, virgo, euchina, enmr.eu, euindia, cyclops, compassit). This core service has been migrated to a high-availability, fault-tolerant configuration (two front ends and a dedicated server for the MySQL back end, all checked via Nagios).
  • Early Adopter activity
    • The SL5 version of StoRM was installed in January to test the service, but a problem with the BDII package prevented the storage element from working: the YAIM configuration could not complete. The update released on March 23 fixed this problem and the service is now working fine.
    • As early adopter of the grid Nagios software, two main updates (09 and 10) have been deployed. The first one was tested successfully and the expected new functionalities worked, whereas update 10 showed some limitations in its support for the glexec framework. The main issues were due to the impossibility of running the checks only on CEs that enable glexec with a different, ad hoc VOMS role (namely the pilot role). Update 10 was therefore rejected; while waiting for the upcoming update 10.1, notifications for glexec warnings at the Nagios level have been disabled as a first mitigation.

2.2. Main Achievements

2.2.1. GARR

  • Defined the long term strategy for Network Support.
  • Installed the HINTS server in France and Italy.
  • Wrote the NetSup FAQ section for GGUS users.
  • Defined the workflow for network support within GGUS.
  • Formalised the liaison with GN3 and EGI participation in the perfSONAR User Panel.

2.2.2. INFN

  • Production services (LFC, WMSes) consolidated in order to fulfil VO requirements and to improve their availability.

2.3. Issues and mitigation

Issue Description | Mitigation Description

The main Nagios issues were due to the impossibility of running the checks only on CEs that enable glexec with a different, ad hoc VOMS role (namely the pilot role). | Update 10 was rejected; while waiting for the upcoming update 10.1, notifications for glexec warnings at the Nagios level have been disabled as a first mitigation.