1. Task Meetings
|Date (dd/mm/yyyy)||Url Indico Agenda||Title||Outcome|
|31-03-2011||https://www.egi.eu/indico/conferenceDisplay.py?confId=437||COD Dashboard meeting||https://www.egi.eu/indico/conferenceDisplay.py?confId=437|
|weekly||https://www.egi.eu/indico/categoryDisplay.py?categId=27||shopping list meeting||https://www.egi.eu/indico/categoryDisplay.py?categId=27|
2. Main Achievements
ROD teams news letter
The transition from EGEE to EGI InSPIRE brought many changes. For Operations, the EGEE Regional Operations Centres (ROCs) are being dismantled and their responsibilities transferred to the NGIs; some have already completed this process. In the EGI era, ROD teams monitor the quality of sites in their country or region, whereas COD is responsible for global oversight of the whole EGI infrastructure. The aim is to provide a high-quality grid infrastructure to the user communities. These changes have also led us to think about how COD and ROD are going to interact with each other in this new setting. During the Grid Oversight session at the EGI Tech Forum it was made clear to us that people find it cumbersome to travel to regular face-to-face meetings. Nevertheless, we do feel the need to create and maintain a coherent and lively Grid Oversight community, with interaction between ROD and COD that goes beyond the dashboards. For this reason we have created this newsletter. Its purpose is to inform you about recent and upcoming developments related to Grid Oversight and to show you the metrics indicating how well we did over the past month. We have published newsletters since December 2011 and will continue to do so on a monthly basis.
ROD session at EGI UF
At the EGI User Forum in Vilnius we organised a ROD teams session with four presentations. In the first, Marcin Radecki discussed the Grid Oversight work. In the second, Gonçalo Borges from the NGI IBERGRID gave a very nice presentation on the IBERGRID operations and their experiences with the regionalised operational tools. Finally, there was a slot on operational tools with two presentations: Cyril L'Orphelin on the status and roadmap of the operational portal, and Emir Imamagic on the SAM roadmap. The presentations can be downloaded from: https://www.egi.eu/indico/sessionDisplay.py?sessionId=9&confId=207#20110411. We were very pleased that no fewer than 35 people attended this session.
The COD team has started using new technology to pass information to ROD members. You can now learn your duties by watching our video tutorials! The series will contain 6 parts:
1. How to become a ROD member: 7 steps which should be done to become a ROD member.
2. Operations tools: a brief introduction to the operations tools needed by a ROD member to perform their duties.
3. How to handle alarms: instructions on how to manage alarms on the Operations Portal (ticket creation, closing and masking alarms).
4. How to handle tickets: instructions on how to manage tickets on the Operations Portal (ticket creation, updating and closing tickets).
5. Issues escalated to COD: an introduction to the cases which are escalated to COD and how to deal with them.
6. Operations portal: a brief introduction to the Operations Portal tools.
Currently the first two videos are available, and you can find links to them on the ROD wiki page: wiki/Grid_operations_oversight/ROD#Videos_tutorials. All videos will be uploaded to YouTube soon.
TPM activity is carried out by two teams, which are in permanent contact, so no extra meetings are required to organize the daily work. TPM can be considered a very reliable service. A prototype of the Technology Helpdesk (EMI/IGE/SAGA) was presented in Vilnius. It is a separate GGUS instance for dealing with middleware-related tickets. TPM should be able to identify these tickets and assign them to DMSU.
In Q4, IGI GARR staff coordinated the Network Support task for EGI, in line with IGI’s commitments in the EGI-Inspire DoW. In particular, a videoconference among the partners contributing to the Network Support task force was held on April 7, 2011, to review the overall status of the tools being further developed by volunteering NGIs and to improve their usability and reliability. The status of HINTS was reported by FranceGrilles and UREC, the current status of NetJobs was summarized by GARR and UREC, and the status of the PerfSONAR e2e Monitor live CD was summarized by Red-IRIS. All tools appear to be at a satisfactory level of development and deployment already. It was decided to complete the general development of the tools, test them extensively on a distributed set of EGI sites, and be ready to present them at a stage mature enough for deployment at the next EGI User Forum in Lyon, as suggested by the SA1 activity of EGI-Inspire during the face-to-face meeting in Amsterdam on January 24, 2011 (Network Supported F2F OMB).
Also in Q4, IGI GARR staff deployed the NetJobs server in Rome and transferred the DB from the development server based in Paris, at UREC. The front end http://netjobs.dir.garr.it currently connects to a DB located at GARR. In addition, a platform for further developing both the front end and the back-end DB, based on NetBeans and pgAdmin (for PostgreSQL), has been set up to enable further refinement of the tool in the coming months. Currently 8 sites in France and Italy belong to the NetJobs testbed, and the intention is to extend this set in the coming weeks.
An informal discussion session on the further tasks Network Support should be involved in (besides the three tools currently being provided) was held at the EGI User Forum in Vilnius on Monday, April 11, 2011. Few participants attended the meeting (FranceGrilles, Switch/Swing, METALAB/CESNET, GARR, KIT/D-Grid (GGUS)). Support for IPv6 was identified as one of the tasks requiring more effort. Even though requirements for the middleware normally come from the UCB and the user community in general, there was consensus to start a discussion on this item with both the EGI and EMI communities.
In Q4 GARR also attended a PerfSONAR development meeting in Poznan, Poland (February 7-8, 2011) and liaised with the NRENs/GEANT community about the PerfSONAR tools for multi-domain network monitoring and their possible application to EGI. A permanent communication channel with GEANT/DANTE (SA2, NA4) has been established around PerfSONAR and its possible application to EGI. In particular, IGI/GARR staff joined the PerfSONAR User Panel, in consideration of their role within the EGI-Inspire project and EGI.
Finally, in Q4 the FAQ section of the GGUS Network Support Unit was updated to describe the newly established procedures for network support in EGI, as discussed at the end of January in Amsterdam. A first installation of the HINTS server has been performed and is currently reachable at GARR at http://grid-4.dir.garr.it. It will be tested further in the coming months, together with the deployment of probes at various pilot sites.
To summarize the main achievements:
- Defined the long-term strategy for Network Support.
- Installed the HINTS server in France and Italy.
- Written the NetSup GGUS FAQ section for GGUS users.
- Defined the workflow for network support within GGUS.
- Formalised liaison with GN3 and EGI participation in the perfSONAR User Panel.
3. Issues and Mitigation
|Issue Description||Mitigation Description|
|Grid Oversight: None|
|Network Support: None|
4. Plans for the next period
1. Continue ROC transition to NGIs.
2. Continue investigating the impact of new middleware in EGI on the operations support model.
3. Continue investigating how to improve availability and reliability metrics.
4. Evaluation of upcoming new releases of the operational dashboard.
5. Finish the tutorial videos.
Plans will be worked out to further automate the TPMs' work and to improve the monitoring of untouched tickets. A workshop with everyone involved in the TPM task could help with this. In preparation for this workshop, ticket statistics will be collected and analysed.
Fully deploy the operational tools (HINTS, NetJobs, e2eMON Live perfSONAR CD) on a large testbed and test them with a view to production usage by EGI. Organize a questionnaire for the NRENs to clarify NREN-NGI interaction models. Start a discussion around IPv6 in the community.