NGI International Task Review MS109 Italy

This page contains the assessment of the NGI International Task at year 1 of the EGI-InSPIRE project (one page per NGI). The NGI representatives are required to fill in the tables with the required information. The content will be an integral part of the EGI-InSPIRE milestone MS109 "NGI International Task Review".

User Services

Human Services (Table 1)

Table 1: NGI Assessment: User Services >> Human Services
EGI-InSPIRE EGI_DS Name Assessment Score How to Improve
NA3.3N U-N-3 U-N-13 Requirements Gathering Requirements are collected through appropriate mailing lists and specific questionnaires. Training events for grid users are also a good source for this. 4
NA3.3N U-N-14 U-N-15 Application Database A national instance of the application DB is hosted at INFN-Catania and managed by IGI. This has been discussed during the EGI User Forum in Amsterdam. The DB table contains some extensions with respect to the central EGI one in order to satisfy the Italian user communities' needs. 4 Synchronization between the Italian instance and the central one is needed (and already planned).
NA3.3N U-N-16 U-N-17 Training IGI operations staff have participated in various national training events: as organizers of training events for grid administrators and as collaborators in training events for grid users. 5 No major improvement needed.
NA3.3N U-N-12 U-N-18 U-N-19 Consultancy IGI grid operators provide consultancy through local contacts and training events (see above). 3 The involvement of grid operators in dissemination events could help.

Operations Services

Human Services (Table 2)

Table 2: NGI Assessment: Operations Services >> Human Services
EGI-InSPIRE EGI_DS Name Assessment Score How to Improve
SA1.4N O-N-9 Requirements Gathering Requirements are collected through both technical and managerial mailing lists as well as during the bi-weekly phone conferences and meetings of the Italian production grid. Requirements are then prioritized and sent to EGI-InSPIRE (OMB). 3 Usually very few grid sites or grid operators provide input when requested. This is probably due to a lack of deep knowledge of the grid middleware internals and of the general middleware framework. Specific training about EMI middleware and its evolution would probably help.
SA1.1N O-N-9 Operations Coordination Coordination is carried out through bi-weekly phone conferences and face-to-face meetings when necessary. Various mailing lists have been set up. 4 More than 50 sites are now part of the Italian grid infrastructure; this makes coordination somewhat difficult, especially when urgent problems (e.g. a security update) have to be addressed.
SA1.2N O-N-9 Security IGI CSIRT has been set up and is operational. A Task Force to evaluate software vulnerabilities and countermeasures has been set up in order to speed up and ease the system administrators' work. The national incident response procedures have been improved by defining the roles of all the partners involved, which belong to different institutions and domains. 3 Improvements to the security monitoring system (through a dedicated national monitoring framework and the development of security probes) and an auditing activity (in collaboration with other groups doing similar activities at the national level) would improve overall security.

Infrastructure Services (Table 3)

Table 3: NGI Assessment: Operations Services >> Infrastructure Services
EGI-InSPIRE EGI_DS Name Assessment Score How to Improve
SA1.3N O-N-9 Software Rollout Three sites are participating as Early Adopters for some gLite components (Argus, Nagios, MPI utils, StoRM, WMS, CREAM, LFC MySQL, cluster, LSF utils) considered the most critical and most used in Italy.

At least three different procedures have been followed during this first year. Furthermore, technical issues with the tracking tools have been experienced continuously.

2 Simplify communication channels between early adopters and product teams during debugging (e.g. review the GGUS ticket vs DMSU).

Separate early adopter team steps from staged rollout manager ones in the procedure to clarify what each role is supposed to do during the activity.

SA1.4N O-N-3 Monitoring Since the formal validation of the first regionalized release, the service availability monitoring framework has reached a maturity level that can efficiently satisfy the main NGI operational needs. Moreover, the management of the credentials used to execute checks over the regional infrastructure has been significantly enhanced by early, native support for robot certificates.

Nevertheless, further evolution towards a fully regionalized operations model has to be a focus in the near future.

4 More flexibility is needed in deciding which metrics are critical, both for modifying existing metrics and for introducing new ones.

NGIs could benefit from out-of-the-box high availability solutions for the monitoring engine (Nagios).

SA1.5N O-N-2 Accounting Accounting in Italy relies on DGAS: data are collected and stored in a hierarchical database infrastructure (HLRs) and then published to the CESGA central repository.

IGI sites are regularly checked for correctness of published accounting data and republication is done when necessary. Tests of an ActiveMQ-enabled DGAS prototype are ongoing.

SA1.4N O-N-1 O-N-4 Configuration Repository and Operations Portal Currently we have not deployed regional instances of the operations dashboard, since there is no interface between it and the regional ticketing systems. However, we are participating in the Regionalization Task Force, whose goal is the definition of use cases for the regional services. 2 Fully regionalized tools would be of interest to us.
SA1.6N SA1.7N O-N-6 O-N-7 Helpdesk A regional ticketing system based on XOOPS/XHELP has been operational since the EGEE era. It is interfaced to GGUS via web services and provides helpdesk services for both national and international user communities. 4 Although the system is working properly, its maintenance and evolution are handled by IGI only. The adoption of a more widely used and supported tool should be evaluated in order to improve sustainability.
SA1.8N O-N-5 O-N-8 Core Services IGI manages several geographically distributed instances of gLite core services, used as default by all Italian sites:

2 Nagios instances used for NGI monitoring, 5 instances of TOP-BDII configured in HA, 4 instances of VOMS configured in HA, 2 instances of MyProxy, an LFC server supporting several VOs (of both regional and international scope), 28 WMSes, 19 LBs, and 13 HLR servers for accounting purposes (part of the DGAS architecture). Many services are configured for high availability and load balancing using tools such as an additional local Nagios (for TOP-BDII, LFC, VOMS) and 3 geographically distributed instances of WMSMonitor (for WMS and LB). An upgrade procedure for each service has been put in place in order to minimize the impact on users. The overall availability reached is very satisfactory, and the effort needed to manage all services has diminished thanks to the high availability techniques and the procedures developed.

4 It is very difficult to configure some important services for high availability. The MySQL service is a good example, and we are working on it.

Other (Table 4)

Table 4: NGI Assessment: Other
EGI-InSPIRE EGI_DS Name Assessment Score How to Improve
NA2.3N Policy Development The activity consisted of participation in the SPG meetings and in IGI policy groups related to security and operations. 2 A clearer definition of what has to be implemented would help people to contribute to the policy development work.
NA2.2N E-N-2 Dissemination During the first year of the EGI-InSPIRE project, the Italian National Grid Initiative has reinforced and consolidated the activities already started with the EGEE series of projects.

In particular, more effort has been dedicated to communication strategies aimed at spreading wider and better knowledge of the Italian infrastructure, its usability, its features, and the benefits of using it for work. Different environments and communities have been targeted, such as new SMEs, new research areas, and some sectors of the Public Administration. Specific meetings have been organized in which technical experts gave details on the infrastructure and some case studies were presented. Good feedback was collected.

Major national and international events have been attended. Booths have been set up at computing-related events, where posters were exhibited and demos presented. Brochures and leaflets have been updated, or new ones produced, and widely distributed on all the above-mentioned occasions. A video showing practical applications of the Grid infrastructure in various fields of research has been produced and was shown at the SC10 event.

Closer cooperation with the training team has been attempted. Dissemination material has usually been distributed to course attendees. Students and young researchers seem to be the most interested. Their feedback on the brochure contents is positive, and many of them show deeper interest in the grid, start attending courses and, thereafter, use the grid for their research. Surveys and interviews on the level of knowledge and awareness of the grid have been carried out, and some information on specific needs has been collected. Feedback has been gathered, although it is usually discouraging given the level of awareness among non-users.

4 Critical

Relationships with the press and non-specialized journalists are critical. They still find the Grid a difficult topic to understand and write about. The Grid is still seen by most people as a tool for only a few specific sectors of knowledge and production. Much still needs to be done to raise awareness of the grid's usefulness across many sectors of social and productive life. When contacted, the press responds only if the news is “breaking” (everybody seems to be expecting “Higgs Boson Found”). Releases, improvements, and the development of existing features do not seem to attract the press's attention. Articles that are too general end up sounding banal, while articles that are too specific in technical detail are not easily understood by non-experts. This is a vicious circle that makes it difficult to reach the objective of widening awareness.

More specific courses, training sessions, and tutorials should be held for students, managers, stakeholders, and decision makers in order to explain to a wider audience what the grid is and how it has already changed the way research, business, etc. are done. This requires, though, a deeper knowledge of the needs of other fields. The dissemination team and the training team should interact more closely. This is what we, as the local NGI, are pursuing. Unfortunately, the funding and effort allocated are not always adequate for the amount of work that needs to be done.