EGI Ecosystem V1
Note: Obsoleted by this updated service decomposition.
This page contains the description of the EGI Ecosystem as defined in the EGI-InSPIRE deliverable D2.7. The services are grouped into the following categories: ‘Human Services’ for coordination and community building; ‘Technical Services’ for supporting the collaboration and interaction of the user, operations and technology communities; ‘Infrastructure Services’ for securely accessing resources hosted by different organisations.
The EGI Global tasks are the responsibility of the EGI.eu organisation and are undertaken by EGI.eu staff in Amsterdam or by staff based at participant and associated participant institutions, funded through EGI.eu, the NGIs and the EC through the EGI-InSPIRE project for the benefit and use of all.
The strategic direction of the EGI ecosystem and the collaboration between the individual activities is undertaken by the EGI Council. It also acts as the senior decision-making and supervisory authority of EGI.eu and as the organisation representing the EGI collaboration.
User Community Board (UCB)
A forum whereby representatives from self-organised virtual research communities (VRCs) meet to review and agree on the prioritisation of the emerging requirements for their use of EGI resources on a regular basis. The VRC model encourages researchers to identify and communicate with others in their field in order to capture the needs particular to their field of expertise and articulate them to EGI.
Operations Management Board (OMB)
It drives future developments in the operations area by making sure that the infrastructure operations evolve to support the integration of new resources such as desktop grids, cloud computing and virtualisation, and high performance computing resources. It does this by providing management and developing policies and procedures for the operational services that are integrated into the production infrastructure through a set of distributed management and product teams.
Technical Coordination Board (TCB)
It coordinates the interactions that EGI has with its technology providers. This involves combining the prioritised requirements from the operations and end-user communities into a technology roadmap. Elements from this roadmap are sourced from technology providers within the EGI community into the Unified Middleware Distribution (UMD). Before their inclusion into UMD these components are verified against the original requirements to ensure that these have been met.
An organisation needs a secretariat to support its governance functions, but also to support the community and the staff it employs. Within EGI.eu, support is provided during Council and Executive Board meetings, and community support is provided through a range of IT services to local staff and to the collaboration (e.g. website, wiki, meeting planner, mailing lists, document server, timesheet tool). In addition, the community organises two large meetings a year (the User and Technical Forums) to continue building collaborations within EGI, and a number of additional workshops as required to support the community’s activities.
A concentrated management effort is essential to guarantee a harmonious and coordinated implementation of the strategic policies approved by the governance bodies. Operations, Technology and User Community Managers have both a reactive role (dealing with the daily technical decisions needed to run a complex organisation) and a proactive managerial role (identifying issues that need to be brought to the relevant management bodies). They provide technical direction and leadership to the staff within EGI.eu and those in the community engaged in the activity to ensure the proper definition and implementation of a professionally managed infrastructure.
This activity is led by the EGI.eu Policy Development Team (PDT) and encompasses a number of important tasks. These include supporting the boards and committees within EGI that draft policies and define procedures for evolving the technical infrastructure, for its operation and for access by the various VRCs. Policy development includes the definition and implementation of the approval process of policies and procedures within EGI. It also includes the formulation and development of position papers, by gathering and elaborating material, to inform the EGI management bodies and the EGI community about the opportunities for aligning with strategic-level policies or for supporting a decision-making process. The team also supports the negotiation and monitoring of agreements via Memoranda of Understanding with external partners (specifically, technology providers, resource infrastructure providers and virtual research communities). The policy development team establishes and maintains communication channels with policy makers from EGI.eu participants so that policy-oriented information propagates smoothly within the consortium.
This activity is coordinated by EGI.eu on behalf of the European NGIs and projects, and other international partners. The aim is to communicate the work of EGI and its user communities. Target audiences for the dissemination outputs include new and existing user communities, journalists, the general public, grid research and standards communities, resource providers, collaborating projects, decision makers and governmental representatives. Means for dissemination include the project website, wiki site, materials and publications, media and public relations, social media channels and attendance at events in order to market EGI to new users.
Maintaining the technology roadmap for EGI requires the collection, prioritisation and analysis of requirements from the user and operations communities. From these requirements, new features are sourced from technology providers currently known to EGI, or from open-source or commercial technology providers. Components coming from within the EGI community, in order to provide bespoke functionality needed within the production infrastructure that cannot be sourced elsewhere, are captured within the UMD Roadmap. This evolving document translates user requirements and technology evolution into a roadmap describing the functional aspects, release dates, maintenance support, acceptance criteria and dependencies for software components that are offered to the Resource Infrastructure Providers for installation.
User & Community Support
The EGI.eu User Community Support Team (UCST) coordinates the work of the NGI User Support Teams around Europe. Much of the work focuses on an efficient information flow between the user communities and the NGIs and other EGI partners that provide the sites and resources that comprise EGI. The team drives coordination of the user community activities, the requirements collection and analysis, and the management of the user community technical services.
EGI.eu coordinates and supervises operations and network support activities provided by the individual NGIs to ensure that operational issues are properly handled at both Resource Centre and NGI level. It is also responsible for handling Resource Centre suspension in case of operational issues.
Ticket Process Management
Through the EGI helpdesk, support issues are routed through to NGI support teams. Some of these requests may be related to specific support units, but other issues relating to users’ use of the e-infrastructure will require human intervention from either an operational or a user support perspective.
A transparent requirements processing system is needed so that the user or operations community can submit requirements, or share them within the whole EGI community. All of these requirements are investigated, analysed and prioritised within a transparent and structured process. The prioritised requirements can then be acted upon by other parties as appropriate. Depending on the domain and potential impact, identified needs might be met by the User Support Teams or Operations within EGI or by technology providers external to EGI, be they community-based, project-based or commercial. The progress and outcomes of whichever solutions are adopted will be fed back to the requesting community on a regular basis.
Security vulnerabilities and risks presented by e-Infrastructures provide a rationale for coordination amongst the EGI participants at various levels. Central coordination groups ensure policies, operational security, and maintenance to guarantee secure access to users. In addition, security and incident response is provided through the EGI Computer Security and Incident Response Team by coordinating activity at the sites across the infrastructure. This coordination ensures that common policies are followed by providing services such as security monitoring, training and dissemination with the goal of improving the response to incidents (e.g. security drills).
Updates of deployed software need to be gradually adopted in production after internal verification. This process is implemented in EGI through staged rollout, i.e. through the early deployment of a new component by a selected list of candidate Resource Centres. The successful verification of a new component is a precondition for declaring the software ready for deployment. Given the scale of the EGI infrastructure, this process requires careful coordination to ensure that every new capability is verified by a representative pool of candidate sites, to supervise the responsiveness of the candidate sites and ensure that the staged rollout progresses well without introducing unnecessary delays, and to review the reports produced. It also ensures the planning of resources according to the foreseen release schedules from the Technology Providers. EGI.eu coordination is necessary to ensure a successful interoperation of the various stakeholders: Resource Centres, Technology Providers, the EGI.eu Technical Manager and the EGI repository managers.
A distributed monitoring framework is necessary to continuously test the level of functionality delivered by each service node instance in the production Resource Centres, to generate alarms and tickets in case of critical failures, to compute monthly availability and reliability statistics, and to monitor and troubleshoot network problems. The Monitoring Infrastructure is a distributed service based on Nagios and messaging. The central services – operated by EGI.eu – include systems such as the MyEGI portal for the visualisation of information, and a set of databases for the persistent storage of information about test results, availability statistics, monitoring profiles and aggregated topology information. The central services need to interact with the local monitoring infrastructures operated by the NGIs. The central monitoring services are critical and need to deliver high availability.
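As a rough illustration of the monthly availability and reliability statistics mentioned above, the following Python sketch computes both figures from the hours a service spent in each state. This page does not define the formulas; the sketch assumes one common convention, in which availability ignores time spent in an unknown state and reliability additionally ignores scheduled downtime. The `MonthlyStatus` record and its field names are hypothetical and are not taken from any EGI tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonthlyStatus:
    """Hours a service spent in each state during one month (hypothetical layout)."""
    up: float
    down: float                # unscheduled downtime
    scheduled_downtime: float
    unknown: float             # no test result available


def availability(s: MonthlyStatus) -> Optional[float]:
    """Fraction of known time the service was up (assumed convention)."""
    known = s.up + s.down + s.scheduled_downtime
    return s.up / known if known > 0 else None


def reliability(s: MonthlyStatus) -> Optional[float]:
    """Fraction of known, unscheduled time the service was up (assumed convention)."""
    expected = s.up + s.down
    return s.up / expected if expected > 0 else None


# A 31-day month: 744 hours in total, 24 of them without test results.
month = MonthlyStatus(up=648, down=36, scheduled_downtime=36, unknown=24)
month_availability = availability(month)
month_reliability = reliability(month)
```

Under these assumptions, scheduled maintenance lowers availability but not reliability, which is why the two figures are usually reported side by side.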
The EGI Accounting Infrastructure is distributed. At a central level it includes the repositories for the persistent storage of usage records, and a portal for the visualisation of accounting information. The central databases are populated through individual usage records published by the Resource Centres, or through the publication of summarised usage records. The Accounting Infrastructure is essential in a service-oriented business model to record usage information. Accounting data needs to be validated and regularly published centrally.
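The difference between publishing individual usage records and publishing summarised records can be sketched as follows. This is a minimal Python illustration only: the `UsageRecord` fields and the grouping key (Resource Centre, VO, month) are assumptions made for the example, and real usage records carry many more attributes.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class UsageRecord(NamedTuple):
    """One job's usage as published by a Resource Centre (hypothetical fields)."""
    site: str
    vo: str
    month: str        # "YYYY-MM"
    cpu_hours: float


def summarise(records: Iterable[UsageRecord]) -> Dict[Tuple[str, str, str], dict]:
    """Aggregate individual records into one summary per (site, VO, month)."""
    totals = defaultdict(lambda: {"jobs": 0, "cpu_hours": 0.0})
    for r in records:
        key = (r.site, r.vo, r.month)
        totals[key]["jobs"] += 1
        totals[key]["cpu_hours"] += r.cpu_hours
    return dict(totals)


records = [
    UsageRecord("SITE-A", "dteam", "2011-03", 2.0),
    UsageRecord("SITE-A", "dteam", "2011-03", 3.5),
    UsageRecord("SITE-A", "ops", "2011-03", 0.5),
]
summary = summarise(records)
```

A summary of this kind is far smaller than the underlying job records, at the cost of losing per-job detail in the central repository.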
A Security Infrastructure is needed to monitor the status of the individual Resource Centres in case of security vulnerabilities. The monitoring infrastructure – currently based on Nagios and Pakiti – is dedicated. A central security dashboard is also needed to allow sites, NGIs and EGI Computer Security Incident Response Teams to access security alerts in a controlled manner. In addition, a ticketing system is needed to support security incident coordination.
EGI relies on a central database (GOCDB) to record static information about different entities such as the Operations Centres, the Resource Centres, and the service instances. It also provides contact, role and status information. GOCDB is a source of information for many other operational tools, such as the broadcast tool, the Aggregated Topology Provider, etc.
EGI.eu provides a central portal for the operations community that offers a bundle of different capabilities, such as the broadcast tool, VO management facilities, and a dashboard for grid operators that is used to display information about failing monitoring probes and to open tickets to the Resource Centres affected. The dashboard also supports the central grid oversight activities. It is fully interfaced with the EGI Helpdesk and the monitoring system through message passing. It is a critical component as it is used by all EGI Operations Centres to provide support to the respective Resource Centres.
EGI provides support to users and operators through a distributed helpdesk with central coordination (GGUS). The central helpdesk provides a single interface for support. The central system is interfaced to a variety of other ticketing systems at the NGI level in order to allow a bi-directional exchange of tickets (for example, those opened locally can be passed to the central instance or other areas, while user and operational problem tickets can be opened centrally and subsequently routed to the NGI local support infrastructures).
This central portal provides easy access to the key performance indicators of the production infrastructure for various stakeholders in the ecosystem such as the NGIs, VRCs, VOs and EGI.eu.
Auxiliary core services are needed for the good running of Infrastructure Services. Examples of such services are the VOMS service and VO membership management for infrastructural VOs (DTEAM, OPS), the provisioning of middleware services needed by the monitoring infrastructure (e.g. top-BDII and WMS), the catch-all CA and other catch-all core services to support small user communities (central catalogues, workflow schedulers, authentication services).
The technical instantiation of a user community within the infrastructure is a VO. Members of Virtual Research Communities are provided with various technical services to collect availability, accounting and monitoring information about their VOs. The VO Services group within EGI.eu currently provides a basic, Nagios-based, VO-specific testing and monitoring system for VRCs and is extending this service with additional components and capabilities as the communities’ needs evolve. The team also evaluates other VO services, producing white papers and manuals for VRCs who wish to operate such services themselves.
Software Acceptance Criteria
Based on the prioritised requirements obtained from the operations and end-user communities, software acceptance criteria are defined to capture the key functional and non-functional features expected from the delivered technologies.
Before new technology releases to EGI are made available for staged rollout, they are assessed to ensure that they meet the original requirements. This verification takes place by deploying and assessing the software against the publicly published criteria.
The software repository provides the coordination needed by EGI for the release of software, the UMD, into production. Technology providers can contribute their software components to the repository, which manages the workflow as the software components are validated to ensure they meet the defined quality criteria and are then placed into staged rollout.
The EGI Applications Database stores tailor-made computing tools for scientists to use. It embraces all scientific fields, from resources to simulate exotic excitation modes in physics, to applications for complex protein sequence analysis. Storing pre-made applications and tools means that scientists do not have to spend research time developing their own software. The goal for AppDB is twofold: 1) to inspire scientists less familiar with programming to use EGI and its resources due to the immediate availability of the software that they need to use; 2) to avoid duplication of effort across the user community.
The training services are aimed at supporting cooperation between trainers and users in different localities by connecting the groups through the activities that are established within the NGIs and scientific clusters. The goal is to enable users to achieve better scientific performance when using EGI and guide the establishment of self-sustainable user communities. The services provided include a training events list, which allows trainers to advertise their training events and to be made aware of other training events being run within the community, and a repository of training materials.
NGI International Tasks
The NGI International Tasks are the responsibility of the individual NGI, which must deliver each task to a satisfactory level, funded through the NGI’s own budget with, currently, a contribution from the EC through the EGI-InSPIRE project. EGI.eu staff coordinate the staff undertaking the NGI International Tasks but have no managerial control over them.
While new requirements are gathered centrally, the collection of new requirements starts in the NGIs and EIROs. They have the contacts with the users and operations staff that are using or operating the EGI resources on a daily basis and can identify issues that need to be resolved.
The application database provides a mechanism for users to discover which applications are in use, or are being ported to use the production infrastructure. NGI staff has a vital role to play in adding new entries and keeping entries up to date as they work with their respective user communities.
Many NGIs are able to provide generic or specific training courses to help user communities use EGI resources. The training services (calendar, register of trainers and digital library) provide a means of enabling the coordination that NGIs need to do locally in collaboration with other NGIs to support particular user communities.
The staff within NGIs represent an excellent source of local expertise for new users or new sites wishing to make use of e-Infrastructure. This expertise can be disseminated through training, but more frequently requires in-depth, one-on-one work with particular applications or user groups.
Local policy development activities are integrated with those taking place within the EGI.eu Policy Development Team that supports the development of policies and procedures at a European level. It is the local partner who implements policies and procedures locally. The NGIs’ responsibilities therefore include implementing EGI policies and procedures, developing EGI policies and procedures by participation in EGI policy groups, communicating with national governments and national research councils about policy priorities for the DCIs, establishing agreements with Resource Centres, and drafting national policies and procedures that are in alignment with EGI ones.
NGIs are responsible for coordinating internal operational activities and for participating in the OMB for coordination at the EGI level.
NGIs contribute to software vulnerability assessment and to internal Computer Security Incident Response activities.
NGIs promote their work and that of EGI to their local national audiences. Therefore, while the external liaison functions at a European level are coordinated by EGI.eu, NGIs are focused on dissemination and liaison at the regional and national level. NGIs also provide EGI representation at local and regional events. NGIs active on the international front are considered to represent themselves, but are of course free to propose coordination of any international activities with EGI.eu. NGIs report news stories and interesting user community events in their local area to the central EGI.eu team for further dissemination. They also get involved by providing people to be at these events. In addition, some of the NGI dissemination activities include publicising local success stories in suitable media, creating materials for various audiences (from politicians to scientists), writing up success stories, pointing potential users in the right direction, etc.
Infrastructure Services operated at the NGI–level are needed to integrate and complement the Global Tasks operated by EGI.eu.
While EGI.eu is responsible for the coordination and supervision of the process, individual Resource Centres are requested to participate as early adopters in staged rollout for proper verification of newly deployed software releases in the production infrastructure.
The EGI Monitoring Infrastructure is distributed. The NGI Monitoring Infrastructure is responsible for running periodic functionality checks. Results are stored and displayed locally through NGI portals, and are collected centrally at an EGI level to provide an overall view of the EGI Resource Infrastructure status.
Usage records are collected by each Resource Centre. Depending on the customisable set-up chosen by the NGI, the data gathered can be directly published in the central databases, or alternatively can be persistently stored at an NGI level and summarised for publication at an EGI level. NGIs are responsible for validating the data gathered and for supervising the record publication process to make sure that records are regularly collected centrally.
Configuration repository and operations portal
Prototypes of the central configuration repository (GOCDB) and of the Operations Dashboard have been recently released for NGI deployment. These NGI tools are designed to allow for a greater level of customisation at an NGI level. The deployment of such tools is currently optional.
An NGI support system fully integrated with the central instance – GGUS – is often required to support local users and Resource Centre administrators. This is typically required by medium and large NGIs. For small-scale NGIs operating a limited number of Resource Centres, the local support system can simply be implemented centrally through a dedicated support unit.
Core middleware services for user information discovery, authentication, workflow management, file cataloguing etc., are often provided by NGIs to support users and the local Infrastructure Services. The actual set of services operated can vary, and depends on the scale of the NGI and on the number of VOs supported.