EGI-Engage:WP4 (JRA2) Platforms for the Data Commons
WP leader: Tiziana Ferrari/EGI.eu (interim)
This activity advances the current technical infrastructure of EGI by expanding the capabilities of the current platforms and by integrating new ones. The result of the activity will be an integrated solution of data and compute services that will contribute to the Open Commons solution.
It will do so by further evolving the EGI Federated Cloud infrastructure platform to give the integrating services and users greater flexibility and elasticity in the overall use of the platform, and by ensuring continuity in the support for Cloud Middleware Frameworks. It will also introduce an Open Data Access platform that will provide capabilities to publish, use and reuse openly accessible data (including, but not limited to, scientific data sets released into the public domain, publicly funded research papers and project deliverables, and software artefacts and demonstrators coming out of public research projects).
To ensure support for a broad range of use cases and data commons needs, this work package will also integrate a number of partner e-Infrastructures located both in Europe and worldwide. This will include integrating existing cloud infrastructures with the EGI Federated Cloud platform (e.g. the Canadian CANFAR infrastructure) and accelerated computing facilities (e.g. GPGPUs, i.e. general-purpose computation on graphics processing units).
The work package objectives are:
- Expand the EGI federated Cloud platform with new IaaS capabilities;
- Prototype an open data platform;
- Provide a new accelerated computing platform;
- Integrate existing commercial and public IaaS Cloud deployments and e-Infrastructures with the current EGI production infrastructure.
To contact all task leaders (see below), send mail to
TASK JRA2.1 Federated Open Data
(Lead: CYFRONET, M1 – M30)
Estimated task effort: 45PM
This task will focus on designing and prototyping an Open Data platform as a solution to integrate various data repositories available in EGI, offer the capability to make data open, and link them to the OpenAIRE open access infrastructure and other key open data catalogues following the available guidelines.
Analysis of open data use cases and requirements
Analyse and support test use cases of open data from different data providers, including fishery and marine sciences, agriculture (Agri-Know) and biodiversity datasets. This will be expanded with additional communities according to the requirements collected from the competence centres (SA2). In the initial phase, this task will analyse existing solutions involved in data storage virtualisation and open data standards.
Design and develop the Open Data platform prototype
This activity will implement a prototype of the Open Data platform that organises the flow of open data between EGI infrastructures and the “outside” world. The solution will be decentralised and will reduce the barriers and effort required to publish or process open data. The Open Data platform will focus on integration with OpenAIRE and on optimisation of data access. The design will consider the possibility of integrating current EGI storage services into the platform backend.
Open Data platform demonstrator
The prototype of the Open Data platform will be demonstrated on resources provided by the NGIs that volunteered to offer capacity for initial testing and feedback. These providers are: CYFRONET (NGI-PL), IFCA (NGI-ES) and CESNET (NGI-CZ). The list might grow during the execution of the project.
TASK JRA2.2 Federated Cloud
(Lead: CSIC, M1 – M30)
Estimated task effort: 63PM
The objectives of this task are:
- Evolve the federated IaaS Cloud platform with functionalities required by the CCs;
- Extend ‘open standards’-based interfaces exposing new capabilities;
- Maintain interface support for future versions of popular Cloud Management Frameworks (CMFs).
The EGI Applications Database (AppDB) will evolve from its current role as a catalogue of applications and virtual machines (VMs) to include a graphical user interface allowing authorised users to perform basic VM management operations.
Extending VM management standards support
This activity will design and define two new OCCI interface extensions supporting new capabilities requested by EGI-Engage’s CCs:
- Support for users creating snapshots of running VM instances and making these available as first-class VM images that can be instantiated;
- Support for changing the resources attached to an executing VM instance.
Relocating VM instances between providers
This activity will design the workflow and interactions that are necessary to support relocating suspended VM instances from one federated cloud resource provider to another.
Integration support for CMFs
Participants in this activity will contribute the necessary backend (i.e. CMF) implementations of the capabilities that are developed in this task, as well as on-going maintenance support for new and existing features for future releases of OpenNebula and OpenStack deployed in the EGI Federated Cloud infrastructure.
TASK JRA2.3 e-Infrastructures Integration
(Lead: INFN, M1 – M30)
Estimated task effort: 35PM
This task will foster the expansion of the EGI capacity and capabilities by integrating its technical solutions with those offered by other e-Infrastructures. The task will gather requirements from the user communities involved in WP6, coordinate the implementation of pilots and liaise with the external partners.
EGI-EUDAT Harmonisation for Virtual Research Environments (BBMRI, EISCAT_3D) (M3 – M27)
The activity will be responsible for the collaboration with the service providers of the EC project EUDAT towards a harmonisation of the two infrastructures, including technical interoperability, authentication, authorisation and identity management, policy and operations. Effort for the implementation of pilots and for the adaptation of the user Virtual Research Environments will be provided by the Competence Centres, in particular BBMRI and EISCAT_3D. Following this, a joint call for additional cross-infrastructure case studies will be launched at PM15.
Canadian Advanced Network for Astronomical Research (Lead: INFN) (M6 – M30)
The Canadian Advanced Network for Astronomical Research (CANFAR) is a computing infrastructure for astronomers in Canada. International collaboration in the Astronomy discipline will be supported both by the Canadian Astronomy Data Centre (CADC) and EGI. CANFAR and EGI will work together to integrate both e-Infrastructures towards a seamless and uniform platform for international astronomy research collaboration. Community services will be provided on top of the federated cloud of EGI using open source solutions and re-using the CANFAR experience.
Integration for gCube and the D4Science infrastructure (M1 – M12)
This activity will integrate D4Science resources at Engineering (commercial) and CNR (public) into the EGI Federated Cloud infrastructure. The gCube framework will be extended to use EGI Federated Cloud resources by implementing OCCI client capabilities.
TASK JRA2.4 Accelerated Computing
(Lead: IPB, M1 – M15)
Estimated task effort: 13PM
Accelerated computing systems deliver energy-efficient and powerful HPC capabilities. Many EGI sites provide accelerated computing technologies, such as GPGPUs or MIC co-processors, to enable high-performance processing. Currently these accelerated capabilities are not directly supported by the EGI platforms. To use the co-processor capabilities available at resource centre level, users must interact directly with the local provider to find out which types of resources and software libraries are available and which submission queues must be used for accelerated computing tasks.
This task will implement support in the information system for exposing correct information about the accelerated computing technologies, both software and hardware, available at site level. It will develop a common extension of the information system structure, based on the OGF GLUE standard, so that all sites publish these capabilities uniformly. Users will then be able to extract all the information directly from the information system without interacting with the sites, and to use resources provided by multiple sites easily. Where needed, the task will also extend the HTC and cloud middleware support for co-processors, in order to provide a transparent and uniform way to allocate these resources efficiently to users together with CPU cores.
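As a rough illustration of what uniform publication could enable, the sketch below filters a set of per-site records by accelerator type and software library instead of contacting each site. It is not the task's design: the `AcceleratorRecord` class, its field names, the `find_sites` helper and the sample data are all hypothetical and do not correspond to actual GLUE attributes or to any EGI information system API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AcceleratorRecord:
    """Hypothetical per-site record; field names are illustrative, not GLUE attributes."""
    site: str
    accelerator_type: str  # e.g. "GPGPU" or "MIC"
    device_count: int
    libraries: List[str] = field(default_factory=list)
    queue: str = ""


def find_sites(records: List[AcceleratorRecord],
               accelerator_type: str,
               required_library: Optional[str] = None) -> List[AcceleratorRecord]:
    """Return records offering the requested accelerator type and, optionally, a given library."""
    matches = []
    for rec in records:
        if rec.accelerator_type != accelerator_type:
            continue
        if required_library is not None and required_library not in rec.libraries:
            continue
        matches.append(rec)
    return matches


# Made-up sample data standing in for what sites would publish.
RECORDS = [
    AcceleratorRecord("site-a", "GPGPU", 8, ["CUDA"], "gpu"),
    AcceleratorRecord("site-b", "MIC", 4, ["OpenMP"], "mic"),
    AcceleratorRecord("site-c", "GPGPU", 2, ["OpenCL"], "gpu-short"),
]

if __name__ == "__main__":
    # List every site that publishes GPGPUs with the CUDA library available.
    for rec in find_sites(RECORDS, "GPGPU", required_library="CUDA"):
        print(rec.site, rec.queue, rec.device_count)
```

Because every record shares one structure, a single query covers all sites, which is the benefit the common extension is meant to deliver.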
The following table gives an overview of the scheduled deliverables.
|Code||Title||Delivery PM||Delivery CM||Delivered date||Status|
||CANFAR integration roadmap (R)
||VM snapshot support: OCCI extension, final specification (OTHER)
||Resource template changes: OCCI extension, final specification (OTHER)
||Relocating VM instances between providers, final specification (OTHER)
||Deployment of a gCube release with Federated Cloud support (OTHER)
||e-Infrastructures integration report (R)
||Open Data Platform First Prototype (DEM)
||Cross-infrastructure case studies report (R)
||Open Data Platform: Demonstrator, Experience Report and Use Cases (Report + Prototype)
||Final CANFAR integration release (DEM)
The following table gives an overview of the scheduled milestones.
|Milestone||Title||Lead-Task||Delivery PM||Delivery CM||Delivered||Status|
||Open Data Platform: requirements and implementation plans
||Launch of call for cross e-Infrastructure case studies
|M4.3||Open Data Platform in the Production||M30|