EGI-Engage:WP4 (JRA2) Platforms for the Data Commons
WP leader: Matthew Viljoen/EGI.eu
WP contact: email@example.com
This activity advances the technical infrastructure of EGI by expanding the capabilities of the current platforms and by integrating new ones. The result will be an integrated solution of data and compute services that contributes to the Open Commons solution.
The activity will further evolve the EGI Federated Cloud infrastructure platform to give the integrating services and users greater flexibility and elasticity in the overall use of the platform, and will ensure continuity in the support for Cloud Middleware Frameworks. It will also introduce an Open Data Access platform that provides capabilities to publish, use and reuse openly accessible data (including, but not limited to, scientific data sets released into the public domain, publicly funded research papers and project deliverables, and software artefacts and demonstrators coming out of public research projects).
To support a broad range of use cases and data commons needs, this work package will also integrate a number of partner e-Infrastructures located both in Europe and worldwide. This will include integrating existing cloud infrastructures with the EGI Federated Cloud platform (e.g. the Canadian CANFAR infrastructure) and accelerated computing facilities (e.g. GPGPUs, general-purpose graphics processing units).
The work package objectives are:
- Expand the EGI federated Cloud platform with new IaaS capabilities;
- Prototype an open data platform;
- Provide a new accelerated computing platform;
- Integrate existing commercial and public IaaS Cloud deployments and e-Infrastructures with the current EGI production infrastructure.
To contact all task leaders (see below), send mail to
||Federated Open Data
||egi-engage-wp4.1 at mailman.egi.eu|
||Álvaro López García/IFCA - CSIC
||egi-engage-wp4.2 at mailman.egi.eu|
||egi-engage-wp4.3 at mailman.egi.eu|
||egi-engage-wp4.4 at mailman.egi.eu|
TASK JRA2.1 Federated Open Data
(Lead: CYFRONET, M1 – M30) Contact: egi-engage-wp4.1 at mailman.egi.eu
Estimated task effort: 45PM
This task will focus on designing and prototyping an Open Data platform as a solution to integrate various data repositories available in EGI, offer the capability to make data open, and link them to the OpenAIRE open access infrastructure and other key open data catalogues following the available guidelines.
Analysis of open data use cases and requirements
Analyse and support test use cases of open data from different data providers, including fishery and marine sciences, agriculture (Agro-Know) and biodiversity datasets. This will be expanded with additional communities according to the requirements collected from the competence centres (SA2). In the initial phase, this task will analyse existing solutions involved in data storage virtualisation and open data standards.
Design and develop the Open Data platform prototype
This activity will implement a prototype of the Open Data platform that organises the flow of open data between EGI infrastructures and the “outside” world. The solution will be decentralised and will reduce the barriers and effort required to publish or process open data. The Open Data platform will focus on integration with OpenAIRE and on optimisation of data access. The design will consider the possibility of integrating current EGI storage services into the platform backend.
Open Data platform demonstrator
The prototype of the Open Data platform will be demonstrated on resources provided by the NGIs that volunteered to offer capacity for initial testing and feedback. These providers are: CYFRONET (NGI-PL), IFCA (NGI-ES), CESNET (NGI-CZ). The list might grow during the execution of the project.
Go to the Community Requirements
TASK JRA2.2 Federated Cloud
(Lead: CSIC, M1 – M30) Contact: egi-engage-wp4.2 at mailman.egi.eu
Estimated task effort: 63PM
The objectives of this task are:
- Evolve the federated IaaS Cloud platform with functionalities required by the CCs;
- Extend ‘open standards’-based interfaces exposing new capabilities;
- Maintain interface support for future versions of popular Cloud Management Frameworks (CMFs).
This task is divided into the following activities:
The EGI Applications Database (AppDB) will evolve from its current role as catalogue of applications and virtual machines (VM) to include a graphical user interface allowing authorised users to perform basic VM management operations.
This activity is related to the following Federated Cloud Scenarios:
Extending VM management standards support
This activity will design and define two new OCCI interface extensions supporting new capabilities requested by the EGI-Engage CCs:
- Support for users creating snapshots of running VM instances and making these available as first-class VM images that can be instantiated;
- Support for changing the resources attached to a running VM instance.
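As an illustration only, the sketch below shows how a client might render requests for the two capabilities in the style of the OCCI HTTP text rendering, where an action is triggered by a POST with a `Category` header. The action names `snapshot` and `resize`, the scheme URI, the `example.snapshot.name` attribute and the `build_action_request` helper are all hypothetical; the actual extensions are defined by the specifications produced in this activity, not by this sketch.

```python
from dataclasses import dataclass, field

# Hypothetical scheme URI used only for illustration; it is not taken
# from any OCCI specification.
EXAMPLE_ACTION_SCHEME = "http://example.org/occi/compute/action#"


@dataclass
class ActionRequest:
    """An HTTP request that would trigger an action on a compute resource."""
    method: str
    path: str
    headers: dict = field(default_factory=dict)


def render_attribute(key, value):
    """Render one attribute: strings are quoted, numbers are left bare."""
    if isinstance(value, str):
        return f'{key}="{value}"'
    return f"{key}={value}"


def build_action_request(vm_id, action, attributes=None):
    """Build an OCCI-style request for `action` on the VM `vm_id`."""
    headers = {
        "Content-Type": "text/occi",
        "Category": f'{action}; scheme="{EXAMPLE_ACTION_SCHEME}"; class="action"',
    }
    if attributes:
        # Sort attribute names so the rendered header is deterministic.
        headers["X-OCCI-Attribute"] = ", ".join(
            render_attribute(key, value)
            for key, value in sorted(attributes.items())
        )
    return ActionRequest("POST", f"/compute/{vm_id}?action={action}", headers)


# Snapshot a running VM (attribute name is an assumption).
snapshot = build_action_request(
    "vm-42", "snapshot", {"example.snapshot.name": "nightly"}
)

# Change the resources attached to a running VM.
resize = build_action_request(
    "vm-42", "resize", {"occi.compute.cores": 4, "occi.compute.memory": 8.0}
)
```

The sketch only builds the request description and does not contact any server, so it can be adapted to whichever HTTP client and endpoint a resource provider exposes.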
Relocating VM instances between providers
This activity will design the workflow and interactions that are necessary to support relocating suspended VM instances from one federated cloud resource provider to another.
Integration support for CMFs
Participants in this activity will contribute the necessary backend (i.e. CMF) implementations of the capabilities that are developed in this task, as well as on-going maintenance support for new and existing features for future releases of OpenNebula and OpenStack deployed in the EGI Federated Cloud infrastructure.
Activities and Federated Cloud scenarios
Each of these activities is related to one or more Federated Cloud Scenarios, as shown in the following table:
||Main Fedcloud scenario
|Extending VM management standards support
|Relocating VM instances between providers
|Integration support for CMFs
TASK JRA2.3 e-Infrastructures Integration
(Lead: INFN, M1 – M30)
Estimated task effort: 35PM
This task will foster the expansion of the EGI capacity and capabilities by integrating its technical solutions with those offered by other e-Infrastructures. The task will gather requirements from the user communities involved in WP6, coordinate the implementation of pilots, and liaise with the external partners.
EGI-EUDAT Harmonisation for Virtual Research Environments (BBMRI, EISCAT_3D) (M3 – M27)
The activity will be responsible for the collaboration with the service providers of the EC project EUDAT towards a harmonisation of the two infrastructures, including technical interoperability, authentication, authorisation and identity management, policy and operations. Effort for the implementation of pilots and for the adaptation of the user Virtual Research Environments will be provided by the Competence Centres, in particular BBMRI and EISCAT_3D. Following this, a joint call for additional cross-infrastructure case studies will be launched at PM15.
Canadian Advanced Network for Astronomical Research (Lead: INAF, INFN JRU) (M6 – M30)
The Canadian Advanced Network for Astronomical Research (CANFAR) is a computing infrastructure for astronomers in Canada. International collaboration in the Astronomy discipline will be supported both by the Canadian Astronomy Data Centre (CADC) and EGI. CANFAR and EGI through INAF, the Italian National Institute for Astrophysics, will work together to integrate both e-Infrastructures towards a seamless and uniform platform for international astronomy research collaboration. Via INAF, community services will be provided on top of the federated cloud of EGI using open source solutions and re-using the CANFAR experience.
Integration for gCube and the D4Science infrastructure (M1 - M12)
This activity will integrate D4Science resources at Engineering (commercial) and CNR (public) into the EGI Federated Cloud infrastructure. The gCube framework will be extended to use EGI Federated Cloud resources by implementing OCCI client capabilities.
Other e-Infrastructure integration activities:
- Cloud infrastructure integration in Brazil
- NeCTAR, Australia (OpenStack)
- Hungarian cloud (OpenStack)
- ELIXIR cloud infrastructure including the Norwegian cloud federation
TASK JRA2.4 Accelerated Computing
(Lead: INFN, M1 – M15) Contact: egi-engage-wp4.4 at mailman.egi.eu
Estimated task effort: 13PM
Accelerated computing systems deliver energy-efficient and powerful HPC capabilities. Many EGI sites provide accelerated computing technologies, such as GPGPUs or MIC co-processors, to enable high-performance processing. Currently these accelerated capabilities are not directly supported by the EGI platforms. To use the co-processor capabilities available at resource centre level, users must interact directly with the local provider to find out which types of resources and software libraries are available and which submission queues must be used for accelerated computing tasks.
This task will implement support in the information system for exposing correct information about the accelerated computing technologies, both software and hardware, available at site level. It will develop a common extension of the information system structure, based on the OGF GLUE standard, so that all sites publish these capabilities uniformly. Users will then be able to extract all the information directly from the information system without interacting with the sites, and to use resources provided by multiple sites easily. Where needed, the task will also extend the HTC and cloud middleware support for co-processors, in order to provide a transparent and uniform way to allocate these resources, together with CPU cores, efficiently to the users.
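A minimal sketch of what uniform publication makes possible, assuming each site publishes one record per accelerator type: a user can select suitable sites from the published records alone. The `AcceleratorRecord` fields, the site names and the `find_sites` helper are hypothetical and do not reproduce the GLUE schema or its extension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorRecord:
    """One accelerator entry as a site might publish it (hypothetical fields)."""
    site: str
    kind: str          # e.g. "GPGPU" or "MIC"
    model: str
    count: int         # number of accelerator devices offered
    libraries: tuple   # software libraries installed for this accelerator
    queue: str         # submission queue to use for accelerated tasks


def find_sites(records, kind, library):
    """Return records matching an accelerator kind and library, largest first."""
    matches = [
        r for r in records
        if r.kind == kind and library in r.libraries and r.count > 0
    ]
    # Order by device count (descending), then by site name for stability.
    return sorted(matches, key=lambda r: (-r.count, r.site))


# Example records with made-up site names and models.
records = [
    AcceleratorRecord("SITE-A", "GPGPU", "example-gpu", 8, ("CUDA",), "gpu"),
    AcceleratorRecord("SITE-B", "GPGPU", "example-gpu", 16, ("CUDA", "OpenCL"), "gpu-long"),
    AcceleratorRecord("SITE-C", "MIC", "example-mic", 4, ("OpenMP",), "mic"),
]

for record in find_sites(records, "GPGPU", "CUDA"):
    print(record.site, record.count, record.queue)
```

Because every site exposes the same fields, the query needs no site-specific knowledge: the hardware type, the installed libraries and the queue name all come from the published record.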
The following gives an overview of deliverables. Schedule
||CANFAR integration roadmap (R)|
||VM snapshot support: OCCI extension, final specification (OTHER)|
||Resource template changes: OCCI extension, final specification (OTHER)|
||Relocating VM instances between providers, final specification (OTHER)|
||Deployment of a gCube release with Federated Cloud support (OTHER)|
||e-Infrastructures integration report (R)|
||Open Data Platform First Prototype (DEM)|
||Cross-infrastructure case studies report (R)|
||Open Data Platform: Demonstrator, Experience Report and Use Cases (Report + Prototype)|
||Final CANFAR integration release (DEM)|
The following gives an overview of milestones. Schedule
||Open Data Platform: requirements and implementation plans|
||Launch of call for cross e-Infrastructure case studies|
|M4.3||Open Data Platform in the Production|
||Description||Type||How measured||Target PM12||Target PM24||Target PM30|
|KPI.1.JRA2.OpenData||Number of open research datasets that can be published, discovered, used and reused by EGI applications/tools||Cumulative||Number of open datasets published in the EGI Application DB and/or Market Place plus number of open data archives used by applications/tools run in EGI (the latter requires a survey to VOs and VRCs)||0
||Value 1st Intermediate Report|
|M.JRA2.Cloud.1||Number of VM instances managed through AppDB GUI||Average
Comment: AppDB functionality still in design phase, due in PM9.
|M.JRA2.Cloud.2||Percentage of cloud providers providing snapshot support||Per period||4.2||
Comment: Drafting of the extension has not started yet.
|M.JRA2.Cloud.3||Percentage of cloud providers providing VM resizing support||Per period||4.2||
Comment: Comments were sent during the OCCI 1.2 comment phase; not yet included in the standard.
|M.JRA2.Cloud.4||Number of OCCI implementations supporting OCCI 1.2||Per period||4.2||
Comment: The OCCI 1.2 public comment phase ended in July 2015 and the standard has still not been released.
|M.JRA2.Cloud.5||Number of new OCCI implementations for existing or new CMFs.||Per period||4.2||
Comment: New ooi implementation for OpenStack is available.
|M.JRA2.Integration.1||Number of European cloud providers in the federated Astronomy community cloud||Cumulative||4.3
Comment: Activity due to start at PM07
|M.JRA2.Integration.2||Number of virtual appliances shared||Cumulative||4.3||
Comment: Source is AppDB.
|M.JRA2.Integration.3||Number of different datasets replicated across CADC and EGI||Cumulative||4.3||
Comment: Activity due to start at PM07
|M.JRA2.Integration.4||Number of EUDAT services integrated with the HTC and Cloud platforms of EGI||Cumulative||4.3||
Comment: B2STAGE. EGI Cookbook: https://wiki.egi.eu/wiki/EUDAT_B2STAGE_cookbook_for_EGI_VOs (should be updated and checked with real user cases)
|M.JRA2.Integration.5||Number of open research datasets replicated in the federated cloud for scalable access by iMARINE VREs||Cumulative||4.3||0|
|M.JRA2.Integration.6||Number of research clouds that interoperate with EGI federated cloud: community clouds, integrated, peer||Cumulative||4.3||
Comment: integration of iMARINE, CANFAR and NECTAR in planning/discussion stage
|M.JRA2.AcceleratedComputing.1||Number of batch systems for which GPGPU integration can be supported through CREAM||Cumulative||4.4||
Comment: GPGPU support enabled on CREAM-CE for Torque batch system
|M.JRA2.AcceleratedComputing.2||Number of Cloud Middleware Frameworks for which GPGPU integration is supported and implemented||Cumulative||4.4||
Comment: GPGPU support has been implemented on OpenStack/Kilo
|M.JRA2.AcceleratedComputing.3||Number of level 3 disciplines with user applications that can use federated accelerated computing||Cumulative||4.4||
Comment: Molecular Dynamics (Moldyngrid VO) and Structural Biology (enmr.eu VO, MoBrain Competence Centre)