The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other supports; new updates will be ignored and lost.
If needed you can get in touch with EGI SDIS team using operations @ egi.eu.

EGI-Engage:WP4 (JRA2) Platforms for the Data Commons




WP leader: Tiziana Ferrari/EGI.eu (interim)

WP contact: egi-engage-wp4@mailman.egi.eu


Objective

This activity advances the current technical infrastructure of EGI by expanding the capabilities of the current platforms and by integrating new ones. The result will be an integrated solution of data and compute services that contributes to the Open Commons solution. To this end, the activity will further evolve the EGI Federated Cloud infrastructure platform to give the integrating services and users greater flexibility and elasticity in the overall use of the platform, and will ensure continuity in the support for Cloud Middleware Frameworks. It will also introduce an Open Data Access platform that will provide capabilities to publish, use and reuse openly accessible data (including, but not limited to, scientific data sets released into the public domain, publicly funded research papers and project deliverables, and software artefacts and demonstrators coming out of public research projects). To support a broad range of use cases and data commons needs, the work package will also integrate a number of partner e-Infrastructures located both in Europe and worldwide. This will include integrating existing cloud infrastructures with the EGI Federated Cloud platform (e.g. the Canadian CANFAR infrastructure) and accelerated computing facilities (e.g. GPGPUs, general-purpose computing on graphics processing units). The work package objectives are:

  1. Expand the EGI Federated Cloud platform with new IaaS capabilities;
  2. Prototype an open data platform;
  3. Provide a new accelerated computing platform;
  4. Integrate existing commercial and public IaaS Cloud deployments and e-Infrastructures with the current EGI production infrastructure.

Task Leaders

To contact all task leaders (see below), send mail to 

| Task | Name | Task Leader | Deputy | Email |
|---|---|---|---|---|
| JRA2.1 | Federated Open Data | Lukas Dutka/CYFRONET | | egi-engage-wp4.1 at mailman.egi.eu |
| JRA2.2 | Federated Cloud | Álvaro López García/IFCA - CSIC | | egi-engage-wp4.2 at mailman.egi.eu |
| JRA2.3 | e-Infrastructures Integration | Enol Fernandez/EGI.eu | | egi-engage-wp4.3 at mailman.egi.eu |
| JRA2.4 | Accelerated Computing | Marco Verlato/INFN | | egi-engage-wp4.4 at mailman.egi.eu |

Involved partners

  • EGI.eu   
  • CESNET   
  • CSIC   
  • GRNET   
  • INFN   
  • CIRMMP
  • Engineering   
  • CYFRONET   
  • IISAS   
  • Agro-Know

Tasks

TASK JRA2.1 Federated Open Data

(Lead: CYFRONET, M1 – M30) Contact: egi-engage-wp4.1 at mailman.egi.eu

Estimated task effort: 45PM
This task will focus on designing and prototyping an Open Data platform as a solution to integrate various data repositories available in EGI, offer the capability to make data open, and link them to the OpenAIRE open access infrastructure and other key open data catalogues following the available guidelines.
Analysis of open data use cases and requirements
Analyse and support test use cases of open data from different data providers, including fishery and marine sciences, agriculture (Agro-Know) and biodiversity datasets. This will be expanded with additional communities according to the requirements collected from the competence centres (SA2). In the initial phase, this task will analyse existing solutions involved in data storage virtualisation and open data standards.
Design and develop the Open Data platform prototype
This activity will implement a prototype of the Open Data platform that organises the flow of open data between EGI infrastructures and the “outside” world. The solution will be decentralised and will reduce the barriers and effort required to publish or process open data. The Open Data platform will focus on integration with OpenAIRE and on optimisation of data access. The design will consider the possibility of integrating current EGI storage services into the platform backend.
Open Data platform demonstrator
The prototype of the Open Data platform will be demonstrated on resources provided by the NGIs that volunteered to offer capacity for initial testing and feedback. These providers are: CYFRONET (NGI-PL), IFCA (NGI-ES), CESNET (NGI-CZ). The list might grow during the execution of the project.

TASK JRA2.2 Federated Cloud

(Lead: CSIC, M1 – M30) Contact: egi-engage-wp4.2 at mailman.egi.eu

Estimated task effort: 63PM

The objectives of this task are:

  1. Evolve the federated IaaS Cloud platform with functionalities required by the CCs;
  2. Extend ‘open standards’-based interfaces exposing new capabilities;
  3. Maintain interface support for future versions of popular Cloud Management Frameworks (CMFs).

Task Activities

This task is divided into the following activities:

Extend AppDB

The EGI Applications Database (AppDB) will evolve from its current role as a catalogue of applications and virtual machines (VMs) to include a graphical user interface allowing authorised users to perform basic VM management operations.

This activity is related to the following Federated Cloud Scenarios:

Extending VM management standards support

This activity will design and define two new OCCI interface extensions supporting new capabilities requested by EGI-Engage’s CCs:

  1. Support for users creating snapshots of running VM instances and making these available as first-class VM images that can be instantiated;
  2. Support for changing the resources attached to an executing VM instance.

Relocating VM instances between providers

This activity will design the workflow and interactions that are necessary to support relocating suspended VM instances from one federated cloud resource provider to another.

Integration support for CMFs

Participants in this activity will contribute the necessary backend (i.e. CMF) implementations of the capabilities that are developed in this task, as well as on-going maintenance support for new and existing features for future releases of OpenNebula and OpenStack deployed in the EGI Federated Cloud infrastructure.

Activities and Federated Cloud scenarios

All of these activities are related to one or more Federated Cloud Scenarios, as shown in the following table:

| Activity | Main Fedcloud scenario | Other Scenarios |
|---|---|---|
| Extend AppDB | | |
| Extending VM management standards support | | |
| Relocating VM instances between providers | | |
| Integration support for CMFs | | |


TASK JRA2.3 e-Infrastructures Integration

(Lead: INFN, M1 – M30)

Estimated task effort: 35PM
This task will foster the expansion of the EGI capacity and capabilities by integrating its technical solutions with those offered by other e-Infrastructures. The task will gather requirements from the user communities involved in WP6, coordinate the implementation of pilots and liaise with the external partners.
EGI-EUDAT Harmonisation for Virtual Research Environments (BBMRI, EISCAT_3D) (M3 – M27)
The activity will be responsible for the collaboration with the service providers of the EC project EUDAT towards a harmonisation of the two infrastructures, including technical interoperability, authentication, authorisation and identity management, policy and operations. Effort for the implementation of pilots and for the adaptation of the user Virtual Research Environments will be provided by the Competence Centres, in particular BBMRI and EISCAT_3D. Following this, a joint call for additional cross-infrastructure case studies will be launched at PM15.
Canadian Advanced Network for Astronomical Research (Lead: INFN) (M6 – M30)
The Canadian Advanced Network for Astronomical Research (CANFAR) is a computing infrastructure for astronomers in Canada. International collaboration in the Astronomy discipline will be supported both by the Canadian Astronomy Data Centre (CADC) and EGI. CANFAR and EGI will work together to integrate both e-Infrastructures towards a seamless and uniform platform for international astronomy research collaboration. Community services will be provided on top of the federated cloud of EGI using open source solutions and re-using the CANFAR experience.
Integration for gCube and the D4Science infrastructure (M1 – M12)
This activity will integrate D4Science resources at Engineering (commercial) and CNR (public) into the EGI Federated Cloud infrastructure. The gCube framework will be extended to use EGI Federated Cloud resources by implementing OCCI client capabilities.

Other e-Infrastructure integration activities:

  • Cloud infrastructure integration in Brazil
  • NeCTAR, Australia (OpenStack)
  • Hungarian cloud (OpenStack)
  • ELIXIR cloud infrastructure including the Norwegian cloud federation

TASK JRA2.4 Accelerated Computing

(Lead: INFN, M1 – M15) Contact: egi-engage-wp4.4 at mailman.egi.eu

Estimated task effort: 13PM
Accelerated computing systems deliver energy-efficient and powerful HPC capabilities. Many EGI sites provide accelerated computing technologies, such as GPGPUs or MIC co-processors, to enable high-performance processing. Currently these accelerated capabilities are not directly supported by the EGI platforms. To use the co-processor capabilities available at resource centre level, users must interact directly with the local provider to find out which types of resources and software libraries are available and which submission queues must be used to submit accelerated computing tasks.
This task will implement support in the information system to expose the correct information about the accelerated computing technologies available at site level, both software and hardware. It will develop a common extension of the information system structure, based on the OGF GLUE standard, so that the capabilities are published uniformly by all the sites. Users will then be able to extract all the information directly from the information system without interacting with the sites, and easily use resources provided by multiple sites. The task will also extend the HTC and cloud middleware support for co-processors, where needed, in order to provide a transparent and uniform way to allocate these resources, together with CPU cores, efficiently to the users.
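As a rough illustration of what uniform publication would allow, the sketch below selects sites by accelerator type and software library from a list of published records. Everything in it is an assumption made for illustration: the record fields, the site names and the `find_sites` helper are hypothetical and are not GLUE attributes or part of any EGI tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceleratorRecord:
    """Hypothetical record a site might publish (field names are illustrative)."""
    site: str
    accelerator_type: str   # e.g. "GPGPU" or "MIC"
    count: int              # number of co-processors offered
    libraries: tuple        # software libraries installed for this accelerator
    queue: str              # submission queue to use for accelerated tasks


# Made-up sample data standing in for what the information system would return.
RECORDS = [
    AcceleratorRecord("SITE-A", "GPGPU", 8, ("CUDA", "OpenCL"), "gpu"),
    AcceleratorRecord("SITE-B", "MIC", 4, ("OpenCL",), "mic"),
    AcceleratorRecord("SITE-C", "GPGPU", 2, ("CUDA",), "gpu-short"),
]


def find_sites(records, accelerator_type, library):
    """Return the sorted names of sites offering the given accelerator and library."""
    return sorted(
        {
            r.site
            for r in records
            if r.accelerator_type == accelerator_type and library in r.libraries
        }
    )


if __name__ == "__main__":
    print(find_sites(RECORDS, "GPGPU", "CUDA"))
```

With records published in one common structure, a query like this replaces the per-site enquiries described above; the queue field shows how the submission queue could be discovered in the same way.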

Deliverables

The following table gives an overview of the scheduled deliverables.

| Code | Title | Delivery PM | Delivery CM | Delivered date | Status |
|---|---|---|---|---|---|
| D4.1 | CANFAR integration roadmap (R) | M06 | 08.2015 | | |
| D4.2 | VM snapshot support: OCCI extension, final specification (OTHER) | M09 | 11.2015 | | |
| D4.3 | Resource template changes: OCCI extension, final specification (OTHER) | M09 | 11.2015 | | |
| D4.4 | Relocating VM instances between providers, final specification (OTHER) | M12 | 02.2016 | | |
| D4.5 | Deployment of a gCube release with Federated Cloud support (OTHER) | M12 | 02.2016 | | |
| D4.6 | e-Infrastructures integration report (R) | M12 | 02.2016 | | |
| D4.7 | Open Data Platform First Prototype (DEM) | M20 | 10.2016 | | |
| D4.8 | Cross-infrastructure case studies report (R) | M24 | 02.2017 | | |
| D4.9 | Open Data Platform: Demonstrator, Experience Report and Use Cases (Report + Prototype) | M29 | 07.2017 | | |
| D4.10 | Final CANFAR integration release (DEM) | M30 | 08.2017 | | |
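
The Delivery PM (project month) and Delivery CM (calendar month) values listed above are consistent with project month M01 falling in March 2015. That start month is inferred from the listed pairs (for example M06 and 08.2015), not stated on this page, so treat it as an assumption. A minimal sketch of the conversion, with a hypothetical helper name:

```python
from datetime import date

# Assumption inferred from the deliverable dates: M01 corresponds to March 2015.
PROJECT_START = date(2015, 3, 1)


def project_month_to_calendar(pm: int, start: date = PROJECT_START) -> str:
    """Convert a 1-based project month number to a calendar month 'MM.YYYY'."""
    if pm < 1:
        raise ValueError("project months are 1-based")
    # Count months from January of the start year, then split into year and month.
    offset = (start.month - 1) + (pm - 1)
    year = start.year + offset // 12
    month = offset % 12 + 1
    return f"{month:02d}.{year}"


if __name__ == "__main__":
    for pm in (6, 9, 12, 20, 24, 29, 30):
        print(f"M{pm:02d} -> {project_month_to_calendar(pm)}")
```

The same mapping reproduces the milestone dates, for instance M15 and 05.2016.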

Milestones

The following table gives an overview of the scheduled milestones.

| Milestone | Title | Lead-Task | Delivery PM | Delivery CM | Delivered | Status |
|---|---|---|---|---|---|---|
| M4.1 | Open Data Platform: requirements and implementation plans | | M06 | 08.2015 | | |
| M4.2 | Launch of call for cross e-Infrastructure case studies | | M15 | 05.2016 | | |
| M4.3 | Open Data Platform in the Production | | M30 | 08.2017 | | |

Metrics

KPIs

| Metrics | Description | Type | How measured | Target PM12 | Target PM24 | Target PM30 |
|---|---|---|---|---|---|---|
| KPI.1.JRA2.OpenData | Number of open research datasets that can be published, discovered, used and reused by EGI applications/tools | Cumulative | Number of open datasets published in the EGI Application DB and/or Market Place plus number of open data archives used by applications/tools run in EGI (the latter requires a survey to VOs and VRCs) | 0 | 10 | 20 |
| KPI.2.SA1.Integration | Number of RIs and e-Infrastructures integrated with EGI | Cumulative | Number of RIs and e-Infrastructures that are NOT participants of EGI using at least one service from either Core, Collaboration or a Community Platform (via MoU or OLA) | 9 | 11 | 13 |
| KPI.3.SA1.Software | Number of new registered software items and VM appliances | Per period | Numbers of new registered software and VM Appliances in AppDB | 50/50 | 60/60 | 70/70 |
| KPI.4.SA1.Cloud | Number of providers offering compute and storage capacity accessible through open standard interfaces | Cumulative | Number of Cloud resource centres registering in GOCDB interfaces exposing standard API: OCCI, CDMI... | 25 | 30 | 35 |
| KPI.5.SA2.Users | Number of researchers served by EGI | Cumulative | Number of users registered in VOs | 40 000 | 45 000 | 47 000 |
| KPI.6.JRA1.AAI | Number of users adopting federated IdP | Cumulative | Number of users accessing EGI services through the IdP Proxy/broker | TBD | TBD | TBD |
| KPI.7.SA2.Users | Number of research communities served | Per period | Number of international and national VOs | 20 | 20 | 20 |
| KPI.8.SA1.Users | Number of VO SLAs established | Cumulative | Number of VO SLAs established regarding HTC, Cloud and Operations tools | 4 | 8 | 10 |
| KPI.9.NA2.Communication | Number of scientific publications supported by EGI | Cumulative | The Communication Team requests NGIs to provide a list of publications; the publications are then aggregated in a master list and categorised by NGI | NA | NA | NA |
| KPI.10.NA2.Communication | Number of relevant authorities informed of the policy paper on procurement | Cumulative | Number of authorities that confirmed reception of the document | 5 | 20 | 25 |
| KPI.11.SA1.Users | User satisfaction | Average | Satisfaction of Long tail of science and VO managers with whom EGI has SLA (1 to 5 scale) | 4 | 5 | 5 |
| KPI.12.NA2.Industry | Number of services, demonstrators and project ideas running on EGI for SMEs and industry | Cumulative | RT (dedicated queue for business engagement) | 2 | 7 | 10 |
| KPI.13.SA2.Support | Number of delivered knowledge transfer events | Cumulative | Internal registry | 15 | 30 | 45 |
| KPI.14.SA1.Size | Number of compute available to international research communities and long tail of science | Per period | Accounting portal | TBD | TBD | TBD |
| KPI.15.SA1.Size | Number of storage available to international research communities and long tail of science | Per period | Accounting portal | TBD | TBD | TBD |
| KPI.16.SA2.Support | Number of international support cases (for/with RIs, projects, industry) | Cumulative | Number of tickets in technical-support-cases RT queue | 30 | 60 | 90 |
| KPI.17.SA1.Size | Number of compute resources available to the long tail of science | Cumulative | Amount of resources (Cores) supporting the long-tail VO | 300 | 500 | 500 |



Activity Metrics

| Metrics | Description | Type | Task |
|---|---|---|---|
| M.NA1.Quality.1 | Percentage of deliverables and milestones delivered on | Per period | 1.3 |
| M.NA2.Communication.1 | Percentage of articles, news, blog posts about or contributed by user communities and NGIs/EIROs with respect to the total of items published in EGI’s channels | Per period | 2.1 |
| M.NA2.Communication.2 | Number of unique visitors to the website | Per period | 2.1 |
| M.NA2.Communication.3 | Number of pageviews on the website | Per period | 2.1 |
| M.NA2.Communication.4 | Number of news items published | Per period | 2.1 |
| M.NA2.Communication.5 | Number of events with participation of EGI Champions | Per period | 2.1 |
| M.NA2.Communication.6 | Number of case studies published | Per period | 2.1 |
| M.NA2.Communication.7 | Attendee-days per event | Per period | 2.1 |
| M.NA2.Strategy.1 | Number of EGI impact assessment reports circulated to the stakeholders | Cumulative | 2.2 |
| M.NA2.Strategy.2 | Number of MoUs involving EGI.eu or EGI-Engage as a project | Cumulative | 2.2 |
| M.NA2.Strategy.3 | Number of SLAs established with paying customers | Cumulative | 2.2 |
| M.NA2.Industry.1 | Number of engaged SMEs/Industry contacts | Cumulative | 2.3 |
| M.NA2.Industry.2 | Number of established collaborations with SMEs/Industry (with MoU) | Per period | 2.3 |
| M.NA2.Industry.3 | Number of requirements gathered from market analysis activities | Per period | 2.3 |
| M.JRA1.AAI.1 | Number of communities whose IdP framework integrates with EGI AAI | Cumulative | 3.1 |
| M.JRA1.Marketplace.1 | Number of entries in the EGI Marketplace (i.e. services, applications etc.) | Cumulative | 3.2 |
| M.JRA1.Accounting.1 | Number of kinds of data repository systems integrated with the EGI accounting software | Cumulative | 3.3 |
| M.JRA1.Accounting.2 | Number of kinds of storage systems integrated with the EGI accounting software | Cumulative | 3.3 |
| M.JRA1.OpsTools.1 | Number of new requirements introduced in the roadmap | Cumulative | 3.4 |
| M.JRA1.OpsTools.2 | Number of probes developed to monitor cloud resources | Per period | 3.4 |
| M.JRA1.eGrant.1 | Number of user requests handled in e-GRANT | Per period | 3.5 |
| M.JRA2.Cloud.1 | Number of VM instances managed through AppDB GUI | Average | 4.2 |
| M.JRA2.Cloud.2 | Percentage of cloud providers providing snapshot support | Per period | 4.2 |
| M.JRA2.Cloud.3 | Percentage of cloud providers providing VM resizing support | Per period | 4.2 |
| M.JRA2.Cloud.4 | Number of OCCI implementations supporting OCCI 1.2 | Per period | 4.2 |
| M.JRA2.Cloud.5 | Number of new OCCI implementations for existing or new CMFs | Per period | 4.2 |
| M.JRA2.Integration.1 | Number of European cloud providers in the federated Astronomy community cloud | Cumulative | 4.3 |
| M.JRA2.Integration.2 | Number of virtual appliances shared | Cumulative | 4.3 |
| M.JRA2.Integration.3 | Number of different datasets replicated across CADC and EGI | Cumulative | 4.3 |
| M.JRA2.Integration.4 | Number of EUDAT services integrated with the HTC and Cloud platforms of EGI | Cumulative | 4.3 |
| M.JRA2.Integration.5 | Number of open research datasets replicated in the federated cloud for scalable access by iMARINE VREs | Cumulative | 4.3 |
| M.JRA2.Integration.6 | Number of research clouds that interoperate with EGI federated cloud: community clouds, integrated, peer | Cumulative | 4.3 |
| M.JRA2.AcceleratedComputing.1 | Number of batch systems for which GPGPU integration is possible to be supported through CREAM | Cumulative | 4.4 |
| M.JRA2.AcceleratedComputing.2 | Number of Cloud Middleware Frameworks for which GPGPU integration is supported and implemented | Cumulative | 4.4 |
| M.JRA2.AcceleratedComputing.3 | Number of level 3 disciplines with user applications that can use federated accelerated computing | Cumulative | 4.4 |
| M.SA1.Operations.1 | Amount of federated HTC compute capacity (EGI participants and integrated) | Cumulative | 5.1 |
| M.SA1.Operations.2 | Amount of federated HTC storage capacity (EGI participants and integrated): (Disk, Tape) | Cumulative | 5.1 |
| M.SA1.Operations.3 | Amount of resources (storage) allocated through an EGI centrally managed pool of resources | Cumulative | 5.1 |
| M.SA1.Operations.4 | Amount of resources (logical cores) allocated through an EGI centrally managed pool of resources | Cumulative | 5.1 |
| M.SA1.Operations.5 | Number of new products distributed with UMD | Per period | 5.1 |
| M.SA1.SecurityOperations.1 | Number of security policies and procedures updated, reviewed and adapted to support new services | Per period | 5.2 |
| M.SA1.Platforms.1 | Number of gCUBE VREs instantiated on the Federated Cloud for the iMARINE community | Cumulative | 5.3 |
| M.SA1.Platforms.2 | Amount of CPU time consumed by e-CEO challenges (hours * cores) | Per period | 5.3 |
| M.SA2.UserSupport.1 | Number of training modules produced and kept up-to-date | Cumulative | 6.2 |
| M.SA2.UserSupport.2 | HTC Absolute normalized time to a reference value of HEPSPEC06 (excluding OPS and dteam) per 1 level disciplines | Cumulative | 6.2 |
| M.SA2.UserSupport.3 | HTC Relative increase normalized time to a reference value of HEPSPEC06 (excluding OPS and dteam) per 1 level disciplines | Per period | 6.2 |
| M.SA2.UserSupport.4 | Relative increase of users per 1 level disciplines | Per period | 6.2 |
| M.SA2.UserSupport.5 | HTC Number of Low/Medium/High Activity VOs and total | Per period | 6.2 |
| M.SA2.UserSupport.6 | Number of VMs instantiated in Federated Cloud per 1 level discipline | Per period | 6.2 |



Yearly plan

EGI-Engage:WP4-PY1-Roadmap

Internal Documents