EGI-Engage:WP4.3-D4Science






Objective

Integration for gCube and the D4Science infrastructure (M1 - M12)
This activity will integrate D4Science resources at Engineering (commercial) and CNR (public) into the EGI Federated Cloud infrastructure. The gCube framework will be extended to use EGI Federated Cloud resources by implementing OCCI client capabilities.

Involved Partners

CNR - ISTI

Engineering

Roadmap

Two use cases were selected to start the integration:

  • running the gCube WN in FedCloud. The gCube worker node (the gCube service is named Executor) is the service where D4Science computations are executed. The Executor can perform several types of job, depending on how the job is assigned to it. Two of these policies are worth noting: as a scheduler, it executes local code at regular intervals as specified in the job activation, similar to a cron command; as a worker node, it fetches jobs from an ActiveMQ queue. The latter case is the most demanding one: the job to be executed, the data to be processed, and the configuration to be applied are all downloaded from the gCube storage via the persistent identifiers collected in the job description.
  • running the DataMiner service in FedCloud. The gCube DataMiner is a service hosting a number of models. It adopts a plugin-based architecture in which each model is implemented as a service plugin, so a new model can be added simply by deploying a new plugin in the service. The service exposes a standard WPS interface. Currently supported models include, for example, Clustering, Principal Component Analysis, Artificial Neural Networks, Trend Analysis, Periodicity Detection, and Signal Processing.
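The worker-node policy of the Executor described in the first use case can be illustrated with a minimal sketch. It is not gCube code: the in-memory queue stands in for ActiveMQ, the dictionary stands in for the gCube storage, and the identifiers (`pid:job-code` and so on), the job description fields, and the function names are all hypothetical.

```python
import queue

# Hypothetical stand-in for the gCube storage: persistent identifier -> object.
STORAGE = {
    "pid:job-code": lambda data, config: [x * config["factor"] for x in data],
    "pid:input-data": [1, 2, 3],
    "pid:config": {"factor": 10},
}


def resolve(pid):
    """Fetch the object behind a persistent identifier (assumed lookup)."""
    return STORAGE[pid]


def run_next_job(job_queue):
    """Take one job description off the queue, resolve its parts, and run it."""
    description = job_queue.get_nowait()
    code = resolve(description["code_pid"])      # the job to be executed
    data = resolve(description["data_pid"])      # the data to be processed
    config = resolve(description["config_pid"])  # the configuration to apply
    return code(data, config)


# Hypothetical stand-in for the ActiveMQ queue the worker node polls.
jobs = queue.Queue()
jobs.put({
    "code_pid": "pid:job-code",
    "data_pid": "pid:input-data",
    "config_pid": "pid:config",
})

result = run_next_job(jobs)
print(result)
```

The point of the sketch is that the job description itself carries only identifiers; everything the worker needs is resolved at execution time, which is why this mode places the highest demand on the node.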

The two use cases cover clearly different usage patterns.

With the first one we intend to satisfy the needs of D4Science users who access and exploit D4Science resources daily. Once this use case is implemented, users will transparently exploit the resources provided through the integration with FedCloud. FedCloud resources will therefore virtually become D4Science resources, as already happens for many other kinds of resources. Accounting, authorization, and policy management will be handled in the same way as they currently are for the gCube worker nodes running on D4Science hardware.

With the second use case we intend to serve other potential communities. The DataMiner service can be exploited via its standard WPS interface by any authorized client. To get a valid authorization token, the user has to register with D4Science. From that point on, the user can exploit the service without going through D4Science again. This approach is very similar to the one adopted by Google and other major SaaS providers for their services. The DataMiner can easily be exploited in workflows via Taverna or similar technologies.
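As an illustration of how an authorized client might call the DataMiner through WPS, the sketch below builds a key-value-pair Execute request URL. Several things here are assumptions rather than documented D4Science behaviour: the WPS version (1.0.0), the endpoint, the model identifier, the input names, and in particular the `token` query parameter used to carry the authorization token.

```python
from urllib.parse import urlencode


def build_wps_execute_url(endpoint, identifier, inputs, token):
    """Build a WPS Execute request URL in key-value-pair encoding.

    `token` is passed as a query parameter named "token"; the real
    parameter name used by D4Science is not stated in this page.
    """
    # WPS KVP encoding joins inputs as name=value pairs separated by ';'.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",      # assumed WPS version
        "request": "Execute",
        "identifier": identifier,
        "datainputs": data_inputs,
        "token": token,          # hypothetical parameter name
    })
    return f"{endpoint}?{query}"


# Hypothetical endpoint, model identifier, inputs, and token.
url = build_wps_execute_url(
    "https://dataminer.example.org/wps",
    "example.clustering.model",
    {"k": 3, "seed": 42},
    "MY-TOKEN",
)
print(url)
```

Because the request is a plain URL, the same call can be issued from a workflow engine or any HTTP client once the token has been obtained.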

These worker nodes and DataMiner instances will be deployed through a human-triggered action from the D4Science UI. The plan is to use jOCCI to interact with the FedCloud resources.
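To give a rough idea of what such an interaction involves, the sketch below assembles the HTTP headers of an OCCI "create compute resource" request in the `text/occi` rendering. It does not use jOCCI (a Java library) and does not send anything; it only shows the shape of the request that an OCCI client produces. The function name and the attribute values are hypothetical, and a real FedCloud request would normally also carry site-specific OS and resource template mixins, which are omitted here.

```python
def occi_compute_request(title, cores, memory_gb):
    """Return the headers of an OCCI request that creates a compute resource."""
    attributes = [
        f'occi.core.title="{title}"',
        f"occi.compute.cores={cores}",
        f"occi.compute.memory={memory_gb}",
    ]
    return {
        "Content-Type": "text/occi",
        # The kind identifying an OCCI infrastructure compute resource.
        "Category": (
            'compute; scheme="http://schemas.ogf.org/occi/infrastructure#"; '
            'class="kind"'
        ),
        # In the text/occi rendering, attributes are comma-separated in one header.
        "X-OCCI-Attribute": ", ".join(attributes),
    }


# Hypothetical values for a gCube worker node VM.
headers = occi_compute_request("gcube-worker-node", cores=2, memory_gb=4.0)
for name, value in headers.items():
    print(f"{name}: {value}")
```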

The first use case is planned to be completed before the end of July. The second use case is planned to be completed before the end of October.

In a second stage of the project, we plan to enrich the models supported by the DataMiner service and to integrate the accounting provided by FedCloud with the one collected by D4Science.

Documentation