Difference between revisions of "Federated Cloud Communities"

From EGIWiki

Revision as of 15:25, 19 December 2013

This page provides detailed information about the user communities that are using or integrating the EGI Federated Cloud services, and about their use cases.

Current FedCloud Users and Communities

This table provides a summary of the communities that are already collaborating with the EGI Federated Cloud by offering use cases, requirements and valuable feedback.

WeNMR The objective of WeNMR is to optimize and extend the use of the NMR and SAXS research infrastructures through the implementation of an e-infrastructure in order to provide the user community with a platform integrating and streamlining the computational approaches necessary for NMR and SAXS data analysis and structural modelling. Access to the e-NMR infrastructure is provided through a portal integrating commonly used software and GRID technology.
Peachnote.com Peachnote is a music score search engine and analysis platform. The system is the first of its kind and can be thought of as an analogue of Google Books Ngram Viewer and Google Books search for music scores. Hundreds of thousands of music scores are being digitized by libraries all over the world. In contrast to books, they generally remain inaccessible for content-based retrieval and algorithmic analysis. There is no analogue to Google Books for music scores, and there exist no large corpora of symbolic music data that would empower musicology in the way large text corpora are empowering computational linguistics, sociology, history, and other humanities that have the printed word as their major source of evidence about their research subjects. We want to help change that. Peachnote provides visitors and researchers with access to a massive amount of symbolic music data.
WS-PGRADE WS-PGRADE is a portal environment for the development, execution and monitoring of workflows and workflow-based parameter studies on different Distributed Computing Infrastructures (DCI). The tool is developed by an open consortium led by MTA SZTAKI, a member of the Hungarian NGI. WS-PGRADE is used by various scientific collaborations and support teams in Europe and beyond. WS-PGRADE is interfaced with Service Grids (gLite, UNICORE, ARC, Globus), Desktop Grids (BOINC) and cluster (PBS, LSF) middleware through the elements of the gUSE service stack. WS-PGRADE and gUSE are actively used and further developed in the SHIWA and SCI-BUS FP7 projects.
GAIA-Space Gaia is a global space astrometry mission. Its goal is to make the largest, most precise three-dimensional map of our Galaxy by surveying an unprecedented number of stars - more than a thousand million.
BNCWeb BNCWeb is an interface to the British National Corpus, a dataset of 100 million words, carefully sampled from a wide range of texts and conversations to provide a snapshot of British English in the late 20th century. This is a key reference work in English studies, linguistics and language teaching and is used in a wide variety of computational linguistic applications. BNCWeb offers powerful search and analysis functions for searching the text and exploiting the detailed textual metadata. It is an open source project, and the BNC is freely available for educational and research purposes.
DIRAC interware for eScience communities The DIRAC interware project provides a framework for building ready-to-use distributed computing systems. It has proven to be a useful tool for large international scientific collaborations, integrating their computing activities and distributed computing resources (Grids, Clouds and HTC clusters) in a single system. In the case of Cloud resources, DIRAC is currently integrated with Amazon EC2, OpenNebula, OpenStack and CloudStack. Some Monte Carlo (MC) simulation campaigns were carried out for the large-scale Belle II project, providing over 10.000 thousand CPU days from Amazon. Until this use case in Fedcloud-tf, all cases made use of a single cloud at a time. The work integrates the resources provided by the multiple private clouds of the EGI Federated Cloud and additional WLCG resources, providing high-level scientific services on top of them by using the DIRAC framework. A new design has been adopted: a federated hybrid cloud architecture (Rafhyc). Initial integration and scaling tests demonstrate that the architecture can manage federated hybrid cloud IaaS to provide eScience SaaS. The solution has been adopted by LHCb DIRAC for the LHCb computing on federated clouds, using cloud end-points just like any other computing resource.
Catania Science Gateway Framework The Catania Science Gateway Framework (CSGF) has been developed by INFN, Division of Catania (Italy), to provide application developers with a tool to create Science Gateways quickly and easily. CSGF is made of a set of libraries to manage Authentication & Authorization mechanisms and to interact with several different kinds of DCIs (grid, cloud, HPC, local, etc.). The CSGF team would like to use the EGI Federated Cloud to develop a new CSGF plugin implementing the SaaS service model and exploiting OCCI. The use case is an interoperability test, implemented as a new Liferay portlet in CSGF, to make the portal capable of submitting applications to the EGI Federated Cloud, grids and HPC resources in a user-transparent way.
ENVRI ENVRI focuses on developing common capabilities, including software and services, of the environmental and e-infrastructure communities. While the ENVRI infrastructures are very diverse, they face common challenges including data capture from distributed sensors, metadata standardization, management of high volume data, workflow execution and data visualization. The common standards, deployable services and tools developed will be adopted by each infrastructure as it progresses through its construction phase.

This table provides a summary of the use cases that are already benefiting from the EGI Federated Cloud, and whose recommendations and feedback help EGI finalise the services of the Federated Cloud platform.

Project and/or application Description of the use case Further information Contacts
WeNMR Use case 1: using VMs prepared with Gromacs and some other software to run MD simulations for educational purposes, possibly on multi-core VMs.

Use case 2: validating and improving biomolecular NMR structures using VirtualCing, a VM equipped with a complex suite of ~25 programs. A presentation of the current deployment at the Dutch National HPC Cloud is available here, and recently a paper has been published here. The cloud usage framework is based on a pilot job mechanism making use of the ToPoS tool. Such a framework would therefore naturally allow VirtualCing tasks to be executed across multiple cloud providers. Note that the framework is independent of the cloud access interface: it would also work with simple grid jobs, as long as the user-defined (or VO manager defined) VirtualCing VM is available at the grid site, e.g. in an SE (or in the VO software area mounted by the WNs), and the grid job is allowed to start the VM. Technical details about its current implementation are available here. A live demonstration of the deployment and use of VirtualCing on the WNoDeS testbed of the INFN-CNAF computing centre was shown at the EGI TF 2012 held in September.
FedCloudWeNMR Marco Verlato (marco.verlato@pd.infn.it), Alexandre Bonvin (a.m.j.j.bonvin@uu.nl)
Peachnote Use case 1: the ability to upload and start a prepared VMware VM. The VM only has to be able to make outbound connections: to Amazon's SQS for job info, to an HBase cluster to retrieve and store data, and to the Peachnote server to regularly update the workflow code. No inbound connections are needed, which hopefully means fewer administrative and security concerns.

Use case 2: the ability to run a small Hadoop and HBase cluster in the cloud. Being able to spin up an HBase cluster using Apache Whirr on the EGI cloud infrastructure would be perfect, but a custom deployment would be a great first step.
FedCloudPeachnote Gergely Sipos (gergely.sipos@egi.eu), Vladimir Viro (vladimir@viro.name)
WS-PGRADE The problem: testing and debugging new releases of WS-PGRADE and gUSE is a significant challenge for the developer team. The DCIs underpinning WS-PGRADE/gUSE are constantly evolving, and planned and unplanned downtimes within them are frequent. These disturbances make WS-PGRADE/gUSE test scenarios hard, often impossible, to repeat, causing significant overhead in planning, developing and executing test cases.

The envisaged solution: the WS-PGRADE developer team is seeking a solution that provides stable DCIs for the testing team in order to run pre-defined test scenarios in a repeatable way. The DCIs participating in these tests are envisaged as small-scale environments built from virtual images predefined and validated for WS-PGRADE tests. The environments should replicate the functional capabilities of the large-scale DCIs that are used by WS-PGRADE for production runs, but without the disturbances in availability and configuration. The WS-PGRADE team is interested in collaborating with the “Federated Clouds Task Force” of the European Grid Infrastructure in order to understand how the resources and skills from the task force and from EGI could be used to create and operate virtualised DCI infrastructures for WS-PGRADE tests.

Use case 1 (under preparation):
Biologists and chemists simulating molecular docking with the autodock software tool are potential users of this use case. This use case gives the ability to run a small BOINC based desktop grid infrastructure as a DCI and to submit a pre-defined application (called autodock) to this DCI through the WS-PGRADE/gUSE portal as a (predefined) workflow.

Use case 2 (under preparation):
Any scientists requiring an on-demand, scalable computing infrastructure are potential users of this use case. This use case gives the ability to run a small BOINC based desktop grid infrastructure providing virtualisation support (GBAC) on the computational resource (BOINC client). It therefore allows applications to be executed in batch mode without porting them to different operating system platforms, because a minimal Linux OS is used as the virtualised environment. Scalability can be improved by attaching external (non-cloud) resources to the desktop grid server.

Use case 3 (under design):
Any scientists requiring an on-demand, scalable computing infrastructure are potential users of this use case. This use case gives the ability to run a small BOINC based desktop grid infrastructure providing virtualisation support on the computational resource (BOINC client). The job submission interface in this scenario is the WS-PGRADE/gUSE system, where compound applications (i.e. workflows) can be easily built and executed on the BOINC based desktop grid DCI. The jobs submitted by the workflow are executed on a minimal Linux OS used as the virtualised environment. Scalability can be improved by attaching external (non-cloud) resources to the desktop grid server.
FedCloudWSPGRADE Gergely Sipos (gergely.sipos@egi.eu), Peter Kotcauer (peter.kotcauer@sztaki.mta.hu)
BNCWeb Use case 1: Researchers in linguistics and other disciplines, teachers, language learners, writers and computational linguists all around the world are potential users of BNCWeb, which is a basic reference resource for the English language.

Use case 2: BNCWeb will be used as the main resource for teaching a master's course in 'Exploring English Usage' in October-November 2012, and 'Corpus Linguistics' in February-March 2013. Users will submit queries in interactive sessions with BNCWeb online. There will be usage peaks during the sessions.

Use case 3: Federated search in the CLARIN European e-Infrastructure: a secure and highly available BNCWeb can be used to contribute English-language resources to the ongoing project to build a Europe-wide demonstrator for federated search across archives and across access federation boundaries.

Use case 4: Developers can build additional web services on top of BNCWeb, e.g. adding improved visualizations of the search results.
FedCloudBNCweb
DIRAC Use Case 1: Running LHCb Monte Carlo simulation jobs using IaaS in a federated manner, for integration and scaling tests. (Finished)

Use Case 2: VMDIRAC as a portal for a VM scheduler, with a third-party job broker.

In September 2013, a collaboration with the EGI FedCloud WeNMR project started, aiming to use the VMDIRAC portal as a VM scheduler.

FedCloudDIRAC Gergely Sipos (gergely.sipos@egi.eu),
Victor Mendez (vmendez@pic.es)
OpenModeller OpenModeller (oM) is a generic framework that includes various modelling algorithms and is compatible with different data formats. Its functionality is available through command line, desktop and web service interfaces, allowing integration with other specialized services that could help, for example, in finding input data for the modelling process. The CRIA institute in Brazil operates an oM server. The server implements the oM web service API and runs as a CGI application. This use case aims to set up and operate an instance of the web service in Europe using resources from the EGI Federated Cloud and from the EGI Service Availability Monitoring infrastructure. The instance would serve the Biodiversity Virtual e-Laboratory (BioVeL), which supports research on biodiversity issues using large amounts of data from cross-disciplinary sources. EUBrazilOpenBio is a project aiming to deploy an open-access platform from the federation and integration of existing European and Brazilian infrastructures and resources. BioVeL and the FedCloud are exploring ways to re-use the oM service already implemented by EUBrazilOpenBio. FedCloudOPENMODELLER EGI: Nuno Ferreira (nuno.ferreira@egi.eu),

EUBrazilOpenBio: Daniele Lezzi (daniele.lezzi@bsc.es),
BioVeL: Renato De Giovanni (renato@cria.org.br)

Catania Science Gateway Framework The Catania Science Gateway Framework (CSGF) has been developed by INFN, Division of Catania (Italy), to provide application developers with a tool to create Science Gateways quickly and easily. CSGF is made of a set of libraries to manage Authentication & Authorization mechanisms and to interact with several different kinds of DCIs (grid, cloud, HPC, local, etc.). The CSGF team would like to use the EGI Federated Cloud to develop a new CSGF plugin implementing the SaaS service model and exploiting OCCI. The use case is an interoperability test, implemented as a new Liferay portlet in CSGF, to make the portal capable of submitting applications to the EGI Federated Cloud, grids and HPC resources in a user-transparent way. The portlet would include a set of VMs, each pre-configured with some test applications, and would provide an application-specific SaaS environment built on grids and IaaS clouds. Users will see cloud sites as resources that are available to execute applications, without worrying about technical matters. The CSGF will select and start a VM to execute an application on behalf of the user, according to the application characteristics. VM management will be handled entirely by CSGF and hidden from end users. FedCloudCSGF EGI: Nuno Ferreira (nuno.ferreira@egi.eu),

Gergely Sipos (gergely.sipos@egi.eu)
INFN: Diego Scardaci (diego.scardaci@ct.infn.it)

European Space Agency PoC In the context of the Helix Nebula initiative, the European Space Agency organized a Proof of Concept using FedCloud resources. The objective is to prove the interoperability between commercial (Helix Nebula) and academic (EGI Federated Cloud) cloud providers and to prove that processing services can be provided to scientists using the Federated Cloud IaaS system. ESA's target is volcano and earthquake monitoring in the context of the SuperSites Exploitation Platform project.

The PoC will deploy a computing cluster and test its performance by running a set of processing jobs on it. The cluster will use the Globus Grid middleware and will be connected to the ESA Grid-Processing On Demand system for job submission.

FedCloudESAPoC EGI: Salvatore Pinto (salvatore.pinto@egi.eu)

ESA: G-POD Team (eo-gpod@esa.int)

ENVRI In the context of the ENVRI project, the EGI Federated Cloud will host a data access and dissemination service on the Federated Cloud Storage as a Service and provide computing resources to ENVRI processing services via the EGI Federated Cloud IaaS service. The objective is to offer the ENVRI partners a reliable, flexible and easy-to-use system to perform data discovery and dissemination and to support computing services. EGI_ENVRI EGI: Malgorzata Krakowian (malgorzata.krakowian@egi.eu), Salvatore Pinto (salvatore.pinto@egi.eu)

ESA: Roberto Cossu
CNR: Leonardo Candela
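
Illustrative note on the pilot job mechanism mentioned in WeNMR use case 2: the idea that tasks sit in a shared pool and are pulled by pilots running at any provider can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions. It does not use the ToPoS API; the in-memory queue stands in for a token pool, and the names `Pilot`, `drain`, `fake_validate`, `site-a` and `site-b` are hypothetical and chosen purely for illustration.

```python
import queue


class Pilot:
    """A hypothetical pilot running inside a VM at one cloud provider."""

    def __init__(self, site, pool, run_task):
        self.site = site          # label of the provider hosting this pilot
        self.pool = pool          # shared pool of task tokens
        self.run_task = run_task  # callable that processes one token
        self.done = []            # (token, result) pairs handled by this pilot

    def step(self):
        """Pull and run one token; return False once the pool is empty."""
        try:
            token = self.pool.get_nowait()
        except queue.Empty:
            return False
        self.done.append((token, self.run_task(token)))
        return True


def drain(pilots):
    """Let the pilots take turns until none of them can find more work."""
    active = list(pilots)
    while active:
        active = [p for p in active if p.step()]


def fake_validate(entry_id):
    """Stand-in for a real validation task (assumption, not VirtualCing)."""
    return "validated:" + entry_id


# The pool is filled once; pilots at different sites pull from it.
pool = queue.Queue()
for token in ["entry-1", "entry-2", "entry-3"]:
    pool.put(token)

pilot_a = Pilot("site-a", pool, fake_validate)
pilot_b = Pilot("site-b", pool, fake_validate)
drain([pilot_a, pilot_b])

for pilot in (pilot_a, pilot_b):
    print(pilot.site, [token for token, _ in pilot.done])
```

Because each pilot only pulls work from the pool, adding a provider amounts to starting another pilot there. The pool does not need to know which access interface started the VM, which is consistent with the interface independence described in the use case.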