Federated Cloud user support
Users of the EGI Federated Cloud are scientists working in many fields who can benefit from a flexible environment for running their experiments. The EGI cloud is also suitable for projects aiming to provide service platforms to the scientific community.
How to get access to the FedCloud?
Access to EGI FedCloud resources can be requested by email through the EGI.eu User Community Support Team. The team will work with you to develop a technical plan for your envisaged use case, identifying the sites and the number and type of resources that your use case requires from EGI. The team will then arrange access to these resources for you and, together with the site operators, provide further assistance while you use them.
To join the EGI FedCloud, please send an email to ucst@EGI.eu with the following information:
- Email address
- One-paragraph description of the use case
- Envisaged timeline (Is there a deadline for finishing the setup? How long do you expect to need the setup to exist?)
- Estimated number and size of machines that you need from EGI
- Link to webpage, document or other online resource for further information
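The request can also be drafted programmatically. The following is a minimal sketch, assuming Python's standard `email` library; the function name, subject line and all example values are illustrative assumptions, and only the recipient address and the list of fields come from the text above. The message is only composed here, not sent.

```python
from email.message import EmailMessage


def build_access_request(sender, use_case, timeline, machines, info_link):
    """Compose (but do not send) a FedCloud access request email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "ucst@EGI.eu"  # address given in the text above
    msg["Subject"] = "EGI FedCloud access request"  # assumed subject line
    # One labelled line per item requested by the support team.
    msg.set_content(
        "\n".join(
            [
                f"Email address: {sender}",
                f"Use case: {use_case}",
                f"Envisaged timeline: {timeline}",
                f"Estimated number and size of machines: {machines}",
                f"Further information: {info_link}",
            ]
        )
    )
    return msg


# Hypothetical example values.
request = build_access_request(
    sender="researcher@example.org",
    use_case="Run molecular dynamics simulations for a training course.",
    timeline="Setup needed within two months, to be kept for six months.",
    machines="4 VMs, each with 2 cores and 4 GB RAM",
    info_link="https://example.org/project",
)
print(request)
```

The resulting message can be pasted into a mail client or handed to any SMTP library of your choice.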
Current FedCloud Users and Communities
This table provides a summary of the communities that are already collaborating with the EGI Federated Cloud offering use cases, requirements and valuable feedback.
|WeNMR||The objective of WeNMR is to optimize and extend the use of the NMR and SAXS research infrastructures through the implementation of an e-infrastructure in order to provide the user community with a platform integrating and streamlining the computational approaches necessary for NMR and SAXS data analysis and structural modelling. Access to the e-NMR infrastructure is provided through a portal integrating commonly used software and GRID technology.|
|Peachnote.com||Peachnote is a music score search engine and analysis platform. The system is the first of its kind and can be thought of as an analogue of Google Books Ngram Viewer and Google Books search for music scores. Hundreds of thousands of music scores are being digitized by libraries all over the world. In contrast to books, they generally remain inaccessible for content-based retrieval and algorithmic analysis. There is no analogue to Google Books for music scores, and there exist no large corpora of symbolic music data that would empower musicology in the way large text corpora are empowering computational linguistics, sociology, history, and other humanities that have the printed word as their major source of evidence about their research subjects. We want to help change that. Peachnote provides visitors and researchers access to a massive amount of symbolic music data.|
|WS-PGRADE||WS-PGRADE is a portal environment for the development, execution and monitoring of workflows and workflow-based parameter studies on different Distributed Computing Infrastructures (DCI). The tool is developed by an open consortium led by MTA SZTAKI, member of the Hungarian NGI. WS-PGRADE is used by various scientific collaborations and support teams in Europe and beyond. WS-PGRADE is interfaced with Service Grids (gLite, UNICORE, ARC, Globus), Desktop Grids (BOINC) and cluster (PBS, LSF) middleware through the elements of the gUSE service stack. WS-PGRADE and gUSE are actively used and further developed in the SHIWA and SCI-BUS FP7 projects.|
|GAIA-Space||Gaia is a global space astrometry mission. Its goal is to make the largest, most precise three-dimensional map of our Galaxy by surveying an unprecedented number of stars - more than a thousand million.|
|BNCWeb||BNCWeb is an interface to the British National Corpus, a dataset of 100 million words, carefully sampled from a wide range of texts and conversations to provide a snapshot of British English in the late 20th century. This is a key reference work in English studies, linguistics and language teaching and is used in a wide variety of computational linguistic applications. BNCWeb offers powerful search and analysis functions for searching the text and exploiting the detailed textual metadata. It is an open source project, and the BNC is freely available for educational and research purposes.|
|DIRAC interware for eScience communities||The DIRAC interware project provides a framework for building ready-to-use distributed computing systems. It has proven to be a useful tool for large international scientific collaborations, integrating their computing activities and distributed computing resources (Grids, Clouds and HTC clusters) in a single system. In the case of Cloud resources, DIRAC is currently integrated with Amazon EC2, OpenNebula, OpenStack and CloudStack. Some Monte Carlo (MC) simulation campaigns were carried out for the large-scale Belle II project, providing over 10,000 CPU days from Amazon. Until this use case in Fedcloud-tf, all cases had made use of a single cloud at a time. The work integrates the resources provided by the multiple private clouds of the EGI Federated Cloud and additional WLCG resources, providing high-level scientific services on top of them by using the DIRAC framework. A new design, a federated hybrid cloud architecture (Rafhyc), has been adopted. Initial integration and scaling tests demonstrate that the architecture is valid for managing federated hybrid cloud IaaS to provide eScience SaaS. The solution has been adopted by LHCb DIRAC for LHCb computing on federated clouds, using end-points just like any other computing resource.|
|Catania Science Gateway Framework||The Catania Science Gateway Framework (CSGF) has been developed by INFN, Division of Catania (Italy), to provide application developers with a tool to create Science Gateways quickly and easily. CSGF is made of a set of libraries to manage Authentication & Authorization mechanisms and to interact with several different kinds of DCIs (grid, cloud, HPC, local, etc.). The CSGF team would like to use the EGI Federated Cloud to develop a new CSGF plugin implementing the SaaS service model by exploiting OCCI. The use case is an interoperability test, implemented as a new Liferay portlet in CSGF, to make the portal capable of submitting applications to the EGI Federated Cloud, grids and HPC resources in a user-transparent way.|
|ENVRI||ENVRI targets the development of common capabilities, including software and services, of the environmental and e-infrastructure communities. While the ENVRI infrastructures are very diverse, they face common challenges including data capture from distributed sensors, metadata standardization, management of high volume data, workflow execution and data visualization. The common standards, deployable services and tools developed will be adopted by each infrastructure as it progresses through its construction phase.|
This table provides a summary of the use cases that are already benefiting from the EGI Federated Cloud, and whose recommendations and feedback help EGI finalise the services of the Federated Cloud platform.
|Project and/or application||Description of the use case||Further information||Contacts|
|WeNMR||Use case 1: using VMs prepared with Gromacs and some other software to run MD simulations for educational purposes, possibly on multi-core VMs.
Use case 2: validating and improving biomolecular NMR structures using VirtualCing, a VM equipped with a complex suite of ~25 programs. A presentation of the current deployment at the Dutch National HPC Cloud is available here, and recently a paper has been published here. The cloud usage framework is based on a pilot job mechanism making use of the ToPoS tool. Therefore, such a framework would naturally allow for execution of VirtualCing tasks across multiple cloud providers. Note that the framework is independent of the cloud access interface: it would also work with simple grid jobs, as long as the user-defined (or VO manager defined) VirtualCing VM is available at the grid site, e.g. in a SE (or in the VO software area mounted by the WNs), and the grid job is allowed to start the VM. Technical details about its current implementation are available here. A live demonstration of the deployment and use of VirtualCing on the WNoDeS testbed of the INFN-CNAF computing centre was shown at the EGI TF 2012, held in September.
|FedCloudWeNMR||Marco Verlato (firstname.lastname@example.org), Alexandre Bonvin (email@example.com)|
|Peachnote||Use case 1: the ability to upload and start a prepared VMware VM. The VM only has to be able to make outbound connections: to Amazon's SQS for job info, to the HBase cluster to retrieve and store data, and to the Peachnote server to regularly update the workflow code. No inbound connections are needed, which hopefully means fewer administrative and security concerns.
Use case 2: the ability to run a small Hadoop and HBase cluster in the cloud. Being able to spin up an HBase cluster using Apache Whirr on the EGI cloud infrastructure would be perfect, but a custom deployment would be a great first step.
|FedCloudPeachnote||Gergely Sipos (firstname.lastname@example.org), Vladimir Viro (email@example.com)|
|WS-PGRADE||The problem: the testing and debugging of new releases of WS-PGRADE and gUSE is a significant challenge for the developer team. The DCIs underpinning WS-PGRADE/gUSE are constantly evolving and changing, and planned and unplanned downtimes within them are frequent. These disturbances make WS-PGRADE/gUSE test scenarios hard, often impossible, to repeat, causing significant overhead in planning, developing and executing test cases.
The envisaged solution: the WS-PGRADE developer team is seeking a solution that provides stable DCIs for the testing team in order to run pre-defined test scenarios in a repeatable way. The DCIs participating in these tests are envisaged as small-scale environments built from virtual images predefined and validated for WS-PGRADE tests. The environments should replicate the functional capabilities of the large-scale DCIs that are used by WS-PGRADE for production runs, but without the disturbances in availability and configuration. The WS-PGRADE team is interested in collaborating with the “Federated Clouds Task Force” of the European Grid Infrastructure in order to understand how the resources and skills from the task force and from EGI could be used to create and operate virtualised DCI infrastructures for WS-PGRADE tests.
Use case 1 (under preparation):
Biologists and chemists simulating molecular docking with the autodock software tool are potential users of this use case. This use case gives the ability to run a small BOINC-based desktop grid infrastructure as a DCI and to submit a pre-defined application (called autodock) to this DCI through the WS-PGRADE/gUSE portal as a (predefined) workflow.
Use case 2 (under preparation):
Any scientists requiring an on-demand, scalable computing infrastructure are potential users of this use case. This use case gives the ability to run a small BOINC-based desktop grid infrastructure providing virtualisation support (GBAC) on the computational resource (BOINC client). It therefore allows applications to be executed in batch mode without being ported to different operating system platforms, because a minimal Linux OS is used as the virtualised environment. Scalability can be improved by attaching external (non-cloud) resources to the desktop grid server.
Use case 3 (under design):
Any scientists requiring an on-demand, scalable computing infrastructure are potential users of this use case. This use case gives the ability to run a small BOINC-based desktop grid infrastructure providing virtualisation support on the computational resource (BOINC client). The job submission interface in this scenario is the WS-PGRADE/gUSE system, where compound applications (i.e. workflows) can be easily built and executed on the BOINC-based desktop grid DCI. The submitted jobs of the workflow are executed on a minimal Linux OS used as the virtualised environment. Scalability can be improved by attaching external (non-cloud) resources to the desktop grid server.
|FedCloudWSPGRADE||Gergely Sipos (firstname.lastname@example.org), Sandor Acs (email@example.com)|
|BNCWeb||Use case 1: Researchers in linguistics and other disciplines, teachers, language learners, writers and computational linguists all around the world are potential users of BNCWeb, which is a basic reference resource for the English language.
Use case 2: BNCWeb will be used as the main resource for teaching a masters course in 'Exploring English Usage' in October-November 2012, and 'Corpus Linguistics' in February-March 2013. Users will submit queries in interactive sessions with BNCWeb online. There will be usage peaks during the sessions.
Use case 3: Federated search in the CLARIN European e-Infrastructure: a secure and highly available BNCWeb can be used to contribute English-language resources to the ongoing project to build a Europe-wide demonstrator for federated search across archives and across access federation boundaries.
Use case 4: Developers can build additional web services on top of BNCWeb, e.g. adding improved visualizations of the search results.
|DIRAC||Use Case 1: Running LHCb Monte Carlo simulation jobs in a federated IaaS manner, for integration and scaling tests. (Finished)
Use Case 2: VMDIRAC as a portal for the VM scheduler, with a third-party job broker.
In September 2013 a collaboration with the EGI FedCloud WeNMR project started, aiming to use the VMDIRAC portal as a VM scheduler.
|FedCloudDIRAC||Gergely Sipos (firstname.lastname@example.org), |
Victor Mendez (email@example.com)
|OpenModeller||OpenModeller (oM) is a generic framework that includes various modelling algorithms and is compatible with different data formats. Its functionality is available through command line, desktop and web service interfaces, allowing integration with other specialized services that could help, for example, in finding input data for the modelling process. The CRIA institute in Brazil operates an oM server. The server implements the oM web service API and runs as a CGI application. This use case aims to set up and operate an instance of the web service in Europe using resources from the EGI Federated Cloud and from the EGI Service Availability Monitoring infrastructure. The instance would serve the Biodiversity Virtual e-Laboratory (BioVeL), which supports research on biodiversity issues using large amounts of data from cross-disciplinary sources. EUBrazilOpenBio is a project aiming to deploy an open-access platform from the federation and integration of existing European and Brazilian infrastructures and resources. BioVeL and the FedCloud are exploring ways to re-use the oM service already implemented by EUBrazilOpenBio.||FedCloudOPENMODELLER||EGI: Nuno Ferreira (firstname.lastname@example.org), 
EUBrazilOpenBio: Daniele Lezzi (email@example.com),
|Catania Science Gateway Framework||Catania Science Gateway Framework (CSGF) has been developed by INFN, Division of Catania (Italy), to provide application developers with a tool to create Science Gateways quickly and easily. CSGF is made of a set of libraries to manage Authentication & Authorization mechanisms and to interact with several different kinds of DCIs (grid, cloud, HPC, local, etc.). The CSGF team would like to use the EGI Federated Cloud to develop a new CSGF plugin implementing the SaaS service model by exploiting OCCI. The use case is an interoperability test, implemented as a new Liferay portlet in CSGF, to make the portal capable of submitting applications to the EGI Federated Cloud, grids and HPC resources in a user-transparent way. The portlet would include a set of VMs, each pre-configured with some test applications, and would provide an application-specific SaaS environment built on grids and IaaS clouds. Users will see cloud sites as resources that are available to execute applications, without worrying about technical matters. The CSGF will select and start a VM to execute an application on behalf of the user, according to the application characteristics. VM management will be handled entirely by CSGF and will be hidden from end users.||FedCloudCSGF||EGI: Nuno Ferreira (firstname.lastname@example.org),
Gergely Sipos (email@example.com)
|European Space Agency PoC||In the context of the Helix Nebula initiative, the European Space Agency organized a Proof of Concept using FedCloud resources. The objective is to prove the interoperability between commercial (Helix Nebula) and academic (EGI Federated Cloud) cloud providers and to prove the possibility of providing processing services to scientists using the Federated Cloud IaaS system. ESA's target is volcano and earthquake monitoring in the context of the SuperSites Exploitation Platform project.
The PoC will deploy a computing cluster and test its performance by running a set of processing jobs on it. The cluster will use the Globus Grid middleware and will be connected to the ESA Grid-Processing On Demand system for job submission.
|FedCloudESAPoC||EGI: Salvatore Pinto (firstname.lastname@example.org)|
ESA: G-POD Team (email@example.com)
|ENVRI||In the context of the ENVRI project, the EGI Federated Cloud will host data access and dissemination services on the Federated Cloud Storage as a Service and provide computing resources to ENVRI processing services via the EGI Federated Cloud IaaS service. The objective is to offer the ENVRI partners a reliable, flexible and easy-to-use system to perform data discovery and dissemination and to support computing services.||EGI_ENVRI||EGI: Malgorzata Krakowian (firstname.lastname@example.org), Salvatore Pinto (email@example.com)|
ESA: Roberto Cossu
How to use the FedCloud?
How to use the FedCloud resources is briefly described below. More information can be found on the FedCloud FAQ page and in the Guides and Tutorials listed on this page. Technical support is always available via the EGI.eu UCST Team.
After obtaining access to one or more sites of the EGI Federated Cloud, the prospective user can set up and operate custom services, applications and simulations within the virtualized hosting environments of these sites. To do so, the user needs to:
- Create a virtual machine image that encapsulates an operating system, the scientific software and any optional component that this software needs to function in a remote and possibly distributed environment (for example, a middleware for distributed computing or a framework for remote monitoring).
- Tip: Images from the EGI Virtual Image Marketplace may be reusable or customisable for your use case.
- Tip: The format of the virtual machine image that the site you have access to can accept may be specific to the site. Check the required image format in the sites overview table before preparing your image!
- Tip: The EGI Applications Database is a registry of over 350 scientific applications, frameworks and tools that may be relevant for your use case and you may wish to incorporate in the virtual machine image. You can get in touch and ask for support from the providers of the software items you like through the Applications Database.
- Instantiate the virtual machine image(s) on the EGI cloud. This is possible through the ‘Open Cloud Computing Interface’ (OCCI) that every EGI cloud site implements. Using the OCCI interface guarantees that your work will be compatible with any site that currently exists or will join the EGI Federated Cloud in the future.
- Tip: These customisable scripts provide a command line shell on top of OCCI API for the management of virtual machine images in the EGI cloud.
- Operate services based on virtual images in the cloud. The EGI Monitoring system provides a backbone to collect availability information about services, to open and submit trouble tickets to service providers about failed probes, and to generate reliability and availability statistics for service operation.
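The instantiation step above can be sketched as follows. This is a minimal, hedged example of what a "create compute resource" call looks like in the OCCI HTTP header rendering (`text/occi`); it only builds the request and does not send it. The endpoint URL, the image template term and scheme, and the function name are placeholders chosen for illustration, not values from any EGI site, and X.509 authentication is omitted.

```python
# Scheme of the standard OCCI infrastructure kinds (compute, storage, network).
OCCI_INFRA_SCHEME = "http://schemas.ogf.org/occi/infrastructure#"


def build_compute_request(endpoint, title, cores, memory_gib, os_tpl_term, os_tpl_scheme):
    """Return (method, url, headers) describing an OCCI 'create compute' call."""
    categories = [
        # The kind identifies the type of resource being created.
        f'compute; scheme="{OCCI_INFRA_SCHEME}"; class="kind"',
        # The mixin selects the VM image (OS template); its scheme is site-specific.
        f'{os_tpl_term}; scheme="{os_tpl_scheme}"; class="mixin"',
    ]
    attributes = [
        f'occi.core.title="{title}"',
        f"occi.compute.cores={cores}",
        f"occi.compute.memory={memory_gib}",  # memory in GiB
    ]
    headers = {
        "Content-Type": "text/occi",
        "Accept": "text/occi",
        "Category": ", ".join(categories),
        "X-OCCI-Attribute": ", ".join(attributes),
    }
    return "POST", endpoint.rstrip("/") + "/compute/", headers


# Hypothetical site endpoint and image template.
method, url, headers = build_compute_request(
    endpoint="https://cloud.example.org:11443",
    title="my-analysis-vm",
    cores=2,
    memory_gib=4.0,
    os_tpl_term="my_image",
    os_tpl_scheme="https://cloud.example.org/occi/os_tpl#",
)
print(method, url)
for name, value in headers.items():
    print(f"{name}: {value}")
```

In practice a client such as rOCCI takes care of these details; the sketch is only meant to show which pieces of information the interface expects.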
Guides and tutorials
This list provides pointers to manuals and tutorials that may be useful for you to create, optimise, start up and operate Virtual Machine Images on the EGI Federated Cloud:
- FedCloud FAQ page
- Set up a Command Line Interface environment
- Creating custom images (OpenStack manual): http://docs.openstack.org/trunk/openstack-compute/admin/content/creating-custom-images.html
- rOCCI (OCCI client/server, used to manage computing resources): http://github.com/gwdg/rOCCI/
Cloud providers of EGI use hardware virtualization technologies to host scientific software on their resources. The cloud management platforms that make this possible vary from site to site, but they all enable resource centers to manage virtualized computing, storage and networking resources and empower scientific groups to set up and operate their domain-specific services, applications and simulations within the virtualized environments.
To deploy a custom application in the cloud, one first needs to create one or more virtual machine (VM) images encapsulating both an operating system and the specific scientific software that implements the domain-specific calculation. The image then needs to be instantiated on machines provided by the cloud. Every site of the EGI Federated Cloud exposes the same programming interfaces for virtual machine setup and data manipulation operations; therefore applications that are built for one site of the EGI Federated Cloud can run at any of the EGI cloud sites. The use of the cloud services is further simplified for scientific groups by a set of reusable, extendable virtual machine images from the EGI VM marketplace, and by technical assistance and support provided by the User Community Support Team of EGI.eu.
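Because every site exposes the same interface, the same client logic can be pointed at any endpoint. A minimal sketch, assuming the OCCI HTTP rendering, in which each server publishes its supported kinds and mixins at the query interface path `/-/`: the site URLs below are hypothetical, and the requests are only described, not sent.

```python
def discovery_request(endpoint):
    """Describe a GET on the OCCI query interface of one site."""
    return {
        "method": "GET",
        "url": endpoint.rstrip("/") + "/-/",  # OCCI query interface path
        "headers": {"Accept": "text/plain"},
    }


# Hypothetical endpoints of two federated sites.
sites = [
    "https://site-a.example.org:11443/",
    "https://site-b.example.org:8787",
]

# The same function works unchanged for every site.
requests_to_send = [discovery_request(site) for site in sites]
for req in requests_to_send:
    print(req["method"], req["url"])
```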
EGI Federated Cloud Sites
The EGI FedCloud sites are the EGI resource providers.
Sites in the EGI Federated Cloud are still operating in test bed mode; however, some of the sites are already available for international research collaborations to use for application demonstrations and pilots. The full list of sites providing resources to the EGI FedCloud testbed is available here.
Interfaces and protocols
The EGI Federated Cloud is designed to satisfy scenarios defined by the EGI community in consultation with potential users of pan-European cloud services. The initial set of scenarios that the community collected has been distilled down to capabilities that the EGI Federated Cloud must provide to enable the use cases. These capabilities were compared to state-of-the-art cloud computing technologies, standards, protocols and APIs to identify a technology stack which can help the National Grid Infrastructures and research communities connect resources into a federated cloud. The work is not yet finished, but a few technologies are already in the stack and operated on sites of the EGI Federated Cloud test bed.
|Name of the technology||Description||What it’s used for in EGI?||Technology homepage|
|OCCI: Open Cloud Computing Interface||The Open Cloud Computing Interface comprises a set of open community-led specifications delivered through the Open Grid Forum. OCCI is a protocol and API for all kinds of management tasks. OCCI was originally initiated to create a remote management API for Infrastructure as a Service model based services, allowing for the development of interoperable tools for common tasks including deployment, autonomic scaling and monitoring. It has since evolved into a flexible API with a strong focus on integration, portability, interoperability and innovation while still offering a high degree of extensibility.||Virtual Machine management||http://occi-wg.org/|
|CDMI: Cloud Data Management Interface||The Cloud Data Management Interface defines the functional interface that applications will use to create, retrieve, update and delete data elements from the cloud. As part of this interface the client will be able to discover the capabilities of the cloud storage offering and use this interface to manage containers and the data that is placed in them.||Data and meta-data management||http://www.snia.org/cdmi|
|GLUE Schema||The GLUE Schema is a common way of publishing information about grid or cloud resources. GLUE is developed by a consortium of grid projects, including the two largest projects of the EGI collaboration: EGI-InSPIRE and EMI. GLUE describes attributes of sites and services, computing elements and storage elements. Implementations of the Schema exist for a range of systems; the EGI Federated Cloud uses the LDAP-based BDII implementation.||Information system for cloud resources||http://www.ggf.org/gf/group_info/view.php?group=glue|
|X509||User authentication is a means of identifying the user and verifying that the user is allowed to access some restricted service, particularly the sites of the EGI Federated Cloud. Public-key cryptography is a cryptographic technique that enables users to securely communicate on an insecure public network, and reliably verify the identity of a user via digital signatures. The X.509 specification defines a standard for managing digital signatures on the Internet. X.509 specifies, amongst other things, standard formats for public key certificates, certificate revocation lists, attribute certificates, and a certification path validation algorithm.||User authentication||http://en.wikipedia.org/wiki/X.509|
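As an illustration of the CDMI row in the table above, the following minimal sketch builds (without sending) a request that stores a small text value as a CDMI data object. The endpoint, container and object names are hypothetical, the helper name is an assumption, and the specification version shown is only an example; check which version your site supports.

```python
import json

# Assumed CDMI specification version; sites may support a different one.
CDMI_VERSION = "1.0.2"


def build_cdmi_put(endpoint, container, name, text):
    """Return (url, headers, body) for creating a CDMI data object."""
    url = f"{endpoint.rstrip('/')}/{container.strip('/')}/{name}"
    headers = {
        "X-CDMI-Specification-Version": CDMI_VERSION,
        "Content-Type": "application/cdmi-object",
        "Accept": "application/cdmi-object",
    }
    # CDMI data objects are described as JSON with a MIME type and a value.
    body = json.dumps({"mimetype": "text/plain", "value": text})
    return url, headers, body


# Hypothetical storage endpoint and object.
url, headers, body = build_cdmi_put(
    endpoint="https://storage.example.org/cdmi/",
    container="results",
    name="run-001.txt",
    text="hello",
)
print("PUT", url)
print(headers)
print(body)
```

Retrieving, updating or deleting the same object would use GET, PUT or DELETE on the same URL.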
Technical support for users is provided by email via the EGI.eu UCST Team.
Technical problems and questions relating to the use of the EGI Federated Cloud can be reported and dealt with through the EGI Helpdesk ticketing system. Note: Please choose 'Federated cloud' in the 'Type of problem' field of the ticket submission form!
Feedback and open issues
A list of open issues and feedback reported by FedCloud users is available at this page.
High Level Services
The EGI FedCloud, through the FedCloud User Communities, provides a set of High Level Services to community users. A non-exhaustive list of these services is provided below: