
Difference between revisions of "Federated Cloud Architecture"

From EGIWiki
Line 2: Line 2:
{{TOC_right}}
{{TOC_right}}


= Cloud Interfaces =
= EGI Cloud Federation =


To federate a cloud system there are several functions for which a common interface must be defined. These are each described below and overall provide the definition of the method by which a ‘user’ of the service would be able to interact.
The EGI Federated Cloud is a multi-national cloud system that integrates institutional clouds into a scalable computing platform for data and/or compute driven applications and services. The initial architecture of the EGI Federated Cloud was defined in 2011-2012 and was fully implemented by May 2014. Currently, the federation is a collaboration that enables various types of cloud federations to serve diverse demands of researchers from both academia and industry. The EGI Federated Cloud brings together scientific communities, R&D projects, technology and resource providers to form a community that integrates and maintains a flexible solution portfolio that enables various types of cloud federations with IaaS, PaaS and SaaS capabilities. The collaboration is committed to the use of open source tools and services that are reusable across scientific disciplines. These tools and services form a flexible portfolio from which a scientific community can mix and match items to establish its own, customised cloud federation.  


== VM Management ==


VM management allows users to run '''on demand''' any kind of workload on '''virtual machines'''. Community cloud federations may choose any interoperable interface within their community to provide VM Management. The Public Federated Cloud uses the OCCI open standard for this purpose.
The EGI Federated Cloud provides the services and technologies to create federations of clouds (community, private or public clouds) that operate according to the preferences, choices and constraints set by their members and users. The EGI Cloud Federations are modelled around the concept of an abstract Cloud Management stack subsystem that is integrated with components of the EGI Core Infrastructure and that provides a set of agreed uniform interfaces within the community it serves.


=== OCCI ===
[[Image:Federated_Cloud_Model.png|thumb|center|600px|Federated Cloud Model]]


The Open Cloud Computing Interface (OCCI) is a RESTful Protocol and API designed to facilitate interoperable access to, and query of, cloud-based resources across multiple resource providers and heterogeneous environments. The formal specification is maintained and actively worked on by OGF’s OCCI-WG, for details see http://occi-wg.org/.
The EGI Cloud Federation (see Figure) is a hybrid cloud composed of public, community and private clouds, all supported by the EGI Core Infrastructure Platform services. It consists of multiple “realms”; a cloud realm is a subset of cloud providers exposing homogeneous cloud management interfaces and capabilities. The Open Standards Cloud Realm supports the usage of open standards for its interfaces and is completely integrated with the EGI Core Infrastructure Platform. A Community Platform provides community-specific data, tools and applications, which can be supported by one or more realms.


OCCI’s specification consists of three basic elements, each covered in a separate specification document:
== Services in cloud federations ==
'''OCCI Core''' describes the formal definition of the OCCI Core Model <ref>R. Nyren, A. Edmonds, A. Papaspyrou, and T. Metsch, Open Cloud Computing Interface - Core, GFD-P-R.183, April 2011. [Online]. Available: http://ogf.org/documents/GFD.183.pdf </ref>. '''OCCI HTTP Rendering''' defines how to interact with the OCCI Core Model using the RESTful OCCI API <ref>T. Metsch and A. Edmonds, Open Cloud Computing Interface - HTTP Rendering, GFD-P-R.185, April 2011. [Online]. Available: http://ogf.org/documents/GFD.185.pdf</ref>. The document defines how the OCCI Core Model can be communicated and thus serialised using the HTTP protocol. '''OCCI Infrastructure''' contains the definition of the OCCI Infrastructure extension for the IaaS domain <ref>Open Cloud Computing Interface - Infrastructure, GFD-P-R.184, April 2011. [Online]. Available: http://ogf.org/documents/GFD.184.pdf</ref>. The document defines additional resource types, their attributes and the actions that can be taken on each resource type. A detailed description of these elements is outside the scope of this document; a simplified description follows.
Despite the large diversity in the type of cloud realms, a relatively small number of identical building blocks (or federator services) can be identified in almost all of them. These services turn individual clouds into a federation. The table collects these common services to help architects identify topics they should focus on when designing a cloud federation. Technical details for these are also available at [[Federated Cloud Technology]].


OCCI Core defines base types '''Resource''', '''Link''', '''Action''' and '''Mixin'''. Resource represents all OCCI objects that can be manipulated and used in any conceivable way. In general, it represents provider’s resources such as images (Storage Resource), networks (Network Resource), virtual machines (Compute Resource) or available services. Link represents a base association between two Resource instances; it indicates a generic connection between a source and a target. The most common real world examples are Network Interface and Storage Link connecting Storage and Network Resource to a Compute Resource. Action defines an operation that may be invoked, tied to a specific Resource instance or a collection of Resource instances. In general, Action is designed to perform complex high-level operations changing the state of the chosen Resource such as virtual machine reboot or migration. The concept of mixins is used to facilitate extensibility and provide a way to define provider-specific features.
{| class="wikitable" style="margin: auto;"
|-
! Federation Service
! Role within the federation
! Existing technical solution in EGI
|-
! Service Registry
| A registry where all the federated sites and services are registered and state their capabilities. The registry provides the ‘big picture view’ about the federation for both human users and online services (such as service monitors).
|GOCDB
|-
!Information System
|A database that provides real-time view about the actual capabilities and load of federation participants. Can be used by both human users and online services.
|BDII
|-
! Virtual Machine Image Catalogue
| A catalogue of Virtual Machine Images (VMIs) that encapsulate those software configurations that are useful and relevant for the given community (typically pre-configured scientific models and algorithms).
|AppDB
|-
! Image replication mechanism
| A system that automatically replicates VMIs from the federation VMI catalogue to each of the member sites, as well as removes them when needed. Automated replication can ensure consistency of capabilities across sites and is very often coupled with a VMI vetting process to ensure that only properly working, and relevant VMIs are replicated to the cloud sites of the community.
| vmcatcher/vmcaster
|-
! Single sign-on for users
| Ensuring that users of the federation need to register for access only once before they can use the federated services. Single sign-on is increasingly implemented in the form of identity federations in both industry and academia.  
| IGTF X509 proxies with VOMS extensions
|-
! Integrated view about resource/service usage
|A system that pulls together usage (accounting) information from the federated sites and services, integrates the data and presents them in such a way that both individual users and communities can monitor their own resource/service usage across the whole federation.
| Cloud Usage Record, APEL Accounting repository and portal
|-
! Integrated interfaces or user environments
|Having interfaces through which users and user applications can interact with the services offered by the various cloud providers. In case of an IaaS cloud federation these interfaces offer compute, storage and network management capabilities.
|OCCI API and OpenStack API
|-
!Availability Monitoring
| Use a shared system to monitor and collect availability and reliability statistics about the distributed cloud service providers and to retrieve this information programmatically.
| ARGO monitoring system
|-
! Federated service management tools
| A set of processes, policies, activities and supporting tools customized to the federated cloud.
| EGI federated service management
|}


The [[Federated_Cloud_VM_Management|VM Management]] scenario page contains detailed information on the support for OCCI on different Cloud Management Stacks.
= EGI cloud realms =


====  OCCI extensions for FedCloud ====
The EGI Federated Cloud can support multiple cloud federations (community specific, private or public). Based on the EGI federation services and custom external solutions, any scientific community can create a federated cloud. Each community or e-infrastructure that wants to build a cloud federation decides the services required to support their computational needs. Because these cloud federations are largely built from tools and services of the same solution portfolio, they can maintain the portfolio together; they can share best practices, and can offer user support and training in a collaborative fashion.


Contextualization is the process of installing, configuring and preparing software at boot time on a pre-defined virtual machine image (e.g. setting the hostname, IP addresses, SSH authorized keys, starting services, installing applications, etc.). OCCI v1.1 (the current version of the standard) lacks mechanisms for this contextualization of VMs, hence we have proposed the use of new OCCI mixins whose attributes hold user-provided data with the context information for the VM.
EGI currently operates two realms: the '''Open Standards Realm''' and the '''OpenStack Realm'''. Both are completely integrated with the EGI federator services described above but use different interfaces to offer the IaaS capabilities to the users: the Open Standards Realm uses the OCCI standard (supported by providers with OpenNebula, OpenStack or Synnefo cloud management frameworks), while the OpenStack Realm uses the native OpenStack Nova API (support limited to OpenStack providers). The OpenStack Realm was introduced in the federation in November 2015 and can co-exist with the Open Standards Realm within the same resource provider.


The mixins are:
{| class="wikitable" style="margin: auto;"
 
|-
{| class="wikitable"
! Service
! Open Standards Realm
! OpenStack Realm
|-
! IaaS interface
| style="text-align: center;" | [[Federated_Cloud_Technology#OCCI|OCCI]]
| style="text-align: center;" | [[Federated_Cloud_Technology#OpenStack_Compute|OpenStack Compute API]]
|-
! Service Registry
| colspan="2" style="text-align: center;" | [[Federated_Cloud_Technology#GOCDB | GOCDB]]
|-
! Single sign-on
| colspan="2" style="text-align: center;" | [[Federated_Cloud_Technology#Virtual_Organisation_Management_.26_AAI | X.509 proxies with VOMS extensions]]
|-
! Accounting
| colspan="2" style="text-align: center;" | [[Federated_Cloud_Technology#Accounting |Cloud Usage Record]]
|-
! Information discovery
| colspan="2" style="text-align: center;"| [[Federated_Cloud_Technology#Information_Discovery|BDII]]
|-
|-
! term !! scheme !! attributes
! VM Image catalogue
| colspan="2" style="text-align: center;" | [[Federated_Cloud_Technology#AppDB_Cloud_MarketPlace|AppDB]]
|-
|-
| <code>user_data</code>
! VM Image distribution
| <code><nowiki>http://schemas.openstack.org/compute/instance#</nowiki></code>
| colspan="2" style="text-align: center;" | [[Federated_Cloud_Technology#HEPiX_image_lists|HEPiX image lists]]
| <code>org.openstack.compute.user_data</code>: string that holds base64 encoded data to be available at the VM upon instantiation
|-
|-
| <code>public_key</code>
! Monitoring
| <code><nowiki>http://schemas.openstack.org/instance/credentials#</nowiki></code>
| colspan="2" style="text-align: center;" | [[Federated_Cloud_Technology#EGI_A.2FR_Monitoring|ARGO]]
| <code>org.openstack.credentials.publickey.name</code>: string with the name of the public key (optional)<br/><code>org.openstack.credentials.publickey.data</code>: string with the public key
|}
|}
Each Cloud Management stack provides its own mechanisms to make these data available at the VM. FedCloud recommends using [http://cloudinit.readthedocs.org/en/latest/ cloud-init] for handling the data. Cloud-init spares the user from dealing with the stack-specific ways of handling the contextualization information, and it is widely available in most OS versions and IaaS cloud platforms. By default cloud-init will:
* Put the SSH key into the ~/.ssh/authorized_keys of the root user (or equivalent).
* Execute the user-provided data upon instantiation, if it is a script.
More complex use-cases are supported, with documented examples in regular cloud-init documentation.
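As a minimal sketch of how the two mixins from the table above could be filled in, the following Python snippet base64-encodes a start-up script and collects the attribute names listed in the table. The function name <code>build_context_attributes</code> and the sample script and key are hypothetical; how the attributes are then sent to an OCCI endpoint depends on the client and is not shown.

```python
import base64

# Mixin schemes taken from the table of contextualization mixins.
USER_DATA_SCHEME = "http://schemas.openstack.org/compute/instance#"
PUBLIC_KEY_SCHEME = "http://schemas.openstack.org/instance/credentials#"


def build_context_attributes(script, public_key, key_name=None):
    """Return the OCCI attributes that carry contextualization data.

    Hypothetical helper: only the attribute names come from the table,
    everything else is an illustration.
    """
    # user_data must be a base64-encoded string.
    encoded = base64.b64encode(script.encode("utf-8")).decode("ascii")
    attributes = {
        "org.openstack.compute.user_data": encoded,
        "org.openstack.credentials.publickey.data": public_key,
    }
    # The key name is optional according to the table.
    if key_name is not None:
        attributes["org.openstack.credentials.publickey.name"] = key_name
    return attributes


if __name__ == "__main__":
    sample = build_context_attributes(
        "#!/bin/sh\necho hello\n", "ssh-rsa EXAMPLEKEY user@example", "mykey"
    )
    for name, value in sorted(sample.items()):
        print(name, "=", value)
```

Because the script is only encoded and never interpreted here, the same helper works for any payload that cloud-init accepts as user data.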
== Data management ==
Data management provides a data infrastructure for storing and retrieving data from anywhere at any time. Community cloud federations may choose any interoperable interface within their community to provide Data Management capabilities. The Public Federated Cloud uses the CDMI open standard for this purpose.
=== CDMI ===
The SNIA Cloud Data Management Interface (CDMI)<ref>SNIA Cloud Data Management Interface. http://www.snia.org/cloud</ref> defines a RESTful open standard for operations on storage objects. Semantically the interface is very close to AWS S3 and MS Azure Blob, but is more open and flexible for implementation.
CDMI offers clients a way of operating both on a storage management system and on single data items. The exact level of support depends on the concrete implementation and is exposed to the client as part of the protocol. The design of the protocol is aimed both at flexibility and efficiency. Certain heavyweight operations, e.g. blob download, can also be performed with a pure HTTP client to make use of the existing ecosystem of tools. CDMI is built around the concept of Objects, which vary in supported operations and metadata schema. Each Object has an ID, which is unique across all CDMI deployments.
There are 4 CDMI objects most relevant in the context of EGI’s Federated Cloud:
* '''Data object''': Abstraction for a file with rich metadata.
* '''Container''': Abstraction for a folder. Export to non-HTTP protocols is performed on the container level. A container may contain other containers.
* '''Capability''': Exposes information about the feature set of a certain object. CDMI supports partial implementation of the standard by defining optional features and parameters. In order to discover what functionality is supported by a specific implementation, a CDMI client can issue a GET request to a fixed URL: /cdmi_capabilities.
* '''Domain''': Deployment specific information.
Attachment of storage items to a VM can often be performed more efficiently using protocols like NFS or iSCSI. CDMI supports exposing this information via container metadata. A client can make use of this information to attach a storage item to a VM using OCCI.
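The capability discovery described above can be sketched as follows. The snippet only prepares the URL and headers of the GET request; the endpoint host is hypothetical, and the CDMI version string is an assumption that has to match what the target implementation supports.

```python
# Assumption: the specification version spoken by the target endpoint.
CDMI_VERSION = "1.0.2"


def capabilities_request(endpoint):
    """Return (url, headers) for a CDMI capability discovery request.

    Hypothetical helper; 'endpoint' is the base URL of a CDMI service.
    """
    # The capabilities object lives at a fixed path below the endpoint.
    url = endpoint.rstrip("/") + "/cdmi_capabilities"
    headers = {
        # Media type used by CDMI for capability objects.
        "Accept": "application/cdmi-capability",
        "X-CDMI-Specification-Version": CDMI_VERSION,
    }
    return url, headers


if __name__ == "__main__":
    url, headers = capabilities_request("https://cdmi.example.org:8443/")
    print("GET", url)
    for name, value in headers.items():
        print(name + ":", value)
```

The returned pair can be handed to any HTTP client, for example <code>urllib.request.Request(url, headers=headers)</code>; authentication is deployment specific and is left out of the sketch.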
= VM Image management =
In a distributed, federated Cloud infrastructure, users often need to manage and distribute their VM Images efficiently across multiple resource providers. The VM Image management subsystem provides the user with an interface into the EGI Cloud Infrastructure Platform to notify supporting resource providers of the existence of a new or updated VM Image. Sites then examine the provided information and, pending their decision, pool the new or updated VM Image locally for instantiation.
This concept introduces a number of capabilities into the EGI Cloud Infrastructure Platform:
* '''VM Image lifecycle management''' – Apply best practices of Software Lifecycle Management at scale across EGI
* '''Automated VM Image distribution''' – Publish VM images (or updates to existing images) once, while they are automatically distributed to the Cloud resource providers that support the publishing research community with Cloud resources.
* '''Asynchronous distribution mechanism''' – Publishing images and pooling these locally are intrinsically decoupled, allowing federated Resource Providers to apply local, specific processes transparently before VM images are available for local instantiation, for example:
** '''Provider-specific VM image endorsement policies''' – Not all federated Cloud resource providers will be able to enforce strict perimeter protection in their Cloud infrastructure as risk management to contain potential security incidents related to VM images and instances. Sites may implement a specific VM Image inspection and assessment policy prior to pooling the image for immediate instantiation.
The [https://appdb.egi.eu/ EGI AppDB] service has been extended to a Virtual Appliance Marketplace. This introduces a new category of software entries, called virtual appliances (VAs), which are, for all practical purposes, lean-and-mean virtual machine images designed to run on a virtualization platform. They provide a software solution out of the box, ready to be used within the EGI Federated Cloud infrastructure with minimal or no set-up. AppDB's Virtual Appliance Marketplace provides the ground for managing and publishing versioned repositories of virtual appliances, in a way that integrates with the existing HEPiX [https://github.com/hepix-virtualisation/vmcaster VMCaster] / [https://github.com/hepix-virtualisation/vmcatcher VMCatcher] framework, currently in use by EGI. Besides basic features embedded in the AppDB portal itself, such as creating, publishing, enabling/disabling, and archiving or deleting VA versions, it also uses the HEPiX [https://github.com/hepix-virtualisation/vmcaster VMCaster] command line tool for uploading full VA versions, as well as a web-based dashboard for monitoring the individual uploads injected from the command line.
Research Communities ultimately create and update VM Images (or delegate this functionality). The Images themselves are stored in Appliance repositories that are provided and managed elsewhere, typically by the Research Community itself. A representative of the Research Community then generates a VM Image list (or updates an existing one) using AppDB. Federated Cloud Resource Providers then subscribe to changes in VM Image lists by regularly downloading the list from AppDB and comparing it against local copies. New and updated VM Images are downloaded from the appliance repository referenced in the VM Image list into a local staging cache and, where required, made available for further examination and assessment.
Ultimately, Cloud resource Providers will make VM Images available for immediate instantiation by the Research Community.
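The comparison of a downloaded VM Image list against the local copy can be illustrated with a short Python sketch. It is not the vmcatcher implementation: the function name and the representation of a list as a mapping from image identifier to version string are assumptions made for the example.

```python
def plan_image_sync(local, published):
    """Compare a local image cache with a freshly downloaded image list.

    Both arguments map an image identifier to a version string
    (an assumed, simplified view of a VM Image list).
    """
    # Images that appear in the published list but are not cached yet.
    new = sorted(i for i in published if i not in local)
    # Images cached locally whose published version differs.
    updated = sorted(
        i for i in published if i in local and published[i] != local[i]
    )
    # Images cached locally that are no longer published.
    removed = sorted(i for i in local if i not in published)
    return {"new": new, "updated": updated, "removed": removed}


if __name__ == "__main__":
    cache = {"img-a": "1.0", "img-b": "2.1", "img-c": "0.9"}
    image_list = {"img-a": "1.0", "img-b": "2.2", "img-d": "1.0"}
    print(plan_image_sync(cache, image_list))
```

Images reported as new or updated would then be fetched into the staging cache and passed through the site's endorsement policy before being offered for instantiation.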
Information on the AppDB and the Cloud Marketplace can be found at [https://wiki.appdb.egi.eu AppDB documentation page]:
* [https://wiki.appdb.egi.eu/main:guides#cloud_marketplace Cloud MarketPlace Guide]
* [https://wiki.appdb.egi.eu/main:faq#cloud_marketplace Cloud MarketPlace FAQ]
= EGI Core services integration =
== Virtual Organisation Management & AAI: VOMS ==
Within EGI, research communities are generally identified and, for the purpose of using EGI resources, managed through “Virtual Organisations” (VOs). The Public EGI Cloud currently also uses VOs for authorization and authentication. Three VOs must be supported at every Resource Provider:
* [http://operations-portal.egi.eu/vo/view/voname/ops <code>ops</code> VO], used for monitoring purposes;
* [http://operations-portal.egi.eu/vo/view/voname/dteam <code>dteam</code> VO], used for testing purposes by site operators; and
* [http://operations-portal.egi.eu/vo/view/voname/dteam <code>fedcloud.egi.eu</code> VO], a catch-all VO that provides resources to users for a limited period of time (6 months initially) for prototyping and validation.
Resource Providers may support additional VOs in order to give access to other user communities.
Integration modules that have been developed by the task members are available for each Cloud Management Framework. Configuring these modules into a provider’s cloud installation allows members of these VOs to access the cloud. The user retrieves a VOMS attribute certificate from the VOMS server of the desired VO (currently, the Perun server for the <code>fedcloud.egi.eu</code> VO) and thus creates a local VOMS proxy certificate. The VOMS proxy certificate is used in subsequent calls to the OCCI endpoints of OpenNebula or OpenStack using the rOCCI client tool. The rOCCI client talks directly to OpenNebula endpoints, which map the certificate and VO information to local users. Local users need to have been created in advance, which is triggered by regular synchronizations of the OpenNebula installation with Perun.
In order to access an OpenStack OCCI endpoint, the rOCCI client first needs to retrieve a Keystone token from OpenStack Keystone. The retrieval is transparent to the user and automated in the workflow of accessing the OpenStack OCCI endpoint. It is triggered by the OCCI endpoint rejecting invalid requests and sending back an HTTP header referencing the Keystone URL for authentication. Users are generated on the fly in Keystone, so it does not need regular synchronization with the VO management server Perun.
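The redirection step can be sketched as a small header parser. The text above only states that the rejected request carries a header referencing the Keystone URL; the header name <code>WWW-Authenticate</code> and the value format <code>Keystone uri='...'</code> used here are assumptions, as is the function name.

```python
import re

# Assumed format of the challenge value: Keystone uri='https://host:5000/'
_CHALLENGE = re.compile(r"""Keystone\s+uri\s*=\s*['"]([^'"]+)['"]""")


def keystone_url_from_challenge(header_value):
    """Extract the Keystone URL from an authentication challenge.

    Returns None when the header does not reference Keystone.
    """
    match = _CHALLENGE.search(header_value)
    return match.group(1) if match else None


if __name__ == "__main__":
    challenge = "Keystone uri='https://keystone.example.org:5000/'"
    print(keystone_url_from_challenge(challenge))
```

A client following this flow would request a token from the extracted URL using its VOMS proxy and then repeat the original OCCI request with that token.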
Generic information about how to configure VOMS support for the supported Cloud Management Frameworks is available at [[MAN10]]. Information on how to add support for a new Virtual Organisation on the EGI Federated Cloud can be found at [[HOWTO16]].
== Information discovery: BDII ==
Users and tools can discover the available resources in the infrastructure by querying EGI information discovery services. The common information system deployed at EGI is based on the Berkeley Database Information Index (BDII) with a hierarchical structure distributed over the whole infrastructure.
The information system is structured in three levels. The services publish their information (e.g. specific capabilities, total and available capacity or user community supported by the service) using an OGF recommended standard format, GLUE2<ref>GLUE Specification V2.0, GFD-R-P.147, March 2009. http://www.ogf.org/documents/GFD.147.pdf</ref>. The information published by the services is collected by a Site-BDII, a service deployed at every site in EGI. The Site-BDIIs are queried by the Top-BDIIs, a national or regional level of the hierarchy, which contain the information of all the sites available in the infrastructure and their services. NGIs usually provide an authoritative instance of Top-BDII, but every Top-BDII, if properly configured, should contain the same set of information.
Resource Providers must provide a Site-BDII endpoint that publishes information on the available resources following the GLUE2 schema. Although the GLUE2 schema defines generic computing and storage entities, it was originally developed for Grid resources and can only partially represent the information needed by Cloud users. Thus, the EGI Federated Cloud is working within the GLUE2 WG at OGF to profile and extend the schema to represent Cloud Computing, Storage and, in the future, Platform and Software services. The proposed extensions are currently under discussion at the WG.
EGI provides an implementation for service-level information that generates information supporting OpenStack and OpenNebula; Synnefo support is currently being added. The information is published in a different subtree (<code>Glue2GroupID=cloud</code>) so it can coexist with grid information and is easily discoverable by users.
Information available for each provider:
* Cloud computing resources
* Service endpoint
* Capabilities provided by the service, such as: virtual machine management or snapshot taking. The labels that identify the capabilities are agreed within the taskforce.
* Interface, the type of interface – e.g. webservice or webportal – and the interface name and version, for example OCCI 1.2.0
* User authentication and authorization profiles supported by the service, e.g. X.509 certificates
* Virtual machine images made available by the cloud provider
* Resource templates (number of cores and physical memory) that can be allocated to a virtual machine.
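Since a BDII is an LDAP server, the information listed above can be retrieved with a standard LDAP client. The sketch below only assembles an <code>ldapsearch</code> command line and does not run it. The host name is hypothetical, and the port 2170, the base <code>o=glue</code> and the GLUE2 attribute names are assumptions about a typical BDII deployment that should be checked against the target site.

```python
def bdii_query_command(host, interface_name="OCCI", port=2170):
    """Build an ldapsearch command listing endpoint URLs for an interface.

    Hypothetical helper; the defaults reflect common BDII settings
    and are assumptions, not values taken from this page.
    """
    # Select endpoint objects that advertise the requested interface.
    ldap_filter = (
        "(&(objectClass=GLUE2Endpoint)"
        "(GLUE2EndpointInterfaceName=%s))" % interface_name
    )
    return [
        "ldapsearch",
        "-x",                      # simple (anonymous) bind
        "-LLL",                    # plain LDIF output without comments
        "-H", "ldap://%s:%d" % (host, port),
        "-b", "o=glue",            # GLUE2 information tree
        ldap_filter,
        "GLUE2EndpointURL",        # attribute to return
    ]


if __name__ == "__main__":
    print(" ".join(bdii_query_command("bdii.example.org")))
```

Returning the command as a list keeps the filter intact when it is passed to <code>subprocess.run</code>, since no shell quoting is involved.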
== Central service registry: GOCDB ==
EGI’s central service catalogue is used to catalogue the static information of the production infrastructure topology. The service is provided using the GOCDB tool that is developed and deployed within EGI. To allow Resource Providers to expose Cloud resources to the production infrastructure, a number of new service types were added to GOCDB:
* <code>eu.egi.cloud.accounting</code>
* <code>eu.egi.cloud.storage-management.cdmi</code>
* <code>eu.egi.cloud.vm-management.occi</code>
* <code>eu.egi.cloud.vm-metadata.marketplace</code>
* <code>eu.egi.cloud.vm-metadata.vmcatcher</code>
* <code>eu.egi.cloud.vm-metadata.appdb-vmcaster</code>
Special rules apply for the following service types:
* eu.egi.cloud.storage-management.cdmi: '''Endpoint URL''' field must contain the following info:
http[s]://hostname:port
* eu.egi.cloud.vm-management.occi: '''Endpoint URL''' field must contain the following info:
https://hostname:port/?image=&lt;image_name&gt;&amp;resource=&lt;resource_name&gt;
Both &lt;image_name&gt; and &lt;resource_name&gt; cannot contain spaces. These attributes map to the OCCI os_tpl and resource_tpl respectively.
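The rule for the <code>eu.egi.cloud.vm-management.occi</code> endpoint URL can be checked mechanically. The following Python sketch parses such a URL and enforces the constraints stated above; the function name, the returned dictionary layout and the example host are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_occi_endpoint(url):
    """Split an OCCI endpoint URL into host, port and template names.

    Raises ValueError when the URL does not follow the pattern
    https://hostname:port/?image=<image_name>&resource=<resource_name>.
    """
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError("endpoint must be an https URL with a hostname")
    # parse_qs drops empty values, so missing and blank names both fail.
    query = parse_qs(parts.query)
    if "image" not in query or "resource" not in query:
        raise ValueError("both image and resource parameters are required")
    image = query["image"][0]
    resource = query["resource"][0]
    if " " in image or " " in resource:
        raise ValueError("image and resource names cannot contain spaces")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "os_tpl": image,            # maps to the OCCI os_tpl
        "resource_tpl": resource,   # maps to the OCCI resource_tpl
    }


if __name__ == "__main__":
    print(parse_occi_endpoint(
        "https://occi.example.org:8787/?image=ubuntu&resource=small"
    ))
```

Note that <code>parse_qs</code> decodes <code>+</code> and <code>%20</code> to spaces, so encoded spaces are rejected as well.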
Higher level broker services also have their own service types:
* <code>eu.egi.cloud.broker.compss</code>
* <code>eu.egi.cloud.broker.proprietary.slipstream</code>
* <code>eu.egi.cloud.broker.vmdirac</code>
Further information about GOCDB can be found on the following page: [[GOCDB/Input System User Documentation]].
== Monitoring: SAM ==
Services in the EGI infrastructure are monitored via [[SAM|SAM (Service Availability Monitoring)]]. Specific probes to check functionality and availability of services must be provided by service developers. The current set of probes used for monitoring cloud resources consists of:
* OCCI probes (eu.egi.cloud.OCCI-VM and eu.egi.cloud.OCCI-Context): OCCI-VM creates an instance of a given image by using OCCI, checks its status and deletes it afterwards. OCCI-Context checks that the OCCI interface correctly supports the standard and the FedCloud contextualization extension.
* Accounting probe (eu.egi.cloud.APEL-Pub): Checks if the cloud resource is publishing data to the Accounting repository
* TCP checks (org.nagios.Broker-TCP, org.nagios.CDMI-TCP, org.nagios.OCCI-TCP and org.nagios.CloudBDII-Check): Basic TCP checks for services.
* VM Marketplace probe (eu.egi.cloud.AppDB-Update): gets a predetermined image list from AppDB and checks its update interval.
* Perun probe (eu.egi.cloud.Perun-Check): connects to the server and checks the status by using internal Perun interface
Probes for CDMI and the image synchronization mechanism are currently under development. More information on cloud probes can be found at [[Cloud SAM tests]].
Currently a [https://cloudmon.egi.eu/nagios central SAM instance] specific to the activities of the EGI Federated Clouds Task has been deployed for monitoring the test bed. Results of cloud probes are visible on the [http://mon.egi.eu/myegi/sa/ central SAM interface] under the profiles <code>ch.cern.sam-CLOUD-MON</code> and <code>ch.cern.sam-CLOUD-MON_CRITICAL</code>.
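To illustrate what a basic TCP check such as those listed above does, here is a minimal Python sketch. It is not one of the deployed probes: the function name is hypothetical, and the return values follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```python
import socket

# Conventional Nagios plugin return codes.
OK = 0
CRITICAL = 2


def check_tcp(host, port, timeout=5.0):
    """Return OK if a TCP connection to host:port can be opened."""
    try:
        # The connection is closed again as soon as it is established.
        with socket.create_connection((host, port), timeout=timeout):
            return OK
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return CRITICAL


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        sys.exit(check_tcp(sys.argv[1], int(sys.argv[2])))
    print("usage: check_tcp.py HOST PORT")
```

A check of this kind only shows that the port accepts connections; functional probes such as OCCI-VM are still needed to verify that the service behaves correctly.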
== Accounting ==
EGI Federated Cloud has agreed on a Cloud Usage Record, which inherits from the OGF Usage Record <ref>R. Mach, R. Lepor-Metz, S.Jackson, L.McGinnis, "Usage Record - Format Recommendation", GFD-R-P.098, https://www.ogf.org/documents/GFD.98.pdf</ref> and defines the data that resource providers must send to EGI’s central Accounting repository.
Support for retrieving the accounting data in this format is available from:
* OpenNebula: https://github.com/EGI-FCTF/opennebula-cloudacc
* OpenStack: https://github.com/IFCA/caso
* Synnefo provides its own internal component
Once generated, records are delivered via the network of EGI message brokers to the central accounting repository using APEL SSM (Secure STOMP Messenger) provided by STFC. SSM client packages can be obtained at https://apel.github.io. A Cloud Accounting Summary Usage Record has also been defined and summaries created on a daily basis from all the accounting records received from the Resource Providers are sent to the EGI Accounting Portal. The [http://accounting-devel.egi.eu/egi.php EGI Accounting Portal] also runs SSM to receive these summaries and provides a web page displaying different views of the Cloud Accounting data received from the Resource Providers.
= References =
<references/>

Revision as of 14:59, 3 March 2016





EGI Cloud Federation

The EGI Federated Cloud is a multi-national cloud system that integrates institutional clouds into a scalable computing platform for data and/or compute driven applications and services. The initial architecture of the EGI Federated Cloud was defined in 2011-2012 and was fully implemented by May 2014. Currently, the federation is a collaboration that enables various types of cloud federations to serve diverse demands of researchers from both academia and industry. The EGI Federated Cloud brings together scientific communities, R&D projects, technology and resource providers to form a community that integrates and maintains a flexible solution portfolio that enables various types of cloud federations with IaaS, PaaS and SaaS capabilities. The collaboration is committed to the use of open source tools and services that are reusable across scientific disciplines. These tools and services form a flexible portfolio from which a scientific community can mix and match items to establish its own, customised cloud federation.


The EGI Federated Cloud provides the services and technologies to create federations of clouds (community, private or public clouds) that operate according to the preferences, choices and constraints set by their members and users. The EGI Cloud Federations are modelled around the concept of an abstract Cloud Management stack subsystem that is integrated with components of the EGI Core Infrastructure and that provides a set of agreed uniform interfaces within the community it serves.

Federated Cloud Model

The EGI Cloud Federation (see Figure) is a hybrid cloud composed of public, community and private clouds, all supported by the EGI Core Infrastructure Platform services. It consists of multiple “realms”; a cloud realm is a subset of cloud providers exposing homogeneous cloud management interfaces and capabilities. The Open Standards Cloud Realm supports the usage of open standards for its interfaces and is completely integrated with the EGI Core Infrastructure Platform. A Community Platform provides community-specific data, tools and applications, which can be supported by one or more realms.

Services in cloud federations

Despite the large diversity in the type of cloud realms, a relatively small number of identical building blocks (or federator services) can be identified in almost all of them. These services turn individual clouds into a federation. The table collects these common services to help architects identify topics they should focus on when designing a cloud federation. Technical details for these are also available at Federated Cloud Technology.

{| class="wikitable"
! Federation Service !! Role within the federation !! Existing technical solution in EGI
|-
| Service Registry || A registry where all the federated sites and services are registered and state their capabilities. The registry provides the ‘big picture view’ of the federation for both human users and online services (such as service monitors). || GOCDB
|-
| Information System || A database that provides a real-time view of the actual capabilities and load of federation participants. Can be used by both human users and online services. || BDII
|-
| Virtual Machine Image Catalogue || A catalogue of Virtual Machine Images (VMIs) that encapsulate the software configurations that are useful and relevant for the given community (typically pre-configured scientific models and algorithms). || AppDB
|-
| Image replication mechanism || A system that automatically replicates VMIs from the federation VMI catalogue to each of the member sites, and removes them when needed. Automated replication can ensure consistency of capabilities across sites and is very often coupled with a VMI vetting process to ensure that only properly working and relevant VMIs are replicated to the cloud sites of the community. || vmcatcher/vmcaster
|-
| Single sign-on for users || Ensures that users of the federation need to register for access only once before they can use the federated services. Single sign-on is increasingly implemented in the form of identity federations in both industry and academia. || IGTF X.509 proxies with VOMS extensions
|-
| Integrated view about resource/service usage || A system that pulls together usage (accounting) information from the federated sites and services, integrates the data and presents them in such a way that both individual users and communities can monitor their own resource/service usage across the whole federation. || Cloud Usage Record, APEL Accounting repository and portal
|-
| Integrated interfaces or user environments || Interfaces through which users and user applications can interact with the services offered by the various cloud providers. In the case of an IaaS cloud federation, these interfaces offer compute, storage and network management capabilities. || OCCI API and OpenStack API
|-
| Availability Monitoring || A shared system that monitors the distributed cloud service providers, collects availability and reliability statistics about them, and makes this information retrievable programmatically. || ARGO monitoring system
|-
| Federated service management tools || A set of processes, policies, activities and supporting tools customized to the federated cloud. || EGI federated service management
|}
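
As a rough aid for the design exercise the table is meant to support, the federator services can be written down as a checklist. The following Python sketch takes the service names and EGI solutions from the table above; the `missing_services` helper and the `draft_design` example are hypothetical and only illustrate the idea.

```python
# Federator services and the EGI solution for each, taken from the table above.
FEDERATOR_SERVICES = {
    "Service Registry": "GOCDB",
    "Information System": "BDII",
    "Virtual Machine Image Catalogue": "AppDB",
    "Image replication mechanism": "vmcatcher/vmcaster",
    "Single sign-on for users": "IGTF X.509 proxies with VOMS extensions",
    "Integrated view about resource/service usage":
        "Cloud Usage Record, APEL Accounting repository and portal",
    "Integrated interfaces or user environments": "OCCI API and OpenStack API",
    "Availability Monitoring": "ARGO monitoring system",
    "Federated service management tools": "EGI federated service management",
}


def missing_services(design):
    """Return the federator services a design has not yet addressed, in table order."""
    return [service for service in FEDERATOR_SERVICES if service not in design]


# Hypothetical draft of a community federation that has chosen only two services so far.
draft_design = {
    "Service Registry": "GOCDB",
    "Availability Monitoring": "ARGO monitoring system",
}

for service in missing_services(draft_design):
    print(f"Not yet covered: {service} (EGI uses: {FEDERATOR_SERVICES[service]})")
```

A community is free to pick a different solution for any service; the dictionary values only record what EGI itself uses.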

EGI cloud realms

The EGI Federated Cloud can support multiple cloud federations (community-specific, private or public). Using the EGI federation services and custom external solutions, any scientific community can create a federated cloud. Each community or e-infrastructure that wants to build a cloud federation decides which services it requires to support its computational needs. Because these cloud federations are largely built from tools and services of the same solution portfolio, they can maintain the portfolio together, share best practices, and offer user support and training in a collaborative fashion.

EGI currently operates two realms: the Open Standards Realm and the OpenStack Realm. Both are completely integrated with the EGI federator services described above but use different interfaces to offer IaaS capabilities to users: the Open Standards Realm uses the OCCI standard (supported by providers running the OpenNebula, OpenStack or Synnefo cloud management frameworks), while the OpenStack Realm uses the native OpenStack Nova API (supported by OpenStack providers only). The OpenStack Realm was introduced in the federation in November 2015 and can co-exist with the Open Standards Realm within the same resource provider.

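{| class="wikitable"
! Service !! Open Standards Realm !! OpenStack Realm
|-
| IaaS interface || OCCI || OpenStack Compute API
|-
| Service Registry || colspan="2" | GOCDB
|-
| Single sign-on || colspan="2" | X.509 proxies with VOMS extensions
|-
| Accounting || colspan="2" | Cloud Usage Record
|-
| Information discovery || colspan="2" | BDII
|-
| VM Image catalogue || colspan="2" | AppDB
|-
| VM Image distribution || colspan="2" | HEPiX image lists
|-
| Monitoring || colspan="2" | ARGO
|}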
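
The relationship between cloud management frameworks, realms and IaaS interfaces described in this section can be summarised in a short Python sketch. The framework and interface names come from the text above; the data layout and the `realms_for` and `interfaces_for` helpers are hypothetical and do not correspond to any EGI API.

```python
# Cloud management frameworks supported by each realm, as stated in the text above.
REALM_FRAMEWORKS = {
    "Open Standards Realm": {"OpenNebula", "OpenStack", "Synnefo"},
    "OpenStack Realm": {"OpenStack"},
}

# IaaS interface offered to users by each realm.
REALM_INTERFACE = {
    "Open Standards Realm": "OCCI",
    "OpenStack Realm": "OpenStack Compute API",
}


def realms_for(framework):
    """Return the sorted names of the realms a provider running `framework` could join."""
    return sorted(
        realm
        for realm, frameworks in REALM_FRAMEWORKS.items()
        if framework in frameworks
    )


def interfaces_for(framework):
    """Return the IaaS interfaces such a provider could expose, one per realm."""
    return [REALM_INTERFACE[realm] for realm in realms_for(framework)]


for framework in ("OpenNebula", "OpenStack", "Synnefo"):
    print(framework, "->", interfaces_for(framework))
```

An OpenStack provider appears in both realms in this sketch, which reflects the statement that the two realms can co-exist within the same resource provider.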