Federated Cloud Information Discovery

From EGIWiki

Scope

Integrating information from multiple resource providers

Members

Role | Institution | Name
Leader | EGI.eu | Peter Solagna


Roadmap

Documentation

Information that should be published by a cloud service

The following information needs were identified during the task force (TF) face-to-face (F2F) meeting:

Please add more points, or edit and comment on the list.

  1. What is the name of the resource and what type of interface can I use to manage instances on the resource?
    1. What is the endpoint I should contact to interact with the cloud management interface? (E.g. the URL of the web service/portal)
  2. What are the AuthN and AuthZ rules that operate on your cloud?
  3. What instances are already installed on the resource and am I allowed to upload my own instances?
  4. If I am able to upload instances, what instance formats does the resource accept?
  5. Is there a data interface available and if so what is it?
  6. What is the overall size of the resource?
  7. Are instance templates defined that limit the choice of instance scales I am able to run?
  8. What type of virtual network can I establish on the resource?
  9. Does the resource support cloud scalability through managed bursting to another external provider?

The following are questions on the dynamic information:

  1. I have a virtual instance that requires resources X, Y and Z. Does your cloud have A>X, B>Y and C>Z resources available?
  2. My instance is short-lived. Will its resource utilisation be captured in the information system, so that overprovisioning is avoided?
  3. What is the charging scheme and how much will using your cloud cost?
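The first dynamic question above boils down to a component-wise comparison between what a site has free and what an instance needs. The following is only a minimal sketch of that check: the `Resources` type and its field names (cores, memory, disk) are illustrative assumptions, not part of any schema discussed on this page, and it uses "greater than or equal" rather than the strict A>X of the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource triple; the field names are illustrative only."""
    cores: int
    memory_gb: int
    disk_gb: int


def can_host(available: Resources, requested: Resources) -> bool:
    """Return True when every available quantity covers the request."""
    return (
        available.cores >= requested.cores
        and available.memory_gb >= requested.memory_gb
        and available.disk_gb >= requested.disk_gb
    )


# Made-up numbers for illustration.
site_free = Resources(cores=24, memory_gb=96, disk_gb=500)
small_vm = Resources(cores=2, memory_gb=4, disk_gb=20)
large_vm = Resources(cores=32, memory_gb=64, disk_gb=100)

print(can_host(site_free, small_vm))  # True
print(can_host(site_free, large_vm))  # False: 32 cores requested, 24 free
```

A real broker would also need the published values to be fresh enough, which is what the second question is about.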

More information to publish

Storage capabilities

This section analyses the storage capability information to be published, without (necessarily) considering what the currently available GLUE2 schema allows.

Relevant information

The following table lists what can be inspected through the OCCI 1.2 spec:

Attribute | Comment
occi.storage.size | Size of the storage resource instance
occi.storage.status | Status of the storage instance (online, offline, backup, snapshot, ...)

These attributes describe an actual instance, which is not going to be published in the information system. What we want to advertise are the capabilities that can be requested from the cloud service.

Attribute | Comment
Max storage installed at the site | The total amount of disk space that the cloud site provides as a virtualized storage resource.
Max size of a single virtual storage resource | The maximum size of a single virtual storage resource.
Interfaces | How users and VMs can interact with the storage resources, e.g. CDMI.
Storage throughput | Max I/O speed allowed to VMs writing to or reading from the storage area.
Capabilities | Additional capabilities of a storage service, on top of create/delete/link. Examples could be backup or snapshot.

Please add more options to the table, or participate in the discussion on the FedCloud task force mailing list.

Network capabilities

The following table contains some of the Network capabilities that could be advertised through the information system:

Attribute | Comment
Internal bandwidth | The maximum bandwidth available between the virtual machines in the cloud
Outbound bandwidth | Bandwidth that can be allocated to each virtual machine towards destinations outside the cloud
Average latency | If the VMs are deployed on different physical sites, the latency between the instances can be higher and affect network performance. Low network latency values indicate that the virtual machines are physically instantiated in the same network.
IPv6 enabled | Can the virtual network be configured for IPv6?
Virtual private network enabled | Is it possible to set up a virtual private network, in order to increase the security and the isolation of the instantiated machines?

How to render this information in GLUE2

Note: the BDII service speaks only GLUE2, so the cloud information needs to be squeezed into the current set of GLUE2 entities. If the schema is extended to include cloud-specific entities, the extension needs to be officially approved by OGF and implemented in the glue-schema and glue-validator components deployed with the BDII.

Use the currently available GLUE2.0 entities

GLUE2 currently includes two main conceptual models, for Computing Elements and Storage Elements. These should be used to model the cloud capabilities while remaining compliant with the current GLUE2.0 schema.

Capabilities for cloud services

Note: capabilities in bold are new, i.e. not already in the GLUE2 specification. Adding new capabilities does not require an extension of the GLUE2 schema.
Please add new high-level capabilities if you feel that something is missing. These capabilities are used in the following entities.

Capability | Description
cloud.VMmanagement | The standard capability that every cloud service should publish if it allows users to instantiate/suspend/delete virtual machines
cloud.virtualImagesUpload | The capability that allows users to upload their own virtual images through the cloud interface
security.authentication/security.authorization | I would leave these capabilities, given that every cloud provider has authentication

Computing Service entity description

Attribute | Type | Multiplicity | Description
Creation time | .. | .. | ..
Validity | .. | .. | ..
ID | .. | .. | ..
Name | String | 1 | Human-readable name. It could be used to answer the question "what is the name of the resource?"
OtherInfo | String | n | Placeholder for information that does not fit into any other attribute. Cloud information that cannot be mapped to other attributes could be added here.
Capability | Capability_t | n | Lists the capabilities available for this service. The type Capability_t does not currently include cloud-specific capabilities, but being an open enumeration it can be extended with additional ones. Capabilities already available include security.accounting, security.authentication and information.logging. We could consider adding capabilities such as "cloud.vm.uploadImage" to answer the question "am I allowed to upload my own instances?". To identify cloud services, a new capability common to all cloud services regardless of their specific capabilities would be needed, for example "cloud.managementSystem" (a placeholder name only). At this design stage, resource providers could simply describe the capabilities they would like to publish; I (Peter) will try to group them and propose labels for the different capabilities.
Type | ServiceType_t | 1 | Type of service in a reverse namespace model, e.g. org.glite.lb or org.glite.wms. It could be org.opennebula, org.stratuslab or com.cloudsigma

A number of further attributes (static and dynamic), such as StatusInfo, TotalJobs and RunningJobs, could also be used by cloud services. Note that Location is a GLUE2 entity that can be linked to the Service entity; this could answer the question "Where is the cloud facility located?".
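To make the mapping above more concrete, the sketch below renders one cloud service as an LDIF entry of the kind an LDAP-based information provider could publish. It is an illustration under assumptions, not an agreed format: the DN layout (`GLUE2GroupID=resource,o=glue`), the LDAP attribute names (`GLUE2ServiceID`, `GLUE2EntityName`, `GLUE2ServiceType`, `GLUE2ServiceCapability`, `GLUE2EntityOtherInfo`) are believed to follow the GLUE2 LDAP rendering but should be checked against the deployed schema, and the service ID, name and `hypervisor=KVM` value are made up.

```python
def render_service_ldif(service_id, name, service_type, capabilities, other_info=()):
    """Render one service as an LDIF entry.

    The DN layout and attribute names are assumptions to be verified
    against the GLUE2 LDAP schema actually deployed with the BDII.
    """
    lines = [
        f"dn: GLUE2ServiceID={service_id},GLUE2GroupID=resource,o=glue",
        "objectClass: GLUE2Service",
        "objectClass: GLUE2ComputingService",
        f"GLUE2ServiceID: {service_id}",
        f"GLUE2EntityName: {name}",
        f"GLUE2ServiceType: {service_type}",
    ]
    # Multi-valued attributes are written as one line per value.
    lines += [f"GLUE2ServiceCapability: {cap}" for cap in capabilities]
    lines += [f"GLUE2EntityOtherInfo: {info}" for info in other_info]
    return "\n".join(lines) + "\n"


ldif = render_service_ldif(
    service_id="cloud.example.org_service",      # hypothetical ID
    name="Example Cloud",                        # hypothetical name
    service_type="org.opennebula",
    capabilities=["cloud.managementSystem", "cloud.vm.uploadImage"],
    other_info=["hypervisor=KVM"],               # hypothetical key=value
)
print(ldif)
```

Capabilities stay plain strings here because Capability_t is an open enumeration, so new cloud labels need no schema change.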

ComputingEndpoint description

Every ComputingService has one or more associated ComputingEndpoints. The endpoint is used to create, control and monitor computational activities.

Attribute | Type | Multiplicity | Description
CreationTime | .. | .. | I will skip the most general attributes, such as OtherInfo and Capability (described above).
URL | URI | 1 | Network location of the endpoint.
Capability | Capability_t | 0..n | The same field as in the Service entity. Some capabilities could be interface-specific. I would also replicate all the general capabilities for this entity.
Technology | EndpointTechnology_t | 1 | Examples are "webservice" and "corba". We could add "webportal" or something similar to clarify that the endpoint refers to a web application.
InterfaceName | InterfaceName_t | 1 (mandatory) | In the cloud case the interface could be OCCI, EC2, jclouds or "webinterface". This can answer the question: "what type of interface can I use to manage instances on the resource?"
InterfaceVersion | .. | .. | No description needed.
SupportedProfile | URI | * | Here we can define a set of profiles for the authN/authZ of the users, like uri:sec:x509.

ExecutionEnvironment

The ExecutionEnvironment class describes the hardware and operating system environment in which a job will run. It could be used to describe the VM images already available in the Cloud service.

Attribute | Type | Multiplicity | Description
Platform | Platform_t | 1 | The platform architecture; can be: amd64, i386, itanium, powerpc, sparc
TotalInstances/UsedInstances | - | - | These attributes are not relevant in a cloud environment, where the execution environments are deployed dynamically.
PhysicalCPUs | UInt32 | 0..1 | The physical CPUs are not relevant, I would say, in a virtualised environment.
LogicalCPUs | UInt32 | 0..1 | This attribute could be used to express the maximum number of cores that can be instantiated in a single VM of this type (likely it will be common to all the execution environments of the same cloud service).
MainMemorySize | UInt64 | 1 | Maximum memory that can be instantiated on a single VM.
OSFamily, OSName, OSVersion | (*) | 1 | Attributes which define the operating system available. There will be an execution environment for every virtual machine available in the cloud service. We should define some placeholders to create an ExecutionEnvironment stub describing the max cores/memory for the virtual machines uploaded by a user.

Deploy a new set of entities

This is the next step: define cloud-specific GLUE entities that extend the GLUE2 schema, in order to publish the cloud services in a standard way.

Technical implementation

For a first demo, the best technical choice is OpenLDAP, which is available on almost all *nix machines. In addition, OpenLDAP is the server used by the gLite BDIIs, so it would be easy to reuse the configuration file set-up used for the GRIS or the GIIS.

Resource providers LDAP servers

IMPORTANT: Fill in the table with the address of the LDAP server set up as an information provider for your test bed.

RP Name | Resource Centre | Address of the LDAP server ("ldap://hostname:2170")
CESNET | CESNET Cloud | ldap://carach5.ics.muni.cz:2170
KTH | KTH Cloud | ldap://egi.cloud.pdc.kth.se:2170
GWDG | GWDG Cloud | ldap://one.cloud.gwdg.de:2170
SARA | SARA Cloud | ldap://bdii.cloud.sara.nl:2170
CESGA | CESGA Cloud | ldap://ui.egi.cesga.es:2170
CYFRONET | CYFRONET Cloud | ldap://head.cloud.cyf-kr.edu.pl:2170
TCD | TCD Cloud | ldap://cagnode42.cs.tcd.ie:2170
GRNET | GRNET_OKEANOS | ldap://okeanos-is.hellasgrid.gr:2170
FZJ | FZJ Testbed | ldap://egi-cloud.zam.kfa-juelich.de:2170
CC-IN2P3 | CC-IN2P3 Cloud | ldap://cccldbdii01.in2p3.fr:2170
INFN | CNAF WNoDeS Cloud | ldap://test-wnodes-is.cnaf.infn.it:2170
CSIC | CSOC Scientific Cloud | ldap://cloud.ifca.es:2170

EGI Community Forum 2012 demo

RP Name | RP contact name | Resource Centre name to be published (was Site Name) | Country | Capabilities to be published (specify the endpoints supporting the capabilities!) | Other info to publish | VM Manager | Virtual images available (OSFamily, OSName, OSVersion) | Max cores | Max CPU speed | Max RAM
CESNET | Miroslav Ruda | CESNET Cloud | Czech Republic | cloud.managementSystem, cloud.vm.uploadImage, cloud.data.cdmi | | XEN | 1.) Linux, OpenSUSE, 11.4; 2.) Linux, Debian, 6.0.3 | 24 | | 96GB
KTH | Zeeshan Ali Shah | KTH-PDC Cloud | Sweden | cloud.managementSystem, cloud.vm.customimage, cloud.data.cdmi | | | | | |
GWDG | Piotr Kasprzak | GWDG Cloud | Germany | cloud.managementSystem, cloud.vm.uploadImage | | KVM | 1.) Linux, Scientific Linux, 6.1; 2.) Linux, Ubuntu, 11.10 | 8 | 2.4 GHz | 16GB
CYFRONET | Jan Meizner | CYFRONET Cloud | Poland | cloud.managementSystem, cloud.vm.uploadImage | | KVM | | 24 | | 48GB
CESGA | Alvaro Simon | CESGA Cloud | Spain | cloud.managementSystem, cloud.vm.customimage | | KVM | 1.) Linux, Scientific Linux, 5.5 | 264 | 2.6 GHz | 264GB

Distributed implementation

Publishing correct information in the information system must be the responsibility of the resource provider. To build a decentralized information system there is a need for:


A possible strategy is to base everything on the Top-BDII, which is the currently available technology. Ideally one LDAP server is sufficient for the resource providers. The Top-BDII can be configured to get the data from different LDAP servers, and merge them.
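The idea of collecting entries from several provider LDAP servers and merging them into one view can be illustrated with a small in-memory sketch. This is not how the Top-BDII is implemented; it only shows the merge step, assuming each provider's data has already been fetched into a `{dn: attributes}` dictionary. The site names, DNs and the `publishedBy` key are hypothetical.

```python
def merge_entries(sources):
    """Merge {dn: attributes} maps from several servers into one view.

    `sources` maps a provider name to the entries it publishes. If two
    providers publish the same DN, the first one seen wins and the clash
    is reported rather than silently overwritten.
    """
    merged = {}
    conflicts = []
    for provider, entries in sources.items():
        for dn, attrs in entries.items():
            if dn in merged:
                conflicts.append((provider, dn))
                continue
            # Record where the entry came from (illustrative extra key).
            merged[dn] = dict(attrs, publishedBy=provider)
    return merged, conflicts


sources = {
    "SITE-A": {"GLUE2ServiceID=svc-a,o=glue": {"GLUE2ServiceType": "org.opennebula"}},
    "SITE-B": {
        "GLUE2ServiceID=svc-b,o=glue": {"GLUE2ServiceType": "org.stratuslab"},
        "GLUE2ServiceID=svc-a,o=glue": {"GLUE2ServiceType": "com.cloudsigma"},
    },
}
merged, conflicts = merge_entries(sources)
print(sorted(merged))
print(conflicts)
```

Reporting duplicate DNs instead of overwriting them keeps each resource provider accountable for the entries it publishes.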

Pros:

Cons:

Find here some guidelines for the LDAP installation.

Example queries

  1. Get all the endpoints published by the resource providers, with the interface name and the version.
$ ldapsearch -x -H ldap://test03.egi.cesga.es:2170 -b o=glue '(objectClass=GLUE2Endpoint)' |
    perl -p00e 's/\r?\n //g' |
    grep -E 'GLUE2EndpointURL|GLUE2EndpointInterfaceName|GLUE2EndpointInterfaceVersion|dn\:' |
    awk '{printf("%s%s", $0, (NR%4 ? " === " : "\n"))}' |
    awk '{print ""$2" "$5" "$8" "$11}' |
    awk -F "GLUE2DomainID=" '{print $2}' |
    awk -F "," '{print $1 " "$3}' |
    awk '{print $1" "$4" "$3" "$5}' |
    sort
 
CC-IN2P3 OCCI 1.1 https://ccocci.in2p3.fr:8788
CESGA OCCI 1.1 http://cloud.cesga.es:3200
CESGA OCCI 1.1 http://meghacloud.cesga.es:3200
CESGA OCCI 1.1 https://cloud.cesga.es:3202
CESGA OCCI 1.1 https://meghacloud.cesga.es:3202
CESNET CDMI 1.0 https://carach3.ics.muni.cz:8080/
CESNET OCA 3.4.1 https://carach5.ics.muni.cz:6443/RPC2
CESNET OCCI 0.8 https://carach5.ics.muni.cz:9443/
CESNET OCCI 1.1 http://carach5.ics.muni.cz:3333/
CESNET OCCI 1.1 https://carach5.ics.muni.cz:10443/
CESNET Sunstone 3.4.1 https://carach5.ics.muni.cz/
csTCDie OCCI 1.1 https://cagnode42.cs.tcd.ie
csTCDie XML-RPC 1.4 https://cagnode42.cs.tcd.ie:2634
CYFRONET OCCI 1.1 http://cloud-lab.grid.cyf-kr.edu.pl:3200/
CYFRONET OCCI 1.1 https://cloud-lab.grid.cyf-kr.edu.pl:3443/
FZJ OCCI 1.1 https://egi-cloud.zam.kfa-juelich.de:8788/
GRNET_OKEANOS OCCI 1.1 http://okeanos-occi.hellasgrid.gr:8888
GWDG CDMI 1.0.1 http://cdmi.cloud.gwdg.de:4001
GWDG CDMI 1.0.1 https://cdmi.cloud.gwdg.de:4000
GWDG OCCI 0.8 http://occi.cloud.gwdg.de:3400
GWDG OCCI 1.1 http://occi.cloud.gwdg.de:3200
GWDG OCCI 1.1 http://occi.cloud.gwdg.de:5000
GWDG OCCI 1.1 https://occi.cloud.gwdg.de:3100
INFN_CNAF OCCI 1.1 https://test-wnodes-web01.cnaf.infn.it:8443/
SARA OCCI 0.8 https://occi.cloud.sara.nl/
SARA OCCI 1.1 https://occi11.cloud.sara.nl/
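The shell pipeline above depends on attribute order in the ldapsearch output. As a hedged alternative, the same listing can be produced by parsing the LDIF text by attribute name. The sketch below works on an embedded sample rather than a live server; the sample entries, the EXAMPLE-SITE name and the example.org URLs are made up, and base64-encoded values (`attr:: ...`) are not handled.

```python
import re

# Made-up sample in the shape of ldapsearch output; the DN is folded
# onto a continuation line (leading space) as LDIF does for long lines.
SAMPLE_LDIF = (
    "dn: GLUE2EndpointID=occi_1,GLUE2ServiceID=svc,GLUE2GroupID=cloud,GLUE2Do\n"
    " mainID=EXAMPLE-SITE,o=glue\n"
    "GLUE2EndpointURL: https://occi.example.org:3202\n"
    "GLUE2EndpointInterfaceName: OCCI\n"
    "GLUE2EndpointInterfaceVersion: 1.1\n"
    "\n"
    "dn: GLUE2EndpointID=cdmi_1,GLUE2ServiceID=svc,GLUE2GroupID=cloud,GLUE2Do\n"
    " mainID=EXAMPLE-SITE,o=glue\n"
    "GLUE2EndpointURL: https://cdmi.example.org:4000\n"
    "GLUE2EndpointInterfaceName: CDMI\n"
    "GLUE2EndpointInterfaceVersion: 1.0.1\n"
)


def list_endpoints(ldif_text):
    """Return sorted (site, interface, version, url) tuples from LDIF text."""
    # Unfold wrapped lines: a newline followed by one space is a continuation.
    unfolded = re.sub(r"\r?\n ", "", ldif_text)
    rows = []
    # Entries are separated by blank lines.
    for block in re.split(r"\n\s*\n", unfolded.strip()):
        attrs = {}
        for line in block.splitlines():
            key, sep, value = line.partition(": ")
            if sep:
                attrs[key] = value
        if "GLUE2EndpointURL" not in attrs:
            continue  # skip comment blocks and non-endpoint entries
        site = re.search(r"GLUE2DomainID=([^,]+)", attrs.get("dn", ""))
        rows.append((
            site.group(1) if site else "",
            attrs.get("GLUE2EndpointInterfaceName", ""),
            attrs.get("GLUE2EndpointInterfaceVersion", ""),
            attrs["GLUE2EndpointURL"],
        ))
    return sorted(rows)


for row in list_endpoints(SAMPLE_LDIF):
    print(*row)
```

To run it against real data, the output of the ldapsearch command could be saved to a file and passed to `list_endpoints` as a string.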


Implementation of the cloud service types in GOCDB

GOCDB is the EGI service registry. It contains the service endpoints (example), the grid site topology and other information such as the downtime register and contact lists. GOCDB does not contain dynamic information such as the number of cores available or resource capacity.

The plan for inclusion in GOCDB could be as follows:

  1. Definition of the new service types:
    1. Resource providers service types:
      • org.ogf.OCCI: service exposing OCCI interface
      • org.snia.CDMI: service exposing CDMI interface
      • org.opennebula.OCA: OpenNebula management interface
      • eu.egi.cloud-site-bdii: cloud site information provider
      • eu.egi.cloud-accounting: Accounting data parser
      • ..more?
    2. Infrastructure services
      • org.stratuslab.marketplace: StratusLab marketplace
      • .. more?
  2. Registration of the cloud resource centres
    1. As non-EGI for the first iteration (the flag can be changed easily)
    2. Fill the resource provider contacts (site manager, security officer..)
    3. Add the service endpoints to the site
    4. The GIIS URL must be in the format: ldap://<site bdii url>:2170/o=glue
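The required GIIS URL format can be checked mechanically before a site is registered. The helpers below are a small sketch using only the Python standard library; the function names are hypothetical and the host is taken from the LDAP server table earlier on this page.

```python
from urllib.parse import urlparse


def giis_url(site_bdii_host, port=2170, base="o=glue"):
    """Build a GIIS URL of the form ldap://<site bdii host>:2170/o=glue."""
    return f"ldap://{site_bdii_host}:{port}/{base}"


def is_valid_giis_url(url):
    """Check scheme, host, port and base DN against the required format."""
    parts = urlparse(url)
    try:
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    return (
        parts.scheme == "ldap"
        and bool(parts.hostname)
        and port == 2170
        and parts.path == "/o=glue"
    )


url = giis_url("bdii.cloud.sara.nl")
print(url)                                                   # ldap://bdii.cloud.sara.nl:2170/o=glue
print(is_valid_giis_url(url))                                # True
print(is_valid_giis_url("ldap://bdii.cloud.sara.nl:2170"))   # False: base DN missing
```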
