HOWTO10 How to port application into EGI Federated Cloud
How To Port your application into the EGI Federated Cloud IaaS
The scope of this page is to provide a brief guide on how to integrate your application or web service into the EGI Federated Cloud environment. It is aimed at developers and administrators of applications and web services who have little or no knowledge of Cloud technologies.
This guide does not aim to cover exhaustively the problem of porting existing applications or web services to a Cloud environment; it gives only a generic overview. For more information, you can ask for direct support on the EGI Users Support Mailing List.
Although this guide aims to be as generic as possible, the technology solutions cited are specific to the EGI Federated Cloud environment and may differ in other environments.
Porting your application to a Cloud Infrastructure-as-a-Service (IaaS) environment is a task that can be performed at different levels and with different strategies. The choice depends on the type of application or web service, its resource consumption, the effort you want to spend adapting the application logic to the cloud environment, and the advantages that the advanced cloud features could bring to your application.
At the most basic level, Cloud IaaS lets you manually start (usually in a matter of minutes) a virtual server with an assigned IP address and a specific OS. This machine behaves like a normal server, and you have administrator access to it. You can log in to it with your favorite remote shell client (usually based on SSH), install your own software on it and configure the OS as you would on a normal server. When you no longer need the machine, you can simply delete it.
Going a step further, since Cloud is based on virtualization technologies, you can use a custom virtual disk, which you can prepare offline. For example, instead of logging in to the machine via SSH after startup to install your web service, you may build a custom OS disk with the web service already installed, upload it to the Cloud and start it directly, so that your machine comes up already configured.
Your own custom OS image, or a basic OS image, may also be personalized at startup by running a custom configuration script. The most fundamental personalization is setting up the credentials used to administer the machine remotely (usually temporary SSH keys generated by the user). This is very important if you use basic OS images, which are shared between users and need to be secured as soon as they start. This process is named contextualization. In principle, there is no limit to what can be done during contextualization: you may not only inject your own access credentials, but also install, update or configure your software.
When you move from a single-server application or web service to a more complex application, with multiple servers hosting the different components (e.g. database, web interface, load balancer, etc.), the servers can be deployed manually (setting up each service component one by one) or automatically, using the deployment tools provided by an infrastructure broker.
Another main feature of Cloud IaaS is that all its features can be accessed programmatically via a set of standard APIs. These functions may be used directly by your application, which could dynamically start and stop additional servers only when they are needed (e.g. under heavy workload). Such a scenario usually implies modifying the application to support the different cloud APIs and adding logic to decide when to start and stop machines. This task may be eased by using an application broker, described later in this guide. With an application broker, you only need to integrate the application into the broker, which then takes care of dynamically deploying the servers on the cloud.
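As an illustration of the "logic to decide when to start and stop new machines" mentioned above, the following is a minimal sketch and not part of any EGI tool. The function names, the notion of "pending jobs" and the capacity figures are all assumptions made for the example; a real application would replace the final printout with calls to the cloud API.

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many servers should be running for a given load?
# Arguments: pending jobs, jobs one server can handle, minimum and maximum pool size.
desired_servers() {
    local pending=$1 per_server=$2 min=$3 max=$4
    # Ceiling division: servers needed to absorb all pending jobs.
    local needed=$(( (pending + per_server - 1) / per_server ))
    # Clamp the result to the allowed pool size.
    (( needed < min )) && needed=$min
    (( needed > max )) && needed=$max
    echo "$needed"
}

# Compare the target with the number of running servers and print the action to take.
scaling_action() {
    local running=$1 target=$2
    if (( target > running )); then
        echo "start $(( target - running ))"
    elif (( target < running )); then
        echo "stop $(( running - target ))"
    else
        echo "none"
    fi
}

# Example: 2 servers running, 250 pending jobs, 100 jobs per server, pool of 1 to 10.
scaling_action 2 "$(desired_servers 250 100 1 10)"
```

With these example figures the script prints "start 1". Keeping the decision separate from the cloud calls makes the policy easy to test before it is connected to real resources.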
There are many different integration strategies you can use to port your application/web service into a cloud environment. Which one to use will depend on your particular application. In this section we will give a brief introduction of the most common strategies, trying to help you in finding the best one for your needs.
The details on how to actually implement the strategies and perform the integration in the FedCloud are reported in the following sub-sections.
1. Manual server setup
This is the most basic way to port your application, and the one that uses the fewest advanced Cloud features.
Using the cloud client, you can start a virtual server with a predefined OS. You can select the OS you need from a list of generic images (e.g. Ubuntu 11.04, CentOS 6.4, etc.). Once the server has started, you log in to it (via SSH) and install your own software. The server is then ready to be used: you can test it and give access to the final users of the web service or application.
In this scenario, contextualization is used only to set up your user credentials on the machine (by pushing your SSH public key). It may even be skipped entirely if the cloud infrastructure itself stores your access credentials and injects them into the machine.
Advantages:
- It is the easiest solution.
Disadvantages:
- If your application is composed of more than one server, you have to start them one by one, set them up and test the connectivity between them, as you would for multiple physical servers.
2. Basic OS image with contextualization
This is a small step above the manual server setup. The addition is the automatic installation and configuration of your application at startup, performed by a custom contextualization script. This script may be nothing more than a wrapper around a Bash or Python script, which is executed automatically at startup to install and configure the application. The required application packages and data are usually retrieved by the contextualization script from a web repository.
Advantages:
- The configuration of the server is faster than with the manual setup.
- Higher portability, which makes it easy to migrate your application to a different cloud environment. You can update your deployment simply by modifying the contextualization script, and you can rely on externally maintained basic OS images, without having to keep your own custom OS image up to date.
Disadvantages:
- Building the contextualization script requires some effort, which can be significant, especially for old and poorly documented applications.
3. Custom OS image
Packaging your application in a custom OS image is suggested in either of the following cases:
- your particular OS flavor is not available in the EGI.eu OS image repository
- it is too complex to build a contextualization script, or even to install your application manually
Your custom image will be uploaded to the cloud and started on demand. It is a virtual OS disk, which can be built and tested on your own computer. You can put everything you want into this OS image (application, dependencies and data), but be aware that the bigger the image, the slower it will start on the Cloud. A general recommendation is therefore to keep the image as small as possible, removing data or packages that are not needed or that can easily be retrieved on demand from remote web repositories.
If your application components are split across multiple servers, you can prepare different OS images (one for each application component) or you can use contextualization to set up each server for its specific purpose. For example, a custom OS disk image equipped with Apache may act as a load balancer or as a back-end server, depending on the contextualization script used to start the server.
Advantages:
- The virtual disk can be built directly from a legacy machine, by dumping the contents of its disk.
- Deployment is faster for applications with large and complex installation packages, because the application is already included in the image and does not need to be installed at startup.
Disadvantages:
- Building a virtual disk directly from a legacy machine raises compatibility issues with hardware drivers, which usually differ between virtual and physical environments, and even between different virtual environments.
- You need to keep your image up to date. Outdated OS disk images may take a long time to start because the latest OS updates have to be downloaded and installed.
- If you use special drivers, or the OS disk is not packaged correctly, your custom OS image may not run (or may run slowly) on cloud providers based on different virtualization technologies.
- OS disk images on public clouds are sometimes public, so be careful when installing proprietary software on custom OS images, since other users may be able to run or download the image.
- In general, the effort to implement this solution is higher than for basic contextualization.
4. Infrastructure broker
When you have complex applications with many components hosted on different servers, located on a single cloud site or split across multiple cloud sites, an infrastructure broker may help you deploy them. An infrastructure broker can run, usually via a web interface or custom APIs, a full deployment of multiple servers (in the order of thousands or more), using contextualization to set up the component dependencies.
The configuration of the deployment depends on the particular broker solution but, in general, each server may be started from a basic OS image or a custom OS image. In either case, contextualization scripts are needed to orchestrate the deployment.
5. Application broker
An application broker abstracts the Cloud APIs and frees you from the need to control the deployment of the application. Once you have integrated your application into the broker, it takes care of starting the virtual servers according to the workload, configuring them and dispatching the user jobs. The effort needed to integrate your application into the application broker depends on the application and on the broker itself. In most cases, however, you use basic OS images with contextualization, or a custom OS image, to which you add the worker node routines of the application broker.
When your application performs parallel processing, using an application broker may speed up its execution, since the broker takes care of submitting the processing to different servers, using a larger pool of resources.
The following table summarizes the solutions described in the paragraphs above.
|Integration Strategy||Description||Recommended for|
|Manual server setup|| ||Test applications, self-contained applications, "disposable" applications|
|Basic OS image with contextualization|| ||Web service applications, which usually stay on 24/7, with relatively infrequent application updates (e.g. monthly)|
|Custom OS image|| ||Applications that need special OS flavors or complex installation procedures, applications that are started and stopped very frequently (e.g. virtual servers started hourly or daily)|
|Infrastructure broker|| ||Multi-server applications that are started manually relatively frequently (e.g. processing clusters for absorbing peaks in processing, web service instances for training or demonstration purposes, etc.)|
|Application broker|| ||Asynchronous processing applications, where the workload is not constant but comes in bursts and the infrastructure utilization has to adapt dynamically to the application needs|
Porting your application to the EGI Federated Cloud
This chapter provides a step-by-step guide to porting your application or service to the cloud. The instructions below are organized according to the integration strategy you have identified for your application.
Note that the instructions in this chapter are specific to the technology solutions used in the EGI Federated Cloud.
Owing to the nature of the EGI Cloud Federation, you need to perform a set of preliminary steps before porting your application to the cloud:
- Get the credentials to access the FedCloud and join a virtual organization
- Contact the EGI.eu User Community Support Team to get the access to the resources of the EGI Cloud resource providers
The complete guide on how to perform these preliminary steps is reported in the Federated Cloud CLI environment configuration page.
1. Manual server setup
Step 1. Set up the CLI environment
The easiest way to manually start and stop your servers (and access the other FedCloud services) is to use the FedCloud CLI tools. More details on how to set up the FedCloud CLI environment are reported here.
Step 2. Browse AppDB and find a basic image
AppDB is the FedCloud marketplace. You can browse it to find a basic OS image that suits your application (e.g. Ubuntu 12.1, CentOS 5.2, etc.). Once you have chosen your OS, you need to get the OS image ID at the FedCloud site you want to use. To do so, you can follow this guide, ask the FedCloud site supporting your use case, or contact email@example.com.
NOTE: AppDB Virtual Appliances are not yet in operation. The OS image ID has to be mapped manually; please ask firstname.lastname@example.org for more information.
Step 3. Generate a set of SSH keys
To log in to the server, you need a pair of SSH keys. To generate one on a Linux machine, you can run:
ssh-keygen -t rsa -b 2048 -f tmpfedcloud
Step 4. Create a contextualization script to inject the key
A basic contextualization script is needed to configure your access credentials into the server. You can use the following commands to create the script
cat > tmpfedcloud.login << EOF
#cloud-config
users:
  - name: cloudadm
    sudo: ALL=(ALL) NOPASSWD:ALL
    lock-passwd: true
    ssh-import-id: cloudadm
    ssh-authorized-keys:
      - `cat tmpfedcloud.pub`
EOF
NOTE: For most Virtual Organizations, your login SSH key can be configured in the VO's VOMS server, which removes the need for a contextualization script. However, if you do not know how to inject your key into your VOMS server, or you are using basic OS disks not supported by your VO, it is highly recommended to use this method.
Step 5. Start the virtual server
Using the CLI environment, generate a temporary proxy and run the server using the contextualization script created in the step before, as reported in the sample commands below (for the fedcloud.egi.eu VO and the CESNET site). For more information on how to start a server using contextualization you can refer to the FedCloud FAQ page.
[myuser@mymachine ~]$ voms-proxy-init -voms fedcloud.egi.eu -out myproxy.out -rfc -dont-verify-ac
Enter GRID pass phrase:
Your identity: /O=dutchgrid/O=users/O=egi/CN=Name Surname
Creating temporary proxy ............................... Done
Contacting voms2.grid.cesnet.cz:15002 [/DC=org/DC=terena/DC=tcs/C=CZ/O=CESNET/CN=voms2.grid.cesnet.cz] "fedcloud.egi.eu" Done
Creating proxy ............................................. Done
Your proxy is valid until Thu Jan 9 22:22:32 2014
[myuser@mymachine ~]$ occi --endpoint https://carach5.ics.muni.cz:11443/ --action create -r compute -M resource#small -M os#sl6 --auth x509 --voms --user-cred myproxy.out -t occi.core.title="test" --context user_data="file://$PWD/tmpfedcloud.login"
https://carach5.ics.muni.cz:11443/compute/957d4dfe-ac80-48da-87cc-95430122d174
Step 6. Login and setup
Now, you can check if your server has started (active state), get its IP, connect to it via SSH using the generated temporary key and install your own application. For more information you can refer to the FedCloud FAQ page.
[myuser@mymachine ~]$ occi --endpoint https://carach5.ics.muni.cz:11443/ --action describe --resource /compute/957d4dfe-ac80-48da-87cc-95430122d174 --auth x509 --voms --user-cred myproxy.out
COMPUTE:
  ID: 957d4dfe-ac80-48da-87cc-95430122d174
  TITLE: test
  STATE: active
  MEMORY: 2.0 GB
  CORES: 1
  LINKS:
    LINK "http://schemas.ogf.org/occi/infrastructure#networkinterface":
      ID: /network/interface/1b3eab9b-d50d-4a09-ae07-bf20eb4b2957
      TITLE: admin
      TARGET: /network/admin
      IP ADDRESS: 22.214.171.124
      MAC ADDRESS: fg:16:3e:4d:7d:73
[myuser@mymachine ~]$ ssh -i tmpfedcloud email@example.com
Last login: Sat Dec 7 03:13:02 2013 from 126.96.36.199
[cloudadm@test ~]$ sudo su -
[root@test ~]$
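Checking the state and reading the IP address by eye works for one server, but it can also be scripted. The sketch below assumes that the describe output keeps the "STATE:" and "IP ADDRESS:" labels shown in Step 6; the function names are hypothetical, and the sample text uses a documentation-only address (192.0.2.10) rather than a real server.

```shell
#!/usr/bin/env bash
# Read `occi --action describe` text on stdin and print the value of STATE.
extract_state() {
    awk '$1 == "STATE:" { print $2; exit }'
}

# Read the same text on stdin and print the first IP ADDRESS value.
extract_ip() {
    awk '$1 == "IP" && $2 == "ADDRESS:" { print $3; exit }'
}

# Poll a compute resource until it reports the "active" state.
# Uses only the occi options that appear in the commands of this guide.
wait_until_active() {
    local endpoint=$1 resource=$2 proxy=$3
    until [ "$(occi --endpoint "$endpoint" --action describe --resource "$resource" \
                    --auth x509 --voms --user-cred "$proxy" | extract_state)" = "active" ]; do
        sleep 10
    done
}

# Offline demonstration on a shortened sample of the describe output.
sample='COMPUTE:
  TITLE: test
  STATE: active
  LINKS:
    LINK "http://schemas.ogf.org/occi/infrastructure#networkinterface":
      IP ADDRESS: 192.0.2.10'

printf '%s\n' "$sample" | extract_state   # prints: active
printf '%s\n' "$sample" | extract_ip      # prints: 192.0.2.10
```

The polling function is defined but not called here, since it needs a valid proxy and a running FedCloud endpoint.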
2. Basic OS image with contextualization
Step 1. Set up the CLI environment
The easiest way to manually start and stop your servers (and access the other FedCloud services) is to use the FedCloud CLI tools. To do so, you also need to request an authorized account and support from a Virtual Organization. More details on how to set up the FedCloud CLI environment are reported here.
Step 2. Browse AppDB and find a basic image
AppDB is the FedCloud marketplace. You can browse it to find a basic OS image that suits your application (e.g. Ubuntu 12.1, CentOS 5.2, etc.). Once you have chosen your OS, you need to get the OS image ID at the FedCloud site you want to use. To do so, you can follow this guide or ask the FedCloud site directly.
Step 3. Start the basic image
To build your contextualization script, you need to start a test server. You can do this in two different ways:
- Manually start a test server on the cloud, using the instructions in the paragraph "1. Manual server setup"
- Download the basic VM disk image from AppDB and run it on your local machine (you can use any virtualization server you want; VirtualBox is the recommended solution)
The second option gives you more control over the test server (for example, the possibility to take snapshots), but it requires some knowledge of virtualization technologies and is therefore not recommended for users without that experience.
Step 4. Build a deployment script
This step is optional, but recommended for the portability of your application. A deployment script is an automated script that, starting from a clean OS, installs all the dependencies of your web service or application, the web service or application itself and the data it requires, then configures and tests it.
The deployment script usually downloads the application packages, dependencies and data from a remote repository.
A deployment script is usually written in a scripting language, such as Bash or Python, and runs as the root user on the machine. The easiest way to build this script is to copy all the commands you run from the shell to install the software. A sample deployment script is here.
You can test the deployment script on the VM you have started. It is recommended to reset the VM (by reverting to a clean OS snapshot or recreating the server) each time you run the deployment script.
To use the deployment script from your VM, it is recommended to upload it to a remote public repository. If you do not have one, you can use the FedCloud repository or services like pastebin.
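To make the structure of a deployment script more concrete, here is a minimal sketch. It is not the sample script referenced above: the package (httpd), the download URL and the installation path are placeholders, and the yum and service commands assume a RHEL-like image such as the sl6 image used earlier in this guide. By default the script only prints the commands it would run; setting DRY_RUN=0 as root would execute them.

```shell
#!/usr/bin/env bash
# Sketch of a deployment script; every package name, URL and path is a placeholder.
set -e                                   # stop at the first failing step

DRY_RUN=${DRY_RUN:-1}                    # 1 = only print the commands
APP_URL=${APP_URL:-http://repo.example.org/myapp-1.0.tar.gz}   # hypothetical repository

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

deploy() {
    run yum -y install httpd                           # 1. install dependencies
    run curl -fsSL -o /tmp/myapp.tar.gz "$APP_URL"     # 2. download the application
    run mkdir -p /opt/myapp
    run tar -xzf /tmp/myapp.tar.gz -C /opt/myapp       # 3. unpack it
    run service httpd start                            # 4. start the service
    run curl -fsS http://localhost/                    # 5. smoke test
}

deploy
```

The dry-run switch is a design choice for this example: it lets you review the sequence of commands on any machine before running them on a clean test VM.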
Step 5. Build a contextualization script
EGI Federated Cloud uses CloudInit as its contextualization system. CloudInit offers a large set of features for customizing your machine at startup. Detailed documentation is provided here.
For example, if you have a deployment script stored in a remote repository, you can use the following CloudInit script to run it on the VM:
Content-Type: multipart/mixed; boundary="===============4393449873403893838=="
MIME-Version: 1.0

--===============4393449873403893838==
Content-Type: text/x-include-url; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="include.txt"

#include
http://appliance-repo.egi.eu/images/contextualization/Test-0.1/Test-deployment.sh

--===============4393449873403893838==
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata.txt"

#cloud-config
users:
  - name: cloudadm
    sudo: ALL=(ALL) NOPASSWD:ALL
    lock-passwd: true
    ssh-import-id: cloudadm
    ssh-authorized-keys:
      - <your SSH key>
--===============4393449873403893838==--
You can, of course, add more deployment scripts to the include section to set up the different components of your application.
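Writing the multipart document by hand becomes error-prone once several deployment scripts are involved. The following sketch generates it from an SSH public key and a list of script URLs. The function name make_user_data is hypothetical, the boundary string and the cloudadm user are taken from the example above, and it assumes that the include section accepts one URL per line.

```shell
#!/usr/bin/env bash
# make_user_data KEY URL...
# Print a CloudInit multipart document that includes every URL given
# and authorizes KEY for the cloudadm user.
make_user_data() {
    local key=$1; shift
    local boundary="===============4393449873403893838=="
    cat <<EOF
Content-Type: multipart/mixed; boundary="$boundary"
MIME-Version: 1.0

--$boundary
Content-Type: text/x-include-url; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="include.txt"

#include
$(printf '%s\n' "$@")

--$boundary
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="userdata.txt"

#cloud-config
users:
  - name: cloudadm
    sudo: ALL=(ALL) NOPASSWD:ALL
    lock-passwd: true
    ssh-import-id: cloudadm
    ssh-authorized-keys:
      - $key
--$boundary--
EOF
}
```

A possible invocation, using the temporary key generated earlier and two placeholder script locations, would be: make_user_data "$(cat tmpfedcloud.pub)" http://repo.example.org/db.sh http://repo.example.org/web.sh > userdata.txt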
Step 6. Start the server
To test the successful deployment on the cloud, you can start your server using the contextualization script created in the previous step. To do so, you can refer to the FedCloud FAQ page.
3. Custom OS image
Custom OS images can be obtained in different ways. The two main possibilities are to start from scratch, creating a virtual machine, installing an OS and the software on top of it, then taking the virtual machine OS disk as custom image, or to dump an existing disk from a physical server and modify it, if needed, to run on a virtualization platform. In this guide we will focus on the first option, because it tends to produce cleaner images and reduces the risks of hardware conflicts.
NOTE: Basic knowledge of virtualization technologies is required for this part of the guide
Step 1. Create a virtual machine on your local PC
As a prerequisite to this step, you need to install a virtualization server (hypervisor) on your machine. Many different hypervisor technologies exist; here we recommend Oracle VirtualBox.
After you have installed and configured your hypervisor, you need to create a new machine. Select a thin disk (space is allocated only as it is used) for the OS, with a limited size (e.g. 10 GB). Then run the VM and install your OS on it. For Linux, it is good practice not to use LVM or swap partitions during the installation. To keep the size of the VM low, it is highly recommended to install only a minimal version of the OS and add the features required by your application later.
NOTE: Instead of installing a new OS in the virtual machine, you can download an existing basic OS image from the marketplace and run it on your local PC, as explained in the previous chapter.
Step 2. Configure the network and contextualization on the VM
If you installed the OS from scratch, you will probably need to set up the OS to configure the network dynamically. To do so, you need to enable DHCP and, on Linux, disable udev rule generation (so that changes in the virtual network hardware are ignored). Check your OS administration guide for how to perform these tasks.
If you are going to use contextualization with your VM, you can set up a contextualization system in it. We recommend CloudInit, available for many OS distributions here.
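As a rough illustration of the two network tasks above, the sketch below writes a DHCP configuration for the first interface and masks the udev persistent-net rule generator. It assumes a RHEL/CentOS 6 style file layout (ifcfg files and the 70/75 persistent-net rule names); other distributions and releases use different files, so treat the paths as assumptions and check your OS administration guide. The function name and its ROOT argument are hypothetical; ROOT lets you try it on a scratch directory before running it against "/" inside the VM.

```shell
#!/usr/bin/env bash
# prepare_network ROOT
# ROOT is the root of the target file system ("/" when run inside the VM as root).
prepare_network() {
    local root=${1:-/}
    local ifcfg="$root/etc/sysconfig/network-scripts/ifcfg-eth0"
    local rules="$root/etc/udev/rules.d"

    mkdir -p "$(dirname "$ifcfg")" "$rules"

    # Let the first interface obtain its address via DHCP.
    # No HWADDR line, so the file is not tied to one virtual NIC.
    cat > "$ifcfg" <<EOF
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
EOF

    # Remove rules recorded for the build machine's NIC and
    # mask the generator so that they are not recreated.
    rm -f "$rules/70-persistent-net.rules"
    ln -sf /dev/null "$rules/75-persistent-net-generator.rules"
}
```

For a trial run, call prepare_network with a temporary directory and inspect the files it creates; inside the VM, call it with / as the argument.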
Step 3. Install your software in the machine
In the newly created VM you can install your application/web service and test it. When everything is installed, it is recommended to optimize the machine, by removing all the unnecessary services, packages, etc...
Step 4. Package the VM
After cleaning the disk, shut down your VM and export it in OVF format. If the contents of your VM are private (e.g. proprietary software is installed on the VM), you can encrypt the image with GPG, using a fixed random passphrase.
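Assuming VirtualBox is the hypervisor and GnuPG is installed, the export and encryption could look like the sketch below. The VM name and file names are placeholders, and the .ova extension (a single-file OVF archive) is a choice made for this example. The script prints the commands by default; set DRY_RUN=0 to execute them, in which case gpg will ask for the passphrase.

```shell
#!/usr/bin/env bash
DRY_RUN=${DRY_RUN:-1}                    # 1 = only print the commands

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# package_vm VM_NAME OUT_FILE
package_vm() {
    local vm=$1 out=$2
    # Export the powered-off VM as an OVF archive.
    run VBoxManage export "$vm" --output "$out"
    # Optional: symmetric encryption for images with private contents.
    run gpg --symmetric --cipher-algo AES256 --output "$out.gpg" "$out"
}

package_vm myapp-vm myapp-vm.ova
```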
Step 5. Upload the image to the FedCloud repository
After you have a VM, you need to upload it to a remote repository. You can use your own repository or the EGI Federated Cloud repository (as indicated here).
Step 6. Register the virtual appliance and its associated image(s) in AppDB
For this step, you can follow this guide.
Step 7. Start the server
After your image has been uploaded and registered, the FedCloud site supporting your VO will register it in its own system and assign it an OS disk image ID. You can then start the server as you would for a basic image. For more information, you can refer to the FedCloud FAQ page.
4. Infrastructure broker
Integration of your application into an infrastructure broker depends on the broker solution itself. In general, before selecting an infrastructure broker, it is recommended, where possible, to test the single components separately, for example by manually starting a server in the FedCloud.
For information about the infrastructure broker solutions that support the FedCloud (and other cloud technologies), you can refer to the FedCloud Brokering page.
5. Application broker
Integration of your application into an application broker depends on the broker solution itself. Application brokers may offer a sort of PaaS environment for the applications (e.g. Grid clusters, Hadoop clusters, etc.) or integrate applications via wrappers written in Java or other programming languages. For more details about the application broker solutions that support the FedCloud (and other cloud technologies), you can refer to the FedCloud Brokering page.