GPGPU-FedCloud
Revision as of 20:51, 23 October 2015
Objective
To provide support for accelerated computing in the EGI-Engage federated cloud.
Participants
Viet Tran (IISAS)
Jan Astalos (IISAS)
Miroslav Dobrucky (IISAS)
Current status
A working site with GPGPU support in the EGI federated cloud: https://cloudmon.egi.eu/nagios/cgi-bin/status.cgi?host=nova3.ui.savba.sk
HW configuration:
IBM dx360 M4 server with two NVIDIA Tesla K20 accelerators. Ubuntu 14.04.2 LTS with KVM/QEMU, PCI passthrough virtualization of GPU cards.
SW configuration:
Base OS: Ubuntu 14.04.2 LTS
Hypervisor: KVM
Middleware: OpenStack Kilo
GPU-enabled flavors: gpu1cpu6 (1 GPU + 6 CPU cores), gpu2cpu12 (2 GPUs + 16 CPU cores)
EGI federated cloud configuration:
GOCDB: IISAS-GPUCloud, https://goc.egi.eu/portal/index.php?Page_Type=Site&id=1485
OpenStack endpoint: https://keystone3.ui.savba.sk:5000/v2.0
OCCI endpoint: https://nova3.ui.savba.sk:8787
Supported VOs: fedcloud.egi.eu, ops, dteam, moldyngrid, enmr.eu, vo.lifewatch.eu
How to use GPGPU on IISAS-GPUCloud
For EGI users:
1. Join the EGI federated cloud: https://wiki.egi.eu/wiki/Federated_Cloud_user_support#Quick_Start
2. Get a VOMS proxy certificate from fedcloud.egi.eu or any supported VO with -rfc:
   voms-proxy-init --voms fedcloud.egi.eu -rfc
3. Choose a suitable flavor with GPU (e.g. gpu1cpu6; OCCI users: resource_tpl#f0cd78ab-10a0-4350-a6cb-5f3fdd6e6294)
4. Choose a suitable image (e.g. Ubuntu-14.04-UEFI; OCCI users: os_tpl#4aaf1abc-4c21-4192-ac52-8896757978be)
5. Create a keypair for logging in to your server (see https://wiki.egi.eu/wiki/Fedcloud-tf:CLI_Environment#How_to_create_a_key_pair_to_access_the_VMs_via_SSH)
6. Create a VM with the selected image, flavor and keypair. OCCI users:
   occi --endpoint https://nova3.ui.savba.sk:8787/ \
     --auth x509 --user-cred $X509_USER_PROXY --voms --action create --resource compute \
     --mixin os_tpl#4aaf1abc-4c21-4192-ac52-8896757978be --mixin resource_tpl#f0cd78ab-10a0-4350-a6cb-5f3fdd6e6294 \
     --attribute occi.core.title="Testing GPU" \
     --context user_data="file://$PWD/tmpfedcloud.login"
7. Assign a public (floating) IP to your VM using the VM ID returned by the previous command:
   occi --endpoint https://nova3.ui.savba.sk:8787/ \
     --auth x509 --user-cred $X509_USER_PROXY --voms --action link \
     --resource https://nova3.ui.savba.sk:8787/compute/$YOUR_VM_ID_HERE -j /network/nova
8. Log in to the VM and use it as your own GPU server.
Please remember to terminate your servers when you finish your jobs, to release resources for other users.
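The user steps above can be collected into a single script. The endpoint, template IDs and the context file are taken from the text; the variable names and the assumption that occi prints the new resource URL on stdout are mine, so treat this as a sketch rather than a tested workflow:

```shell
#!/bin/bash
# Sketch of the IISAS-GPUCloud workflow (assumes the occi CLI and VOMS tools
# are installed and the user is a member of fedcloud.egi.eu).
set -e

ENDPOINT=https://nova3.ui.savba.sk:8787/
OS_TPL=os_tpl#4aaf1abc-4c21-4192-ac52-8896757978be         # Ubuntu-14.04-UEFI
RES_TPL=resource_tpl#f0cd78ab-10a0-4350-a6cb-5f3fdd6e6294  # gpu1cpu6 flavor

# 1. Get an RFC VOMS proxy
voms-proxy-init --voms fedcloud.egi.eu -rfc

# 2. Create the VM (assumption: occi returns the new resource URL on stdout)
VM_URL=$(occi --endpoint $ENDPOINT --auth x509 --user-cred $X509_USER_PROXY --voms \
  --action create --resource compute \
  --mixin $OS_TPL --mixin $RES_TPL \
  --attribute occi.core.title="Testing GPU" \
  --context user_data="file://$PWD/tmpfedcloud.login")

# 3. Attach a floating IP so the VM is reachable over SSH
occi --endpoint $ENDPOINT --auth x509 --user-cred $X509_USER_PROXY --voms \
  --action link --resource $VM_URL -j /network/nova
```

This is a command sequence against a live cloud site, so it can only be verified with valid credentials against the endpoint above.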
For access to IISAS-GPUCloud via portal:
Contact cloud-admin _at_ savba.sk to get an account and access to the full-featured graphical portal.
How to create your own GPGPU server in cloud
This is a short guide to creating a GPGPU server in the cloud from a vanilla Ubuntu image.
1. Create a VM from a vanilla image (make sure to choose a flavor with GPU support)
2. Install gcc, make and the extra kernel modules: apt-get update; apt-get install gcc make linux-image-extra-virtual
3. Choose and download the correct driver from http://www.nvidia.com/Download/index.aspx, and upload it to the VM
4. Install the NVIDIA driver: ./NVIDIA-Linux-x86_64-346.96.run
5. Download the CUDA toolkit from https://developer.nvidia.com/cuda-downloads (choose the deb format for a smaller download)
6. Install the CUDA toolkit: dpkg -i cuda-repo-ubuntu*_amd64.deb; apt-get update; apt-get install cuda (very large install; it takes a long time)
7. Your server is now ready for your application. You can install additional software (NAMD, GROMACS, ...) and your own applications.

Be sure to make a snapshot of your server for later use. You may need to suspend the server before creating the snapshot (due to KVM passthrough). Do not terminate your server before creating a snapshot; the whole server is deleted when terminated.
#!/bin/bash
# Viet Tran, IISAS, 2015
# Script for installing all packages needed for GPU apps on a vanilla Ubuntu image
# Make sure to choose cloud image with UEFI support
# https://cloud-images.ubuntu.com/trusty/current/trusty-server-cloudimg-amd64-uefi1.img
# Installing gcc, make and kernel-extra
sudo apt-get update
sudo apt-get -y install gcc make linux-image-extra-virtual
# Installing the NVIDIA driver
wget http://us.download.nvidia.com/XFree86/Linux-x86_64/346.96/NVIDIA-Linux-x86_64-346.96.run
chmod a+x NVIDIA-Linux-x86_64-346.96.run
sudo ./NVIDIA-Linux-x86_64-346.96.run -a
# Installing CUDA. You can choose only runtime lib (cuda-runtime-version)
# or all packages "sudo apt-get install cuda"
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda-runtime-7-5
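After the script finishes (and after a reboot, if the kernel was updated), the installation can be sanity-checked. nvidia-smi ships with the driver; the deviceQuery path is an assumption that the full "cuda" package was installed, since the runtime-only install does not include the samples:

```shell
# Check that the NVIDIA kernel driver sees the passed-through GPU
nvidia-smi

# Optional: build and run the deviceQuery CUDA sample
# (path is an assumption; requires the full "cuda" package, not cuda-runtime)
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
```

Both commands require a VM with an actual passed-through GPU, so they cannot be verified outside such an environment.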
How to enable GPGPU passthrough in OpenStack
For admins of cloud providers
1. On the compute node, get the vendor/product ID of your hardware: "lspci | grep NVIDIA" to get the PCI slot of the GPU, then "virsh nodedev-dumpxml pci_xxxx_xx_xx_x"
2. On the compute node, unbind the device from the host kernel driver
3. On the compute node, add pci_passthrough_whitelist = {"vendor_id":"xxxx","product_id":"xxxx"} to nova.conf
4. On the controller node, add pci_alias = {"vendor_id":"xxxx","product_id":"xxxx", "name":"GPU"} to nova.conf
5. On the controller node, enable PciPassthroughFilter in the scheduler
6. Create new flavors with "pci_passthrough:alias" (or add the key to an existing flavor), e.g. nova flavor-key m1.large set "pci_passthrough:alias"="GPU:2"
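As a concrete illustration of the steps above: 10de is NVIDIA's PCI vendor ID, but the product ID, the PCI address and the flavor sizes below are placeholders/assumptions and must be read from your own hardware with lspci:

```shell
# 1. Find the GPU and dump its libvirt node-device description
lspci -nn | grep -i nvidia
virsh nodedev-dumpxml pci_0000_83_00_0   # hypothetical PCI address

# 2. On the compute node, in nova.conf (product_id "1028" is an assumption,
#    check the [vvvv:pppp] pair printed by lspci -nn):
#      pci_passthrough_whitelist = {"vendor_id":"10de","product_id":"1028"}
# 3. On the controller node, in nova.conf:
#      pci_alias = {"vendor_id":"10de","product_id":"1028","name":"GPU"}
#    and add PciPassthroughFilter to the scheduler filters.

# 4. Create a two-GPU flavor and tag it with the alias
#    (name matches the site's gpu2cpu12; RAM/disk values are examples)
nova flavor-create gpu2cpu12 auto 65536 100 12
nova flavor-key gpu2cpu12 set "pci_passthrough:alias"="GPU:2"
```

This is a configuration fragment tied to a live OpenStack deployment, so it is shown for orientation only.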
Progress
- May 2015
- Review of available technologies
- GPGPU virtualisation in KVM/QEMU
- Performance testing of passthrough
HW configuration: IBM dx360 M4 server with two NVIDIA Tesla K20 accelerators. Ubuntu 14.04.2 LTS with KVM/QEMU, PCI passthrough virtualization of GPU cards.
Tested application: NAMD molecular dynamics simulation (CUDA version), STMV test example (http://www.ks.uiuc.edu/Research/namd/).
Performance results: the tested application runs 2-3% slower in a virtual machine compared to a direct run on the tested server. If hyperthreading is enabled on the compute server, vCPUs have to be pinned to real cores so that whole cores are dedicated to one VM. To avoid potential performance problems, hyperthreading should be switched off.
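One way to achieve the pinning described above in Kilo-era OpenStack (an assumption; the text does not say how the pinning was actually done) is to combine vcpu_pin_set with the dedicated CPU policy:

```shell
# Sketch: dedicate whole physical cores to VMs (assumes OpenStack Kilo).
# On the compute node, restrict nova to a fixed set of host cores in nova.conf
# (the core range 2-11 is an example):
#   vcpu_pin_set = 2-11

# Then mark a GPU flavor as requiring dedicated (pinned) vCPUs:
nova flavor-key gpu1cpu6 set hw:cpu_policy=dedicated
```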
- June 2015
- Creating cloud site with GPGPU support
Configuration: master node, 2 worker nodes (IBM dx360 M4 servers, see above)
Base OS: Ubuntu 14.04.2 LTS
Hypervisor: KVM
Middleware: OpenStack Kilo
- July 2015
- Creating cloud site with GPGPU support
Cloud site created at keystone3.ui.savba.sk: master + two worker nodes, configuration as reported above
Creating VM images for GPGPU (based on Ubuntu 14.04, with GPU driver and libraries)
- August 2015
- Testing cloud site with GPGPU support
Performance testing and tuning of GPGPU in OpenStack:
- comparing the performance of a cloud-based VM with non-cloud virtualization and the physical machine, finding discrepancies and tuning them
- setting the CPU flavor in OpenStack nova (performance optimization)
- adjusting the OpenStack scheduler
Started the process of integrating the site into EGI FedCloud:
- Keystone VOMS support being integrated
- OCCI in preparation, installation planned in September
- September 2015
Continued integration into EGI FedCloud
- October 2015
Full integration into EGI FedCloud; currently in the certification process
Support for the moldyngrid, enmr.eu and vo.lifewatch.eu VOs
- Next steps
Production and application support
Cooperation with the APEL team on accounting of GPUs