Difference between revisions of "GPGPU-FedCloud"
Revision as of 10:14, 24 April 2017
Objective
To provide support for accelerated computing in EGI-Engage federated cloud.
Participants
Viet Tran (IISAS) viet.tran _at_ savba.sk
Jan Astalos (IISAS)
Miroslav Dobrucky (IISAS)
Current status
Status of OpenNebula site wiki.egi.eu/wiki/GPGPU-OpenNebula
IISAS-GPUCloud site with GPGPU has been established and integrated into EGI federated cloud
HW configuration:
6 compute nodes: IBM dx360 M4 servers, each with two NVIDIA Tesla K20 accelerators. Ubuntu 14.04.2 LTS with KVM/QEMU, PCI passthrough virtualization of GPU cards.
SW configuration:
Base OS: Ubuntu 14.04.2 LTS
Hypervisor: KVM
Middleware: Openstack Liberty
GPU-enabled flavors: gpu1cpu6 (1 GPU + 6 CPU cores), gpu2cpu12 (2 GPUs + 12 CPU cores)
EGI federated cloud configuration:
GOCDB: IISAS-GPUCloud, https://goc.egi.eu/portal/index.php?Page_Type=Site&id=1485
Monitoring: https://cloudmon.egi.eu/nagios/cgi-bin/status.cgi?host=nova3.ui.savba.sk
Openstack endpoint: https://keystone3.ui.savba.sk:5000/v2.0
OCCI endpoint: https://nova3.ui.savba.sk:8787/occi1.1/
Supported VOs: fedcloud.egi.eu, ops, dteam, moldyngrid, enmr.eu, vo.lifewatch.eu, acc-comp.egi.eu
Applications being tested/running on IISAS-GPUCloud
MolDynGrid: http://moldyngrid.org/
WeNMR: https://www.wenmr.eu/
Lifewatch-CC: https://wiki.egi.eu/wiki/CC-LifeWatch
For information and support, please contact us via cloud-admin _at_ savba.sk
How to use GPGPU on IISAS-GPUCloud
For EGI users:
- Join the EGI federated cloud: https://wiki.egi.eu/wiki/Federated_Cloud_user_support#Quick_Start
- Install the rOCCI client if you do not have it already (on Linux, a single command):
    curl -L http://go.egi.eu/fedcloud.ui | sudo /bin/bash -
- Get a VOMS proxy certificate from fedcloud.egi.eu or any other supported VO, with the -rfc option (on the rOCCI client):
    voms-proxy-init --voms fedcloud.egi.eu -rfc
- Choose a suitable flavor with GPU (e.g. gpu1cpu6; OCCI users: resource_tpl#f0cd78ab-10a0-4350-a6cb-5f3fdd6e6294)
- Choose a suitable image (e.g. Ubuntu-14.04-UEFI; OCCI users: os_tpl#8fc055c5-eace-4bf2-9f87-100f3026227e)
- Create a keypair for logging in to your server; it is stored in the context file tmpfedcloud.login (see https://wiki.egi.eu/wiki/Fedcloud-tf:CLI_Environment#How_to_create_a_key_pair_to_access_the_VMs_via_SSH)
- Create a VM with the selected image, flavor and keypair (OCCI users: copy the following very long OCCI command):
    occi --endpoint https://nova3.ui.savba.sk:8787/occi1.1/ \
      --auth x509 --user-cred $X509_USER_PROXY --voms --action create --resource compute \
      --mixin os_tpl#8fc055c5-eace-4bf2-9f87-100f3026227e --mixin resource_tpl#f0cd78ab-10a0-4350-a6cb-5f3fdd6e6294 \
      --attribute occi.core.title="Testing GPU" \
      --context user_data="file://$PWD/tmpfedcloud.login"
  Remark: check the proper os_tpl ID with:
    occi --endpoint https://nova3.ui.savba.sk:8787/occi1.1/ \
      --auth x509 --user-cred $X509_USER_PROXY --voms --action describe --resource os_tpl | grep -A1 Ubuntu-14
- Assign a public (floating) IP to your VM, using the VM ID from the previous command and /occi1.1/network/PUBLIC:
    occi --endpoint https://nova3.ui.savba.sk:8787/occi1.1/ \
      --auth x509 --user-cred $X509_USER_PROXY --voms --action link \
      --resource https://nova3.ui.savba.sk:8787/occi1.1/compute/$YOUR_VM_ID_HERE -j /occi1.1/network/PUBLIC
- Log in to the VM with your private key and use it as your own GPU server:
    ssh -i tmpfedcloud cloudadm@$VM_PUBLIC_IP
  Remark: please update the VM OS immediately:
    sudo apt-get update && sudo unattended-upgrade; sudo reboot
- Delete your VM to release resources for other users:
    occi --endpoint https://nova3.ui.savba.sk:8787/occi1.1/ \
      --auth x509 --user-cred $X509_USER_PROXY --voms --action delete \
      --resource https://nova3.ui.savba.sk:8787/occi1.1/compute/$YOUR_VM_ID_HERE
Please remember to delete/terminate your servers when you finish your jobs to release resources for other users
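The OCCI commands in the steps above all repeat the same endpoint and authentication options. As a minimal sketch, they could be wrapped in small shell functions. The function names (occi_site, link_public_ip, delete_vm) and the OCCI_BIN variable are hypothetical helpers introduced here for illustration and are not part of rOCCI; only options that already appear in the commands above are used.

```shell
#!/bin/sh
# Sketch only: helper names and OCCI_BIN are illustrative assumptions.
OCCI_ENDPOINT="https://nova3.ui.savba.sk:8787/occi1.1/"
OCCI_BIN="${OCCI_BIN:-occi}"   # set OCCI_BIN=echo for a dry run

# Run the client with the endpoint and VOMS proxy options used above.
occi_site() {
    "$OCCI_BIN" --endpoint "$OCCI_ENDPOINT" \
        --auth x509 --user-cred "$X509_USER_PROXY" --voms "$@"
}

# Attach the public network to a VM; $1 is the VM ID returned by "create".
link_public_ip() {
    occi_site --action link \
        --resource "${OCCI_ENDPOINT}compute/$1" -j /occi1.1/network/PUBLIC
}

# Delete a VM to release its GPU for other users; $1 is the VM ID.
delete_vm() {
    occi_site --action delete --resource "${OCCI_ENDPOINT}compute/$1"
}
```

With OCCI_BIN=echo the functions only print the arguments they would pass, which may help to check the command line before running, for example, delete_vm "$YOUR_VM_ID_HERE" against the real site.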
For access to IISAS-GPUCloud via portal:
- Get a token issued by Keystone with your VOMS proxy certificate. You can use the tool from https://github.com/tdviet/Keystone-VOMS-client
- Log in to the Openstack Horizon dashboard with the token via https://horizon.ui.savba.sk/horizon/auth/token/
- Create and manage VMs using the portal.
Note:
- All network connections to/from VMs are logged and monitored by IDS.
- If you plan a long computation, please inform us in advance. VMs that stay inactive for a long time will be deleted to release resources.
- The default user account for VMs created from Ubuntu-based images via Horizon is "ubuntu".
- The default user account for VMs created by rOCCI is defined in the context file "tmpfedcloud.login".
How to create your own GPGPU server in cloud
This is a short guide to creating a GPGPU server in the cloud from a vanilla Ubuntu image.
- Create a VM from a vanilla image with UEFI support (e.g. Ubuntu-14.04-UEFI); make sure to use a flavor with GPU support
- Install gcc, make and the extra kernel modules: "apt-get update; apt-get install gcc make linux-image-extra-virtual"
- Choose and download the correct driver from http://www.nvidia.com/Download/index.aspx, and upload it to the VM
- Install the NVIDIA driver: "dpkg -i nvidia-driver-local-repo-ubuntu*_amd64.deb" (or "./NVIDIA-Linux-x86_64-*.run")
- Download the CUDA toolkit from https://developer.nvidia.com/cuda-downloads (choose the deb format for a smaller download)
- Install the CUDA toolkit: "dpkg -i cuda-repo-ubuntu*_amd64.deb; apt-get update; apt-get install cuda" (a very large install of 650+ packages that takes a long time, ~15 minutes) and set the environment (e.g. "export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}; export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}")
Your server is ready for your application. You can install additional software (NAMD, GROMACS, ...) and your own application now
For your convenience, a script for installing NVIDIA + CUDA automatically is available at https://github.com/tdviet/NVIDIA_CUDA_installer
Be sure to make a snapshot of your server for later use. You may need to suspend your server before creating the snapshot (due to KVM passthrough). Do not terminate your server before creating the snapshot: the whole server is deleted when it is terminated.
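The environment step of the installation relies on the ${VAR:+:${VAR}} expansion, which is easy to mistype (a stray space inside the assignment makes the export fail). A minimal sketch of the same step as a reusable function follows; cuda_env is a hypothetical helper name, not something shipped with the CUDA toolkit, and the install prefix is passed in as an argument.

```shell
#!/bin/sh
# Sketch only: cuda_env is a hypothetical helper, not part of the CUDA toolkit.
# $1 is the CUDA install prefix, e.g. /usr/local/cuda-8.0 as used above.
cuda_env() {
    cuda_home=$1
    # ${VAR:+:${VAR}} expands to ":$VAR" only when VAR is set and non-empty,
    # so no stray ":" (an implicit current directory) is appended.
    PATH="${cuda_home}/bin${PATH:+:${PATH}}"
    LD_LIBRARY_PATH="${cuda_home}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
    export PATH LD_LIBRARY_PATH
}
```

After sourcing this definition, calling cuda_env /usr/local/cuda-8.0 should make nvcc resolvable from the shell, assuming the toolkit was installed under that prefix.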
Verify if CUDA is correctly installed
]$ sudo apt-get install cuda-samples-8-0
]$ cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery
]$ sudo make
/usr/local/cuda-8.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o deviceQuery.o -c deviceQuery.cpp
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda-8.0/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o deviceQuery deviceQuery.o
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
mkdir -p ../../bin/x86_64/linux/release
cp deviceQuery ../../bin/x86_64/linux/release
]$ ./deviceQuery
./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Tesla K20m"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    3.5
  Total amount of global memory:                 4743 MBytes (4972937216 bytes)
  (13) Multiprocessors, (192) CUDA Cores/MP:     2496 CUDA Cores
  GPU Max Clock rate:                            706 MHz (0.71 GHz)
  Memory Clock rate:                             2600 Mhz
  Memory Bus Width:                              320-bit
  L2 Cache Size:                                 1310720 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 7
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = Tesla K20m
Result = PASS
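When the verification is run from a script (for example right after building a VM image), the deviceQuery output can be checked automatically instead of being read by eye. The following is a minimal sketch that assumes the output has the same shape as the transcript above; device_query_passed and detected_devices are hypothetical helper names introduced for illustration.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and read deviceQuery output on stdin.

# Succeeds (exit status 0) only if the output contains the final PASS line.
device_query_passed() {
    grep -q '^Result = PASS'
}

# Prints the number from the "Detected N CUDA Capable device(s)" line.
detected_devices() {
    sed -n 's/^ *Detected \([0-9][0-9]*\) CUDA Capable device(s).*/\1/p'
}
```

A possible use is ./deviceQuery | device_query_passed || echo "CUDA check failed", run from the deviceQuery sample directory.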
How to enable GPGPU passthrough in OpenStack
For admins of cloud providers
- On the computing node, get the vendor/product ID of your hardware: "lspci | grep NVIDIA" to get the PCI slot of the GPU, then "virsh nodedev-dumpxml pci_xxxx_xx_xx_x"
- On the computing node, unbind the device from the host kernel driver
- On the computing node, add "pci_passthrough_whitelist = {"vendor_id":"xxxx","product_id":"xxxx"}" to nova.conf
- On the controller node, add "pci_alias = {"vendor_id":"xxxx","product_id":"xxxx", "name":"GPU"}" to nova.conf
- On the controller node, enable PciPassthroughFilter in the scheduler
- Create new flavors with "pci_passthrough:alias" (or add the key to an existing flavor), e.g. nova flavor-key m1.large set "pci_passthrough:alias"="GPU:2"
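The nova.conf lines and the flavor key in the steps above embed JSON inside quoted strings, where a missing quote is easy to overlook. As a minimal sketch, the lines can be generated from the vendor and product IDs; the three function names are hypothetical helpers, while the option names and JSON keys are the ones quoted in the steps. The IDs remain arguments because they depend on the hardware.

```shell
#!/bin/sh
# Sketch only: these helper functions are hypothetical; the option names
# and JSON keys are the ones quoted in the steps above.

# Line for nova.conf on the computing node; $1 = vendor ID, $2 = product ID.
nova_pci_whitelist() {
    printf 'pci_passthrough_whitelist = {"vendor_id":"%s","product_id":"%s"}\n' "$1" "$2"
}

# Line for nova.conf on the controller node; $3 = alias name (e.g. GPU).
nova_pci_alias() {
    printf 'pci_alias = {"vendor_id":"%s","product_id":"%s", "name":"%s"}\n' "$1" "$2" "$3"
}

# Flavor key requesting $2 devices of alias $1, in the form used with
# "nova flavor-key <flavor> set" above.
flavor_alias_spec() {
    printf '"pci_passthrough:alias"="%s:%s"\n' "$1" "$2"
}
```

The printed lines are meant to be reviewed and pasted into nova.conf by hand, with the IDs reported by virsh nodedev-dumpxml substituted for the arguments; the sketch does not edit any configuration file itself.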
Progress
- May 2015
- Review of available technologies
- GPGPU virtualisation in KVM/QEMU
- Performance testing of passthrough
HW configuration: IBM dx360 M4 server with two NVIDIA Tesla K20 accelerators. Ubuntu 14.04.2 LTS with KVM/QEMU, PCI passthrough virtualization of GPU cards.
Tested application: NAMD molecular dynamics simulation (CUDA version), STMV test example (http://www.ks.uiuc.edu/Research/namd/).
Performance results: The tested application runs 2-3% slower in a virtual machine than when run directly on the tested server. If hyperthreading is enabled on the compute server, vCPUs have to be pinned to real cores so that whole cores are dedicated to one VM. To avoid potential performance problems, hyperthreading should be switched off.
- June 2015
- Creating cloud site with GPGPU support
Configuration: master node, 2 worker nodes (IBM dx360 M4 servers, see above)
Base OS: Ubuntu 14.04.2 LTS
Hypervisor: KVM
Middleware: Openstack Kilo
- July 2015
- Creating cloud site with GPGPU support
Cloud site created at keystone3.ui.savba.sk, master + two worker nodes, configuration reported above
Creating VM images for GPGPU (based on Ubuntu 14.04, GPU driver and libraries)
- August 2015
- Testing cloud site with GPGPU support
Performance testing and tuning with GPGPU in Openstack
  - comparing performance of cloud-based VM with non-cloud virtualization and physical machine, finding discrepancies and tuning them
  - setting CPU flavor in Openstack nova (performance optimization)
  - adjusting Openstack scheduler
Starting process of integration of the site to EGI FedCloud
  - Keystone VOMS support being integrated
  - OCCI in preparation, installation planned in September
- September 2015
Continue integration to EGI-FedCloud
- October 2015
Full integration to EGI-FedCloud, being in certification process
Support for moldyngrid, enmr.eu and vo.lifewatch.eu VO
- November 2015
Created a new authentication module for logging in to the Horizon dashboard via a Keystone token
Various client tools: getting a token, installing NVIDIA + CUDA
Participation in the EGI Community Forum in Bari
Site certified
- December 2015
User support: adding and testing images from various VOs, solving problems with multiple-VO users
Maintenance: security updates and minor improvements
- January 2016
Testing and performance tuning of OpenCL
Updating images with CUDA
Adding Openstack Ceilometer for better resource monitoring/accounting
- February-March 2016
Testing VM migration
Examining GLUE schemes
Examining accounting format and tools
- April 2016
Status report presented at EGI Conference 2016
- May 2016
GLUE2.1 draft discussed at GLUE-WG meeting and updated with relevant Accelerator card specific attributes.
GPGPU experimental support enabled on CESNET-Metacloud site. VMs with Tesla M2090 GPU cards tested with DisVis program.
Working on support for GPU with LXC/LXD hypervisor with Openstack, which would provide better performance than KVM.
- Next steps
Production, application support
Cooperation with APEL team on accounting of GPUs
Generating II according to GLUE 2.1