GPGPU-OpenNebula
Revision as of 08:06, 28 April 2017
Objective
To provide a testing Cloud site, based on the OpenNebula middleware, for testing GPGPU support.
Current status
The IISAS-Nebula site has been integrated into the EGI Federated Cloud and is accessible through the acc-comp.egi.eu VO.
HW configuration:
- Management services: OpenNebula Cloud controller and Site BDII in virtual servers on an IBM System x3250 M5 (1x Intel(R) Xeon(R) CPU E3-1241 v3 @ 3.50GHz, 16 GB RAM, 1 TB disk)
- 1 computing node: IBM dx360 M4 server with two NVIDIA Tesla K20 accelerators; CentOS 7 with KVM/QEMU, PCI passthrough virtualization of the GPU cards
- 2.8 TB block storage via NFS
SW configuration:
- Base OS: CentOS 7
- Hypervisor: KVM
- Middleware: OpenNebula 5.0.2
- OCCI server: rOCCI-server 2.0.0
GPU-enabled flavors:
- extra_large_2gpu: Extra Large Instance - 8 cores and 8 GB RAM + 2 GPU Nvidia K20m
- extra_large_gpu: Extra Large Instance - 8 cores and 8 GB RAM + 1 GPU Nvidia K20m
- goliath_2gpu: Goliath Instance - 14 cores and 56 GB RAM + 2 GPU Nvidia K20m
- goliath_gpu: Goliath Instance - 14 cores and 56 GB RAM + 1 GPU Nvidia K20m
- large_2gpu: Large Instance - 4 cores and 4 GB RAM + 2 GPU Nvidia K20m
- large_gpu: Large Instance - 4 cores and 4 GB RAM + 1 GPU Nvidia K20m
- mammoth_2gpu: Mammoth Instance - 14 cores and 32 GB RAM + 2 GPU Nvidia K20m
- mammoth_gpu: Mammoth Instance - 14 cores and 32 GB RAM + 1 GPU Nvidia K20m
- medium_2gpu: Medium Instance - 2 cores and 2 GB RAM + 2 GPU Nvidia K20m
- medium_gpu: Medium Instance - 2 cores and 2 GB RAM + 1 GPU Nvidia K20m
- mem_extra_large_2gpu: Extra Large Instance - 8 cores and 32 GB RAM + 2 GPU Nvidia K20m
- mem_extra_large_gpu: Extra Large Instance - 8 cores and 32 GB RAM + 1 GPU Nvidia K20m
- mem_large_2gpu: Large Instance - 4 cores and 16 GB RAM + 2 GPU Nvidia K20m
- mem_large_gpu: Large Instance - 4 cores and 16 GB RAM + 1 GPU Nvidia K20m
- mem_medium_2gpu: Medium Instance - 2 cores and 8 GB RAM + 2 GPU Nvidia K20m
- mem_medium_gpu: Medium Instance - 2 cores and 8 GB RAM + 1 GPU Nvidia K20m
- mem_small_2gpu: Small Instance - 1 core and 4 GB RAM + 2 GPU Nvidia K20m
- mem_small_gpu: Small Instance - 1 core and 4 GB RAM + 1 GPU Nvidia K20m
- small_2gpu: Small Instance - 1 core and 1 GB RAM + 2 GPU Nvidia K20m
- small_gpu: Small Instance - 1 core and 1 GB RAM + 1 GPU Nvidia K20m
EGI federated cloud configuration:
- GOCDB: IISAS-Nebula, https://goc.egi.eu/portal/index.php?Page_Type=Site&id=1785
- ARGO monitoring: http://argo.egi.eu/lavoisier/status_report-sf?site=IISAS-Nebula&report=Critical&accept=html
- OCCI endpoint: https://nebula2.ui.savba.sk:11443/
- EGI AppDB: https://appdb.egi.eu/store/site/iisas-nebula
- Supported VOs: acc-comp.egi.eu, ops, dteam
How to use IISAS-Nebula site
- Join the Accelerated_computing_VO
- The acc-comp.egi.eu VO is dedicated to users who develop and test applications/VMs that use GPGPUs or other types of accelerated computing.
- Install the rOCCI client
- More information about installing and using the rOCCI CLI can be found at HOWTO11_How_to_use_the_rOCCI_Client
- Get an RFC proxy certificate from the acc-comp.egi.eu VOMS server
$ voms-proxy-init --voms acc-comp.egi.eu -rfc
- Choose a suitable flavor from the list above
- Alternatively, you can list the available resource flavors using the OCCI client:
$ occi --endpoint https://nebula2.ui.savba.sk:11443/ --auth x509 --user-cred $X509_USER_PROXY --voms \
    --action describe --resource resource_tpl
- Choose a suitable image from the list of supported Virtual Appliance images
- The up-to-date list can be found in the EGI AppDB or obtained with the OCCI client:
$ occi --endpoint https://nebula2.ui.savba.sk:11443/ --auth x509 --user-cred $X509_USER_PROXY --voms \
    --action describe --resource os_tpl
- Create SSH keys and a contextualisation file
- Follow the guide at FAQ10_EGI_Federated_Cloud_User#Contextualisation
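Following that guide, the key pair and contextualisation file used in the later steps can be sketched as follows. This is a minimal example: the key name fedcloud and the user cloudadm match the commands below, while the exact cloud-config content is an assumption based on standard cloud-init, not a site requirement.

```shell
# Generate an SSH key pair named "fedcloud" (no passphrase, for brevity)
ssh-keygen -t rsa -b 4096 -f fedcloud -N "" -q

# Write a minimal cloud-init contextualisation file that creates the
# "cloudadm" user and authorises the public key just generated
cat > context_file <<EOF
#cloud-config
users:
  - name: cloudadm
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh-authorized-keys:
      - $(cat fedcloud.pub)
EOF
```

The resulting context_file is passed to the occi create command via --context user_data, and the fedcloud private key is used for the ssh login further below.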
- Create a VM from the selected image, flavor and context_file using the OCCI command
$ occi --endpoint https://nebula2.ui.savba.sk:11443/ --auth x509 --user-cred $X509_USER_PROXY --voms \
    --action create --resource compute \
    --mixin os_tpl#uuid_egi_centos_7_8 \
    --mixin resource_tpl#mem_medium_gpu \
    --attribute occi.core.title="Testing GPU" \
    --context user_data="file://$PWD/context_file"
- The command prints the ID (a URL) of your new VM
- Find out the IP address assigned to your new VM:
$ occi --endpoint https://nebula2.ui.savba.sk:11443/ --auth x509 --user-cred $X509_USER_PROXY --voms \
    --action describe --resource $VM_ID_URL | grep occi.networkinterface.address
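If you script these steps, the address can be pulled out of the describe output with a small helper. This is a sketch: the attribute line format and the sample IP address are assumptions for illustration, not real site output.

```shell
# Hypothetical helper: print the first IPv4 address found on stdin;
# the occi describe output is assumed to contain a line such as
#   occi.networkinterface.address = <IP>
extract_vm_ip() {
  grep -oE '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Example with a made-up attribute line (prints 147.213.76.100):
echo 'occi.networkinterface.address = 147.213.76.100' | extract_vm_ip
```

On the real site you would pipe the occi describe output shown above into extract_vm_ip and store the result in VM_PUBLIC_IP for the ssh step that follows.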
- Log into the new VM with your private key
$ ssh -i fedcloud cloudadm@$VM_PUBLIC_IP
- Note: use the username defined in your contextualisation file if it differs from cloudadm.
- Install Nvidia drivers (example installation for CentOS 7)
- Before installation it is recommended to update the image to get the latest security updates:
[cloudadm@localhost ~]$ sudo yum -y update ; sudo reboot
- Install the cuda-drivers package:
[cloudadm@localhost ~]$ sudo yum -y install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
[cloudadm@localhost ~]$ sudo yum -y install http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-8.0.61-1.x86_64.rpm
[cloudadm@localhost ~]$ sudo yum -y install cuda-drivers
[cloudadm@localhost ~]$ sudo reboot
- After the reboot the drivers should work; this can be verified by running the nvidia-smi tool:
[cloudadm@localhost ~]$ sudo nvidia-smi
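As a quick scripted sanity check, the number of visible GPUs can be counted from the nvidia-smi -L listing. This is a sketch: the "GPU N: ..." line format is an assumption about nvidia-smi output; on this site a *_2gpu flavor should report two Tesla K20m entries and a *_gpu flavor one.

```shell
# Hypothetical check: count "GPU N: ..." lines printed by nvidia-smi -L
count_gpus() { grep -c '^GPU [0-9]'; }

# On the VM itself you would run:  nvidia-smi -L | count_gpus
# Example with a made-up nvidia-smi -L line (prints 1):
printf 'GPU 0: Tesla K20m (UUID: GPU-example)\n' | count_gpus
```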
- Deploy your application into your VM
- After you finish working with your VM, don't forget to delete it to free GPU resources for other users
$ occi --endpoint https://nebula2.ui.savba.sk:11443/ --auth x509 --user-cred $X509_USER_PROXY --voms \
    --action delete --resource $VM_ID_URL