Difference between revisions of "GPGPU-FedCloud"

From EGIWiki

Revision as of 13:54, 30 April 2015

Status of accelerated computing in Clouds

Modifications or support are needed at all levels:

  • Chipset: hardware virtualization support (some limitations apply without it)
  • OS level: correct kernel configuration for the accelerators
  • Hypervisor: configuration of pass-through or vGPU
  • CMFs (cloud management frameworks): VM start, scheduler
  • FedCloud facilities: accounting, information discovery
  • Application: VM images with correct drivers for specific chipsets
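
The chipset-level requirement can be checked on a Linux x86 host by looking for the CPU virtualization flags. The sketch below is an illustration only: the function name and sample text are hypothetical, and it covers just the CPU flags (`vmx` for Intel VT-x, `svm` for AMD-V). IOMMU support, which pass-through also relies on, is not reported in `/proc/cpuinfo` and is not checked here.

```python
def hw_virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in /proc/cpuinfo-style text.

    'vmx' marks Intel VT-x and 'svm' marks AMD-V on Linux x86 hosts.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            # Keep only the two flags of interest from this CPU's flag list.
            found |= {"vmx", "svm"} & set(value.split())
    return found


# Hypothetical excerpt; on a real host pass open("/proc/cpuinfo").read() instead.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme vmx sse2\n"
print(sorted(hw_virtualization_flags(SAMPLE)))
```

An empty result suggests that the CPU lacks these extensions or that they are disabled in firmware.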


Accelerators

GPGPU (General-Purpose computing on Graphics Processing Units)

NVIDIA GPU/Tesla/GRID, AMD Radeon/FirePro, Intel HD Graphics,...

Virtualization using VGA pass-through, vGPU (GPU partitioning) - NVIDIA GRID accelerators

Intel Many Integrated Core Architecture

Xeon Phi Coprocessor

Virtualization using PCI pass-through

Specialized PCIe cards with accelerators

DSP (Digital Signal Processors)

FPGA (Field Programmable Gate Array)

Not commonly used in cloud environments

Hypervisors

QEMU/KVM

Supports only the pass-through virtualization model

vGPU support is under development
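
As an illustration of the pass-through model, the sketch below builds the kind of `<hostdev>` element that libvirt uses to hand a host PCI device to a QEMU/KVM guest. It assumes the site manages its VMs through libvirt, which this page does not state. The helper name and the PCI address are hypothetical.

```python
import xml.etree.ElementTree as ET


def pci_hostdev_xml(domain, bus, slot, function):
    """Build a libvirt <hostdev> element passing one host PCI device to a guest."""
    hostdev = ET.Element(
        "hostdev", {"mode": "subsystem", "type": "pci", "managed": "yes"}
    )
    source = ET.SubElement(hostdev, "source")
    # The address identifies the device on the host, as listed by lspci.
    ET.SubElement(source, "address", {
        "domain": f"0x{domain:04x}",
        "bus": f"0x{bus:02x}",
        "slot": f"0x{slot:02x}",
        "function": f"0x{function:x}",
    })
    return ET.tostring(hostdev, encoding="unicode")


# Hypothetical accelerator at host PCI address 0000:03:00.0
print(pci_hostdev_xml(0, 3, 0, 0))
```

The resulting fragment would go inside the `<devices>` section of the guest definition. The whole device is assigned to a single VM, consistent with the single-VM-per-node setup described under "Dedicated cloud site with GPGPU".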

Citrix XenServer 6, VMware ESXi 5.1

Support both pass-through and vGPU virtualization models

Limitations:

  • vGPU support requires certified server HW
  • Live VM migration is not supported
  • VM snapshot with memory is not supported

Cloud Management Frameworks

Some initiatives exist, but none is complete

Work to be done:

  • Define VM types/flavors with attributes for GPGPU
  • Modify VM start to allow pass-through or allocate vGPU
  • Modify scheduler to allocate VMs with GPGPU correctly
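
The work items above can be sketched in a framework-neutral way: a flavor carries a GPU count, and the scheduler places a VM only on a host with enough free devices. Everything here (`Flavor`, `Host`, `pick_host`, `allocate`, and the preference for hosts with the fewest free GPUs) is a hypothetical illustration, not the API of any particular cloud management framework.

```python
from dataclasses import dataclass


@dataclass
class Flavor:
    name: str
    vcpus: int
    gpus: int = 0  # number of GPGPU devices the VM needs


@dataclass
class Host:
    name: str
    free_vcpus: int
    free_gpus: int


def pick_host(flavor, hosts):
    """Return a host that fits the flavor, or None if no host fits."""
    candidates = [
        h for h in hosts
        if h.free_vcpus >= flavor.vcpus and h.free_gpus >= flavor.gpus
    ]
    if not candidates:
        return None
    # Prefer the fitting host with the fewest free GPUs, so that VMs without
    # GPU needs do not occupy nodes that carry scarce accelerators.
    return min(candidates, key=lambda h: h.free_gpus)


def allocate(flavor, hosts):
    """Reserve resources for the flavor on the chosen host and return that host."""
    host = pick_host(flavor, hosts)
    if host is not None:
        host.free_vcpus -= flavor.vcpus
        host.free_gpus -= flavor.gpus
    return host


hosts = [Host("gpu-node", 8, 2), Host("cpu-node", 8, 0)]
print(allocate(Flavor("small", vcpus=2), hosts).name)
print(allocate(Flavor("gpu.large", vcpus=4, gpus=2), hosts).name)
```

A real scheduler would also have to track which physical device is handed to which VM, so that VM start can configure pass-through or vGPU for it.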

VM images

VM images should contain proper drivers and libraries for specific accelerators

  • Not transferable from site to site

A more suitable approach is to use vanilla images, with GPU support provided by the cloud provider

  • Use VM contextualization, such as cloud-init, to install applications
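
A minimal sketch of such contextualization: the helper below renders cloud-init user data that installs packages and runs commands at first boot. The helper name and the package name `example-gpu-app` are hypothetical, and the `nvidia-smi` check assumes the provider's image already ships an NVIDIA driver.

```python
import json


def cloud_config(packages, commands):
    """Render a minimal #cloud-config document for cloud-init user data."""
    lines = ["#cloud-config", "package_update: true"]
    if packages:
        lines.append("packages:")
        # JSON string quoting is also valid YAML, so names are emitted safely.
        lines += [f"  - {json.dumps(name)}" for name in packages]
    if commands:
        lines.append("runcmd:")
        lines += [f"  - {json.dumps(command)}" for command in commands]
    return "\n".join(lines) + "\n"


# Hypothetical application package; the command only verifies the GPU is visible.
print(cloud_config(["example-gpu-app"], ["nvidia-smi"]))
```

The resulting text is passed as user data when the VM is created, so the same vanilla image can serve different applications.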

Alternatively, use VM snapshots

  • May require support from site admins

FedCloud facilities

AppDB

  • VM images are rather site-specific: does it make sense to use AppDB?

Information discovery

  • Should use a GLUE2 schema similar to that of grid sites with GPGPU

Accounting

  • How should GPU usage be accounted? (Again, to be coordinated with grid)
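
One possible convention, offered only as a sketch since the question above is still open: with pass-through a VM holds whole devices for its entire lifetime, so usage could be recorded as GPU-hours, the number of attached GPUs multiplied by wall-clock time. The function name and the timestamps are hypothetical.

```python
from datetime import datetime


def gpu_hours(gpus, start, end):
    """GPU-hours consumed by a VM: attached GPUs times wall-clock hours."""
    if end < start:
        raise ValueError("end precedes start")
    return gpus * (end - start).total_seconds() / 3600.0


# Hypothetical VM holding 2 GPUs for 5.5 hours.
usage = gpu_hours(2, datetime(2015, 4, 30, 8, 0), datetime(2015, 4, 30, 13, 30))
print(usage)  # 11.0
```

This measure ignores how busy the GPU actually was, and shared vGPU setups would need a finer-grained metric.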

Brokering, monitoring, VM management

Possible configurations

Dedicated cloud site with GPGPU

Homogeneous: identical worker nodes

Single VM type, single VM per node

  • Simple configuration, no conflicting resources, no need to modify scheduler

Example: Amazon EC2

Cloud site with OS-level hypervisor

VMs can have direct access to hardware resources and share them

Limited to guests that share the host's OS/kernel