Difference between revisions of "GPGPU-CREAM"
Revision as of 16:53, 7 July 2015
Goal
- To develop a solution enabling GPU support in CREAM-CE:
- For the most popular LRMSes already supported by CREAM-CE
- Based on GLUE 2.1 schema
Work plan
- Identifying the relevant GPGPU-related parameters supported by the different LRMSes and abstracting them into meaningful JDL attributes
- GPGPU accounting is expected to be provided by the LRMS log files, as is done for CPU accounting, and then to follow the same APEL flow
- Implementing the needed changes in CREAM-core and BLAH components
- Writing the infoproviders according to GLUE 2.1
- Testing and certification of the prototype
- Releasing a CREAM-CE update with full GPGPU support
Testbed
- 3 nodes (2x Intel Xeon E5-2620v2) with 2 NVIDIA Tesla K20m GPUs per node available at CIRMMP
- MoBrain applications installed: AMBER and GROMACS with CUDA 5.5
- Batch system/Scheduler: Torque 4.2.10 (compiled from source with NVML libs) / Maui 3.3.1
- EMI3 CREAM-CE
Progress
- May 2015:
- tested local AMBER job submission with the different Torque/NVIDIA GPGPU support options, e.g.:
qsub -l nodes=1:gpus=2:default
qsub -l nodes=1:gpus=2:exclusive_process
- June 2015:
- attributes "GPUNumber" and "GPUMode" added to the BLAH_JOB_SUBMIT command, e.g.:
BLAH_JOB_SUBMIT 2 [Cmd="/tmp/test.sh";GridType="pbs";Queue="batch";In="/dev/null";Out="~\/StdOutput";Err="~\/StdError";GPUNumber=1;GPUMode="default"]
BLAH_JOB_SUBMIT 2 [Cmd="test_gpu_blah.sh";GridType="pbs";Queue="batch";In="/dev/null";Out="StdOutput";Err="StdError";GPUNumber=1;GPUMode="exclusive_process"]
- this required modifications to blah_common_submit_functions.sh and server.c
- first implementation of the two new attributes for PBS/Torque:
GPUMode can have the following values for PBS/Torque:
- default - Shared mode available for multiple processes
- exclusive_thread - Only one COMPUTE thread is allowed to run on the GPU (v260 exclusive)
- prohibited - No COMPUTE contexts are allowed to run on the GPU
- exclusive_process - Only one COMPUTE process is allowed to run on the GPU
this required modifications to pbs_submit.sh
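As an illustration of how the two attributes could map onto the Torque resource request used in the May 2015 tests, here is a minimal shell sketch. It is not taken from pbs_submit.sh: the function name gpu_resource_spec is hypothetical, the fixed nodes=1 is an assumption (single-node job), and the thread-exclusive mode is spelled exclusive_thread as in Torque.

```shell
#!/bin/sh
# Hypothetical helper, NOT the actual pbs_submit.sh code.
# Builds the value passed to "qsub -l" from a GPU count (GPUNumber)
# and a GPU mode (GPUMode).
gpu_resource_spec() {
    gpu_number=$1
    gpu_mode=${2:-default}   # fall back to "default" when no mode is given

    # GPUNumber must be a positive integer.
    case $gpu_number in
        ''|*[!0-9]*)
            echo "invalid GPUNumber: $gpu_number" >&2
            return 1
            ;;
    esac
    if [ "$gpu_number" -le 0 ]; then
        echo "GPUNumber must be greater than 0" >&2
        return 1
    fi

    # Accept only the four GPUMode values listed for PBS/Torque.
    case $gpu_mode in
        default|exclusive_thread|prohibited|exclusive_process) ;;
        *)
            echo "invalid GPUMode: $gpu_mode" >&2
            return 1
            ;;
    esac

    # nodes=1 is an assumption of this sketch (single-node job).
    printf 'nodes=1:gpus=%s:%s\n' "$gpu_number" "$gpu_mode"
}
```

With this helper, a submission equivalent to the second BLAH example could be written as qsub -l "$(gpu_resource_spec 1 exclusive_process)" test_gpu_blah.sh. Rejecting unknown modes before calling qsub keeps the error close to the attribute that caused it.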
Next steps
- install Slurm on cegpu.cerm.unifi.it
- add the new JDL attributes to CREAM core