HPC:GPU
This section describes how to use the GPU resources available via the HPC system.
Hardware
There are currently two GPU nodes available via the HPC. Below are the hardware specifications:
- 2x GPU nodes, each configured with:
  - 2x 22-core Intel Xeon E5-2699 v4 2.20GHz CPUs (88 threads per node, with hyperthreading turned on)
  - 512GB RAM
  - 1x Nvidia Tesla P100 16GB GPU card (3584 CUDA cores and 16GB RAM per card)
  - 100 Gb/s InfiniBand connection to the GPFS file system
  - 1.6TB dedicated scratch space provided by local NVMe
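The specifications above can be compared with what a node actually reports. The following is a minimal sketch, assuming it is run inside a job on a GPU node and that the nvidia-smi utility is installed there (this page does not confirm that). The function name report_node is hypothetical.

```shell
#!/bin/bash
# Sketch: print what the current host reports, for comparison with the hardware list.
# Assumption: run inside a job on a GPU node; nvidia-smi may or may not be installed.
# report_node is a hypothetical helper name, not part of the HPC system.
report_node() {
    # Logical CPUs currently online on this host
    echo "CPU threads: $(getconf _NPROCESSORS_ONLN)"

    if command -v nvidia-smi >/dev/null 2>&1; then
        # GPU model and total GPU memory, one CSV row per card
        nvidia-smi --query-gpu=name,memory.total --format=csv
    else
        echo "nvidia-smi not found; this host may not be a GPU node"
    fi
}

report_node
```

On a host without nvidia-smi the script only prints the CPU thread count and a short notice, so it is safe to try on any node.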
Software
The GPU nodes have the same set of available software as the rest of the compute nodes. The full list of available software is maintained on a separate page.
In addition to this, there are two versions of CUDA that are readily available:
$ module avail CUDA
--------------------------------------------------------------------------------- /usr/share/Modules/modulefiles ----------------------------------------------------------------------------------
CUDA/10.1.243  CUDA/9.2.148
Using CUDA
To use one of the available versions of CUDA, load the appropriate module.
NOTE: The following commands MUST be run within a job, either interactive or non-interactive.
-bash-4.2$ module load CUDA/10.1.243
-bash-4.2$ which nvcc
/opt/software/CUDA/10.1.243/bin/nvcc
-bash-4.2$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
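Since the module has to be loaded inside a job, a script may want to confirm that nvcc is actually on the PATH before it tries to compile anything. The following is a minimal sketch of such a check. The function name check_nvcc is hypothetical, and the module load step is skipped on hosts where the module command is not defined.

```shell
#!/bin/bash
# Sketch: load a CUDA module where possible, then verify that nvcc is on PATH.
# check_nvcc is a hypothetical helper name, not part of the HPC system.
check_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc found at $(command -v nvcc)"
        return 0
    else
        echo "nvcc not found; load a CUDA module inside a job on a GPU node" >&2
        return 1
    fi
}

# Only attempt the load where the module command is available.
if command -v module >/dev/null 2>&1; then
    module load CUDA/10.1.243
fi

# Report the result without aborting the surrounding script.
check_nvcc || true
```

A build script could instead call `check_nvcc || exit 1` to stop early when the compiler is missing.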
Running jobs on GPU nodes
Both interactive and non-interactive jobs can be run on the GPU nodes. At present, the GPU nodes are available through a dedicated queue named gpu:
[asrini@consign ~]$ bqueues gpu
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP
gpu              30  Open:Active       -   88    -    -    41     0    41     0
Interactive jobs
To launch an interactive job on one of the GPU nodes, use the usual bsub command with the "-q gpu" option:
[asrini@consign ~]$ bsub -q gpu -Is bash
Job <63866682> is submitted to queue <gpu>.
<<Waiting for dispatch ...>>
<<Starting on gpunode02.hpc.local>>
[asrini@gpunode02 ~]$ module avail CUDA
--------------------------------------------------------------------------------- /usr/share/Modules/modulefiles ----------------------------------------------------------------------------------
CUDA/10.1.243  CUDA/9.2.148
[asrini@gpunode02 ~]$ module load CUDA/10.1.243
[asrini@gpunode02 ~]$ which nvcc
/opt/software/CUDA/10.1.243/bin/nvcc
[asrini@gpunode02 ~]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
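The steps from the interactive session can also be scripted for a non-interactive run on the same queue. The following is a minimal sketch, not a site-provided template: the file name cuda_check.lsf, the job name cuda_check, and the output file names are hypothetical, and it assumes the module command is available inside batch jobs. The script writes a small LSF job file and submits it only when bsub is present.

```shell
#!/bin/bash
# Sketch: write a small LSF job script for the gpu queue and submit it.
# The file name and job name are hypothetical placeholders.
job_script="cuda_check.lsf"

cat > "$job_script" <<'EOF'
#!/bin/bash
# Run on the dedicated GPU queue
#BSUB -q gpu
# Job name (hypothetical)
#BSUB -J cuda_check
# Standard output and error files; %J expands to the LSF job ID
#BSUB -o cuda_check.%J.out
#BSUB -e cuda_check.%J.err

module load CUDA/10.1.243
nvcc --version
EOF

# Submit only where LSF is available; bsub reads the #BSUB lines from stdin.
if command -v bsub >/dev/null 2>&1; then
    bsub < "$job_script"
else
    echo "bsub not found; wrote $job_script without submitting"
fi
```

Writing the job file first keeps the queue choice and module version in one place, so the same file can be resubmitted later with `bsub < cuda_check.lsf`.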