HPC:GPU


This section describes how to use the GPU resources available via the HPC system.

Hardware

There are currently two GPU nodes available via the HPC. Below are the hardware specifications:

  • 2x GPU nodes; each configured with
      • 2x 22-core Intel Xeon E5-2699 v4 2.20GHz CPUs (88 threads per node, with hyperthreading turned on)
      • 512GB RAM
      • 1x Nvidia Tesla P100 16GB GPU Card (3584 CUDA cores & 16GB RAM per card)
      • 100 Gb/s InfiniBand connection to the GPFS file system
      • 1.6TB dedicated scratch space provided by local NVMe

Software

The GPU nodes have the same software available as the rest of the compute nodes; see the full list of available software for details.

In addition to this, there are two versions of CUDA that are readily available:

$ module avail CUDA

--------------------------------------------------------------------------------- /usr/share/Modules/modulefiles ----------------------------------------------------------------------------------
CUDA/10.1.243 CUDA/9.2.148

Using CUDA

To use one of the available versions of CUDA, load the appropriate module.

NOTE: The following commands MUST be run within a job, either interactive or non-interactive.

 
-bash-4.2$ module load CUDA/10.1.243 

-bash-4.2$ which nvcc
/opt/software/CUDA/10.1.243/bin/nvcc

-bash-4.2$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
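
As a quick check that the loaded toolkit works, the sketch below writes a tiny CUDA program and compiles it with nvcc. The file name hello.cu, the kernel itself, and the -arch=sm_60 flag (the compute capability of the P100) are illustrative choices rather than anything prescribed by this page, and the commands are meant to be run inside a job on a GPU node.

```shell
# Write a minimal CUDA source file (hello.cu is a hypothetical name).
cat > hello.cu <<'EOF'
#include <cstdio>

// Each GPU thread prints its own index within the block.
__global__ void hello() {
    printf("Hello from GPU thread %d\n", threadIdx.x);
}

int main() {
    hello<<<1, 4>>>();                          // launch 1 block of 4 threads
    cudaError_t err = cudaDeviceSynchronize();  // wait for the kernel to finish
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
EOF

# Compile and run only if nvcc is on the PATH (i.e. a CUDA module is loaded).
if command -v nvcc >/dev/null 2>&1; then
    nvcc -arch=sm_60 -o hello hello.cu && ./hello
else
    echo "nvcc not found: load a CUDA module inside a GPU job first" >&2
fi
```

If nvcc is missing, the script only reports that and leaves hello.cu in place, so it can be compiled later from within a job.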

Running jobs on GPU nodes

Both interactive and non-interactive jobs can be run on the GPU nodes. At present, the GPU nodes are available via a dedicated queue named gpu:

 
[asrini@consign ~]$ bqueues gpu
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP 
gpu              30  Open:Active       -   88    -    -    41     0    41     0
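
In the output above, JL/U is the per-user job slot limit (88 here) and NJOBS is the total number of slots held by pending, running, and suspended jobs (0 + 41 + 0 = 41). For more detail, the standard LSF commands bqueues -l and bjobs -u all -q can be wrapped in a small helper. The function name gpu_queue_status below is hypothetical, and the sketch assumes it is run on a host where the LSF client commands are installed.

```shell
# Hypothetical helper: show the configuration of a queue and the jobs in it.
# Defaults to the gpu queue when no argument is given.
gpu_queue_status() {
    local queue="${1:-gpu}"
    if ! command -v bqueues >/dev/null 2>&1; then
        echo "LSF commands not found on this host" >&2
        return 1
    fi
    bqueues -l "$queue"        # long format: limits, hosts, scheduling settings
    bjobs -u all -q "$queue"   # jobs from all users in that queue
}
```

Calling gpu_queue_status with no arguments queries the gpu queue; passing another queue name queries that queue instead.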

Interactive jobs

To launch an interactive job on one of the GPU nodes, use the usual bsub command with the "-q gpu" option:

[asrini@consign ~]$ bsub -q gpu -Is bash
Job <63866682> is submitted to queue <gpu>.
<<Waiting for dispatch ...>>
<<Starting on gpunode02.hpc.local>>

[asrini@gpunode02 ~]$ module avail CUDA

--------------------------------------------------------------------------------- /usr/share/Modules/modulefiles ----------------------------------------------------------------------------------
CUDA/10.1.243 CUDA/9.2.148

[asrini@gpunode02 ~]$ module load CUDA/10.1.243 

[asrini@gpunode02 ~]$ which nvcc
/opt/software/CUDA/10.1.243/bin/nvcc

[asrini@gpunode02 ~]$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
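
Once the interactive session has started, it can be useful to confirm that the GPU is visible from the job. The sketch below assumes nvidia-smi is on the PATH of the GPU nodes, as is usual wherever the NVIDIA driver is installed; the variable name gpu_info is illustrative only.

```shell
# Run inside a job on a GPU node.
if command -v nvidia-smi >/dev/null 2>&1; then
    # One CSV row per GPU: model name and total memory.
    gpu_info=$(nvidia-smi --query-gpu=name,memory.total --format=csv,noheader)
else
    gpu_info="nvidia-smi not found"
fi
echo "$gpu_info"

# Number of logical CPUs available to the current process.
nproc
```

On the nodes described in the Hardware section, this would be expected to list a single Tesla P100 with 16GB of memory.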

Non-interactive jobs
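
Non-interactive jobs go to the same gpu queue. A minimal sketch, assuming the standard LSF convention of a batch script with #BSUB directives that is submitted by redirecting it to bsub: the script name gpu_job.sh, the job name cuda_test, and the output file names are hypothetical, and any site-specific resource options are not covered here.

```shell
# Write a batch script (gpu_job.sh is a hypothetical name).
cat > gpu_job.sh <<'EOF'
#!/bin/bash
# Submit to the dedicated GPU queue.
#BSUB -q gpu
# Job name (hypothetical).
#BSUB -J cuda_test
# Standard output and standard error; %J expands to the job ID.
#BSUB -o cuda_test.%J.out
#BSUB -e cuda_test.%J.err

module load CUDA/10.1.243
nvcc --version
# Assumes nvidia-smi is on the PATH of the GPU nodes.
nvidia-smi
EOF

# Submit only if bsub is available on this host.
if command -v bsub >/dev/null 2>&1; then
    bsub < gpu_job.sh
else
    echo "bsub not found: run this on an LSF submit host" >&2
fi
```

The output of the job then lands in cuda_test.<jobid>.out, and bjobs can be used to follow its state while it is pending or running.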
