GPU Overview
GPU resource requests are handled slightly differently from the process described in the Projects section. If you need GPUs, the first step is to open a ticket with the GPU Platform Consultancy functional element. The consultants will help you decide which of the services below best suits your needs.
Services
OpenStack Project with GPU Flavors
This option is identical to the one described in the Projects section, except that GPU flavors are assigned to your project, allowing you to launch instances with GPUs. The available flavors are listed below, followed by an example launch command:
| Flavor Name | GPU | RAM | vCPUs | Disk | Ephemeral Disk | Comments |
| --- | --- | --- | --- | --- | --- | --- |
| g1.xlarge | V100 | 16 GB | 4 | 56 GB | 96 GB | - |
| g1.4xlarge | V100 (4x) | 64 GB | 16 | 80 GB | 528 GB | - |
| g2.xlarge | T4 | 16 GB | 4 | 64 GB | 192 GB | - |
| g3.xlarge | V100S | 16 GB | 4 | 64 GB | 192 GB | - |
| g3.4xlarge | V100S (4x) | 64 GB | 16 | 128 GB | 896 GB | - |
| vg1.xlarge | T4 (vGPU) | 16 GB | 4 | 64 GB | 192 GB | Specific configuration here |
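As an illustration, once a GPU flavor is visible in your project, instances are launched with the standard OpenStack CLI. A minimal sketch, assuming the g1.xlarge flavor has been assigned to your project (replace the placeholders with your image, key pair and VM name):

# Confirm that the GPU flavors are visible in your project
$ openstack flavor list | grep g1
# Launch an instance with a single V100 GPU
$ openstack server create --flavor g1.xlarge --image <LINUX_IMAGE> --key-name <KEY_NAME> <VM_NAME>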
Note: The appropriate GPU drivers must be installed (detailed here).
Container Service Clusters
Once GPU resources have been allocated to your OpenStack project, you can deploy clusters with GPUs by setting a label at cluster creation time (explained here); see the sketch below.
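A minimal sketch using the Magnum CLI, assuming a cluster template is available in your project. The exact label name and value that enable GPU support are given in the linked documentation, so <GPU_LABEL> is a placeholder here:

# Create a container cluster whose nodes use a GPU flavor
$ openstack coe cluster create <CLUSTER_NAME> --cluster-template <TEMPLATE> --flavor g1.xlarge --labels <GPU_LABEL>=true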
Batch Service GPU jobs
The Batch service at CERN already supports the submission of GPU jobs (examples here). Batch not only accepts jobs in the typical batch-system form, but also supports Docker, Singularity and interactive jobs (including running GUI applications).
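For illustration, a minimal sketch of an HTCondor submit file that requests a GPU; the executable name and file names are placeholders, and the linked examples are authoritative for the CERN setup:

# gpu-job.sub: request one GPU alongside the usual job description
executable = my_gpu_program
request_gpus = 1
output = gpu-job.out
error = gpu-job.err
log = gpu-job.log
queue

$ condor_submit gpu-job.sub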
VM Configuration
When using GPUs directly in virtual machines, you need to handle driver installation and configuration yourself.
Driver Installation
Note: Virtual GPU driver installation is different (see here).
To install the NVIDIA drivers, open the CUDA Toolkit Downloads page and select the options matching your system. As the installer type, we recommend the 'network' option. Once all options are selected, a box with succinct installation instructions will appear.
As a rule of thumb, the drivers are correctly installed if you can successfully run 'nvidia-smi' in a terminal (Linux), or if the assigned GPU model appears in the Device Manager under Display adapters (Windows).
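As an example, a sketch of the network-installer route on a RHEL 9-compatible VM; the downloads page generates the exact commands for your distribution, so treat these as an illustration rather than a recipe:

# Add the NVIDIA network repository for RHEL 9-compatible systems
$ sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
# Install the DKMS driver module and verify the installation
$ sudo dnf module install nvidia-driver:latest-dkms
$ nvidia-smi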
For more detailed instructions, such as pre- and post-installation actions, see the Installation Guide for Linux or the Installation Guide for Microsoft Windows.
Virtual GPUs
Note: Running Windows is not supported with the current license.
For the vGPUs to operate at full capacity, licensing is required. Licensing can be set up automatically when creating a VM by passing a user data file that we provide (download here). Note that this user data file also installs the vGPU drivers. Example vGPU VM creation command:
$ wget https://clouddocs.web.cern.ch/gpu/vgpu-config.sh
$ openstack server create --user-data vgpu-config.sh --flavor vg1.xlarge --image <LINUX_IMAGE> --key-name <KEY_NAME> <VM_NAME>
Virtual GPU VMs can also be created through the OpenStack dashboard by loading the same user data file in the "Configuration" tab.
GPU accelerated Docker containers
The NVIDIA Container Toolkit is required to run GPU accelerated Docker containers (it is not required for Kubernetes). Installation instructions are available here.
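A minimal sketch of the post-installation steps, assuming Docker and the toolkit are already installed; the nvidia/cuda image tag is an example and any CUDA base image will do:

# Register the NVIDIA runtime with Docker and restart the daemon
$ sudo nvidia-ctk runtime configure --runtime=docker
$ sudo systemctl restart docker
# Run nvidia-smi inside a CUDA base container to confirm GPU access
$ docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi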
Additional resources:
- NVIDIA Driver Downloads (all products and OS)
- Tesla Driver release notes
- Tesla NVIDIA Driver Installation Quickstart Guide (Linux)
- CUDA Quick Start Guide (Linux and Windows)