GPU resource requests are handled slightly differently from the process described in the Projects section. If you need GPUs, the first step is to open a ticket with the GPU Platform Consultancy functional element. The consultants will help you decide which of the services below best suits your needs.
OpenStack Project with GPU Flavors
This option is identical to the one described in the Projects section, except that GPU flavors will be assigned to your project. You can then launch instances with GPUs. The available flavors are:
| Flavor Name | GPU | RAM | vCPUs | Disk | Ephemeral | Comments |
|---|---|---|---|---|---|---|
| g1.xlarge | V100 | 16 GB | 4 | 56 GB | 96 GB | - |
| g1.4xlarge | V100 (4x) | 64 GB | 16 | 80 GB | 528 GB | - |
| g2.xlarge | T4 | 16 GB | 4 | 64 GB | 192 GB | - |
| g2.5xlarge | T4 | 168 GB | 28 | 160 GB | 1200 GB | - |
| g3.xlarge | V100S | 16 GB | 4 | 64 GB | 192 GB | - |
| g3.4xlarge | V100S (4x) | 64 GB | 16 | 128 GB | 896 GB | - |
| vg1.xlarge | T4 (vGPU) | 16 GB | 4 | 64 GB | 192 GB | Specific configuration here |
Note: Adequate GPU drivers have to be installed (detailed here).
Note: Baremetal nodes with GPUs are also possible in certain cases, please open a ticket for these requests.
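Once the flavors have been assigned, one way to confirm that they are visible to your project is to filter the flavor list by name. The following is a minimal sketch: `filter_gpu_flavors` is a hypothetical helper, and its pattern assumes that GPU flavor names start with `g` or `vg` followed by a digit, as in the table above.

```shell
# Keep only GPU flavor names (g1.*, g2.*, g3.*, vg1.*, ...) from a list
# of flavor names read on stdin, one name per line.
# The "|| true" keeps the exit status at 0 when nothing matches.
filter_gpu_flavors() {
    grep -E '^v?g[0-9]+\.' || true
}

# Usage (requires the OpenStack client and your project credentials):
#   openstack flavor list -f value -c Name | filter_gpu_flavors
```

If the command prints nothing, the GPU flavors are most likely not yet assigned to the project you are authenticated against.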
Container Service Clusters
Once GPU resources have been allocated to your OpenStack project, you can deploy clusters with GPUs by setting a label (explained here).
Batch Service GPU jobs
The Batch service at CERN already allows the submission of GPU jobs (examples here). In addition to jobs submitted in the usual batch-system form, Batch supports Docker and Singularity jobs as well as interactive jobs (including running GUI applications).
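As a rough illustration of what a GPU job request can look like, the sketch below writes a minimal HTCondor submit description. It assumes that jobs are submitted through HTCondor; `write_gpu_submit_file`, `gpu_job.sub` and `gpu_job.sh` are hypothetical names chosen for this example, and the linked examples remain the reference for the options actually supported.

```shell
# Write a minimal HTCondor submit description that requests one GPU.
# The first argument is the output file name (default: gpu_job.sub).
# gpu_job.sh is a hypothetical job script that you provide yourself.
write_gpu_submit_file() {
    cat > "${1:-gpu_job.sub}" <<'EOF'
executable   = gpu_job.sh
output       = gpu_job.$(ClusterId).$(ProcId).out
error        = gpu_job.$(ClusterId).$(ProcId).err
log          = gpu_job.$(ClusterId).log
request_gpus = 1
queue
EOF
}

# Usage (on a node where the HTCondor client is available):
#   write_gpu_submit_file gpu_job.sub
#   condor_submit gpu_job.sub
```

The only GPU-specific line is `request_gpus = 1`; the rest is an ordinary submit description.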
GitLab (Continuous Integration)
A number of shared runners in CERN GitLab offer GPUs.
Check here for configuration information and examples.
The lxplus service offers lxplus-gpu.cern.ch for shared GPU instances, with limited isolation and performance.
When using GPUs directly in virtual machines you need to handle driver installation and configuration.
Note: Virtual GPU driver installation is different (see here).
To install NVIDIA drivers, open the CUDA Toolkit Downloads page and select the options that match your system. For the installer type, we recommend the 'network' option. Once all options are selected, the page displays a short box of installation instructions.
As a quick check, the drivers are correctly installed if 'nvidia-smi' runs successfully in a terminal (Linux) or if the GPU model assigned to you appears in the Device Manager under Display adapters (Windows).
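On Linux, the check described above can be wrapped in a small function. This is a sketch rather than an official procedure: `check_nvidia_driver` is a hypothetical helper name, and it only tells you whether `nvidia-smi` exists and can talk to the driver.

```shell
# Report whether the NVIDIA driver appears usable.
# Returns 0 and prints one "model, driver version" line per GPU on success,
# returns 1 with a short explanation otherwise.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver is probably not installed"
        return 1
    fi
    # Ask the driver for the model and driver version of each visible GPU.
    if nvidia-smi --query-gpu=name,driver_version --format=csv,noheader; then
        return 0
    fi
    echo "nvidia-smi failed: the driver is installed but not working"
    return 1
}

# Usage:
#   check_nvidia_driver && echo "GPU driver OK"
```

The GPU model printed should match the flavor you launched, for example a T4 for the g2 flavors.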
Note: Running Windows is not supported with the current license.
For the vGPUs to operate at full capacity, licensing is required. Licensing can be set up automatically when creating a VM by passing a user data file that we provide (download here). Note that this user data file also installs the vGPU drivers. Example vGPU VM creation commands:
$ wget https://clouddocs.web.cern.ch/gpu/vgpu-config.sh
$ openstack server create --user-data vgpu-config.sh --flavor vg1.xlarge --image <LINUX_IMAGE> --key-name <KEY_NAME> <VM_NAME>
Virtual GPU VMs can also be created through the OpenStack dashboard by loading the same user data file in the "Configuration" tab.
Installing CUDA Toolkit in a vGPU VM
The first step in installing the CUDA Toolkit is to check this table for the latest CUDA version compatible with vGPU (currently deployed vGPU software release: 13.0). You can then find the corresponding CUDA Toolkit download link in the downloads archive.
From the downloads page, select your target OS and pick the runfile installer. Using a different installer type can deploy an unsupported version of CUDA and override the vGPU driver.
During the interactive installation of the runfile, it is important to deselect the driver install option. Alternatively, you can run the installer non-interactively with the following flags:
$ sudo <CudaInstaller>.run --silent --toolkit --samples
Please check the detailed installation steps as there are relevant pre- and post-installation actions (such as installing g++ and altering the PATH environment variable).
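As an example of the post-installation step, the sketch below adds the toolkit to the search paths. It assumes the default installation prefix `/usr/local/cuda`; if you chose a different location during installation, set `CUDA_HOME` accordingly before running it.

```shell
# Assumption: the toolkit lives under the default prefix /usr/local/cuda.
# Override by exporting CUDA_HOME before sourcing these lines.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Prepend the CUDA binaries and libraries to the search paths,
# without leaving a stray ':' when a variable was previously empty.
export PATH="${CUDA_HOME}/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# If the toolkit is present, nvcc should now resolve and print its version.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found under ${CUDA_HOME}/bin"
fi
```

To make the change persistent, the two `export` lines can be added to your shell start-up file (for instance `~/.bashrc`).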
To uninstall this runfile type of installation, simply run:
GPU accelerated Docker containers
- NVIDIA Driver Downloads (all products and OS)
- Tesla Driver release notes
- Tesla NVIDIA Driver Installation Quickstart Guide (Linux)
- CUDA Quick Start Guide (Linux and Windows)