NVIDIA GPUs on the host machine cannot be used directly inside Docker containers. You need to install the NVIDIA Container Toolkit to map the host's GPUs into containers.
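This guide assumes the NVIDIA driver is already installed on the host. As a quick sanity check before configuring Docker, you can run nvidia-smi directly on the host; if it fails, fix the host driver first:
nvidia-smi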
Copy and execute the following commands in the terminal. On Debian/Ubuntu and other APT-based systems, first add NVIDIA's package repository:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Refresh software sources
sudo apt-get update
# Install the package
sudo apt-get install -y nvidia-container-toolkit
On CentOS/RHEL and other YUM-based systems, add the repository and install the package instead:
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit
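On either distribution family, you can optionally confirm that the toolkit was installed correctly by printing the CLI version (the reported version number will vary):
nvidia-ctk --version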
To enable containers to access the host's NVIDIA GPUs, configure the Docker container runtime with the following command:
sudo nvidia-ctk runtime configure --runtime=docker
This command automatically updates the /etc/docker/daemon.json configuration file.
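If you want to verify the change, you can print the file. After configuration it typically contains an nvidia runtime entry similar to the following (your file may include additional settings):
sudo cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}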
Restart the Docker service:
sudo systemctl restart docker
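Once Docker is back up, an optional check is to confirm that the nvidia runtime is now registered (the exact output depends on your Docker version):
sudo docker info | grep -i runtime
The Runtimes line should now list nvidia alongside the default runc runtime.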
After completing the installation and configuration steps above, Docker containers can use the host's NVIDIA GPUs. Here is an example:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
The container will print GPU runtime information similar to the following:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10     Driver Version: 535.86.10     CUDA Version: 12.2   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W / 70W  |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
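On hosts with multiple GPUs you can also expose only specific devices instead of all of them. For example, the following variant of the command above (a sketch using Docker's --gpus device syntax) passes only GPU 0 into the container:
sudo docker run --rm --runtime=nvidia --gpus device=0 ubuntu nvidia-smi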
For more information, please refer to https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/index.html