First, a walk through the installation process.
1. Set up the docker-ce repository:
sudo yum-config-manager --add-repo=/linux/centos/
2. Install the package:
sudo yum install -y /linux/centos/7/x86_64/stable/Packages/-1.4.3-3.1.el7.x86_64.rpm
3. Install the docker-ce package:
sudo yum install docker-ce -y
Use the following command to make sure the Docker service is enabled and running:
sudo systemctl --now enable docker
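To confirm that the command above took effect, a small check like the following can help. This is only a sketch: check_unit is a hypothetical helper name, and the SYSTEMCTL variable exists purely so the command can be swapped out; it relies on the standard systemctl is-active and is-enabled subcommands.

```shell
# Hypothetical helper: report whether a systemd unit is both active and enabled.
# $1 is the unit name. SYSTEMCTL may be overridden (an assumption for testing);
# it defaults to the real systemctl.
check_unit() {
    unit="$1"
    ctl="${SYSTEMCTL:-systemctl}"
    if "$ctl" is-active --quiet "$unit" && "$ctl" is-enabled --quiet "$unit"; then
        echo "$unit: active and enabled"
    else
        echo "$unit: not fully set up"
        return 1
    fi
}
```

Calling check_unit docker after step 3 should print "docker: active and enabled" if the service was started and enabled at boot.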
Finally, test your Docker installation by running the hello-world container:
sudo docker run --rm hello-world
Output like the following means the installation is working:
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
/
For more examples and ideas, visit:
/get-started/
4. Set up the nvidia-container-toolkit repository and GPG key:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && curl -s -L /libnvidia-container/$distribution/ | sudo tee /etc//
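The $distribution variable in the command above is built by sourcing /etc/os-release and concatenating ID and VERSION_ID. Since a wrong value here produces a wrong repository address, it can be worth printing it first. A minimal sketch, where distribution_id is a hypothetical helper name and the optional file argument is an assumption added for illustration:

```shell
# Print the distribution string that step 4 derives from an os-release file.
# $1 is the file to read; it defaults to /etc/os-release.
distribution_id() {
    # Source the file in a subshell so ID and VERSION_ID do not leak
    # into the calling shell.
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}
```

On a CentOS 7 host, distribution_id is expected to print centos7, because that release sets ID="centos" and VERSION_ID="7".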
Enable the experimental branch in the repository list:
sudo yum-config-manager --enable libnvidia-container-experimental
5. Refresh the package list, then install the nvidia-container-toolkit package:
sudo yum clean expire-cache
sudo yum install -y nvidia-container-toolkit
Configure the Docker daemon to recognize the NVIDIA container runtime:
sudo nvidia-ctk runtime configure --runtime=docker
After setting the default runtime, restart the Docker daemon to complete the installation:
sudo systemctl restart docker
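Before running a GPU container, it can be useful to check that the runtime configuration was actually written. The sketch below assumes that nvidia-ctk recorded the runtime in the default Docker daemon config at /etc/docker/daemon.json; has_nvidia_runtime is a hypothetical helper name, and the plain grep is a rough heuristic, not a JSON parser.

```shell
# Hypothetical helper: succeed if a Docker daemon config file mentions a
# runtime named "nvidia". $1 is the config path; it defaults to
# /etc/docker/daemon.json (assumed default location).
has_nvidia_runtime() {
    grep -q '"nvidia"' "${1:-/etc/docker/daemon.json}"
}
```

For example, has_nvidia_runtime && echo "nvidia runtime configured" prints the message only when the entry is present.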
At this point, you can test the setup by running a basic CUDA container:
sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:12.1.1-base-centos7 nvidia-smi
This should produce console output like the following:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1080 Ti      Off| 00000000:01:00.0 Off |                  N/A |
| 20%   38C    P0               57W / 250W|      0MiB / 11264MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
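Cause of the error
Do not run the installation from inside a virtual environment.
Step 4 uses the curl command to set the repository address,
and the curl inside the virtual environment is not the same binary that the base system environment uses.
As a result, the repository address is set incorrectly,
and the nvidia-container-toolkit package cannot be found.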
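One way to spot this situation before running step 4 is to look at which curl the shell resolves first. The following is a sketch under the assumption that the system curl lives at /usr/bin/curl or /bin/curl; curl_origin is a hypothetical helper name.

```shell
# Hypothetical check: report whether the first curl on PATH is a system binary.
# Assumes the system copy is /usr/bin/curl or /bin/curl.
curl_origin() {
    curl_path=$(command -v curl) || { echo "curl: not found"; return 1; }
    case "$curl_path" in
        /usr/bin/curl|/bin/curl)
            echo "system curl: $curl_path" ;;
        *)
            # Typically a copy shipped by an activated virtual environment.
            echo "non-system curl: $curl_path"
            return 2 ;;
    esac
}
```

If this reports a non-system curl, deactivating the virtual environment (or calling /usr/bin/curl explicitly) before repeating step 4 is a reasonable thing to try. In bash, type -a curl lists every curl found on PATH.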