- Install the packages required before installing Docker
sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
- Add the GPG key; the command prints OK on success
curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
- Verify the key
sudo apt-key fingerprint 0EBFCD88
- Add the Aliyun repository
sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
- Update the package list
sudo apt-get update
- Install the latest version of Docker Engine
sudo apt-get install docker-ce docker-ce-cli containerd.io
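- To install a specific version, first list the available versions, then install the one you want
# List the available versions
apt-cache madison docker-ce
# Install a specific version
sudo apt-get install docker-ce=5:20.10.11~3-0~ubuntu-bionic docker-ce-cli=5:20.10.11~3-0~ubuntu-bionic containerd.io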
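The version string passed to apt-get has to match one of the entries printed by apt-cache madison exactly, including the epoch and the distribution suffix. A minimal sketch of picking that string automatically, assuming the usual madison layout of "package | version | source"; pick_version is a hypothetical helper name, not part of apt:

```shell
# Sketch: print the full docker-ce version string for a given upstream version.
# Reads `apt-cache madison` output on stdin. pick_version is a hypothetical helper.
pick_version() {
    # $1 = upstream version to look for, e.g. 20.10.11
    awk -v want="$1" '
        {
            full = $3                        # e.g. 5:20.10.11~3-0~ubuntu-bionic
            upstream = full
            sub(/^[0-9]+:/, "", upstream)    # drop the epoch ("5:")
            sub(/[~-].*$/, "", upstream)     # drop the packaging suffix
            if (upstream == want) { print full; exit }
        }
    '
}

# Usage (needs the Docker repository to be configured first):
#   version=$(apt-cache madison docker-ce | pick_version 20.10.11)
#   sudo apt-get install "docker-ce=$version" "docker-ce-cli=$version" containerd.io
```

If the requested version is not in the list, the helper prints nothing, so it is worth checking that the result is non-empty before calling apt-get.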
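Installing nvidia-docker
Before you start, make sure the NVIDIA drivers and Docker are already installed (nvcc reports the CUDA toolkit version; nvidia-smi checks the driver itself)
nvcc --version
docker --version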
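The prerequisite check can also be scripted so that a missing tool is reported by name. A minimal sketch, assuming only that the tools are expected on PATH; missing_commands is a hypothetical helper name:

```shell
# Sketch: print every named command that cannot be found on PATH.
# missing_commands is a hypothetical helper, not part of Docker or the NVIDIA tools.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Usage:
#   missing=$(missing_commands nvidia-smi docker)
#   [ -z "$missing" ] || echo "Please install first: $missing"
```

This only confirms that the executables exist; it does not verify that the driver is loaded or that the Docker daemon is running.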
1. Set up the stable repository and the GPG key
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
2. Install the nvidia-docker2 package
sudo apt-get update
sudo apt-get install -y nvidia-docker2
3. Restart Docker
sudo systemctl restart docker
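4. Test the installation with an official container. Note: the version number 11.2 must be adjusted to match the CUDA version installed on your machine; look up the matching tag name at https://hub.docker.com/r/nvidia/cuda/tags?page=1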
docker run --runtime=nvidia --rm nvidia/cuda:11.2.0-base nvidia-smi
The expected output looks like this
Thu Apr  1 02:46:41 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 2060    Off  | 00000000:01:00.0  On |                  N/A |
| N/A   37C    P8     7W /  N/A |    272MiB /  5926MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
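The CUDA version needed for choosing the image tag in step 4 appears in the header of the nvidia-smi output. A minimal sketch of extracting it, assuming the header contains a "CUDA Version: X.Y" field as in the output of this tutorial; cuda_version_from_smi is a hypothetical helper name:

```shell
# Sketch: read nvidia-smi output on stdin and print the CUDA version (e.g. 11.2).
# cuda_version_from_smi is a hypothetical helper, not an NVIDIA tool.
cuda_version_from_smi() {
    # Keep only the X.Y that follows "CUDA Version:" and stop at the first match.
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage on the host:
#   nvidia-smi | cuda_version_from_smi
```

The printed version still has to be matched by hand against the tag list on Docker Hub, since not every version has a tag of the same form.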
Creating a docker user group so that non-root users can use Docker
- Create the user group
sudo groupadd docker
- Add the regular user [user_name] to the docker group
sudo gpasswd -a [user_name] docker
- Refresh the docker group membership in the current shell
newgrp docker
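- Change the permissions of /var/run/docker.sock (usually unnecessary once the user is in the docker group; a+rw lets every local user control Docker)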
sudo chmod a+rw /var/run/docker.sock
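Group changes only show up in new login sessions or after newgrp, so it helps to confirm that the current shell actually sees the docker group before testing. A minimal sketch; has_group is a hypothetical helper name that works on the list printed by id -nG:

```shell
# Sketch: succeed if a group name appears in a space-separated group list.
# has_group is a hypothetical helper; feed it the output of `id -nG`.
has_group() {
    # $1 = space-separated group names, $2 = group to look for
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage:
#   if has_group "$(id -nG)" docker; then
#       echo "docker group is active in this shell"
#   else
#       echo "log out and back in, or run: newgrp docker"
#   fi
```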
Pulling a PyTorch Docker image
Pull the PyTorch image whose tag matches your CUDA and cuDNN versions; the available tags are listed at https://hub.docker.com/r/pytorch/pytorch/tags
docker pull pytorch/pytorch:1.9.1-cuda11.1-cudnn8-runtime
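The tag in the command above follows the pattern version-cudaX-cudnnY-flavor. A minimal sketch that assembles an image name from those parts; pytorch_image is a hypothetical helper name, and the result is only a candidate, since not every combination is published on Docker Hub:

```shell
# Sketch: build a pytorch/pytorch image name from its version components.
# pytorch_image is a hypothetical helper; check the tag list before pulling.
pytorch_image() {
    # $1 = PyTorch version, $2 = CUDA version, $3 = cuDNN major version,
    # $4 = flavor (runtime or devel, defaults to runtime)
    printf 'pytorch/pytorch:%s-cuda%s-cudnn%s-%s\n' "$1" "$2" "$3" "${4:-runtime}"
}

# Usage:
#   docker pull "$(pytorch_image 1.9.1 11.1 8)"
```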
Common Docker commands
List the installed images
docker images
Run an image, which starts a new container
nvidia-docker run --gpus all --shm-size=8g -v {DATA_DIR}:/data -v {CODE_DIR}:/code -it pytorch/pytorch:1.9.1-cuda11.1-cudnn8-runtime
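Since the run command is long and the two mount paths change from project to project, it can be wrapped in a small function. A minimal sketch, assuming plain docker run with --gpus all instead of the nvidia-docker wrapper; run_pytorch and the DRY_RUN switch are hypothetical names introduced here for illustration:

```shell
# Sketch: start an interactive GPU container with data and code directories mounted.
# run_pytorch and DRY_RUN are hypothetical; adapt the paths and the image to your setup.
run_pytorch() {
    # $1 = host data directory, $2 = host code directory, $3 = image name
    set -- docker run --gpus all --shm-size=8g \
        -v "$1:/data" -v "$2:/code" -it "$3"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"    # only show the command
    else
        "$@"                  # actually start the container
    fi
}

# Usage:
#   run_pytorch "$HOME/data" "$HOME/code" pytorch/pytorch:1.9.1-cuda11.1-cudnn8-runtime
```

Setting DRY_RUN=1 prints the assembled command without starting anything, which is handy for checking the mount paths first.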
List all containers, including stopped ones
docker ps -a
Start or stop a container
docker start {container_ID}
docker stop {container_ID}
Remove a container
docker rm {container_ID}
Open a bash shell inside a running container
docker exec -it b16635fb226a /bin/bash
From: https://www.cnblogs.com/Xiaoyan-Li/p/16658456.html