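1.0 Preface
Previous post: "Setting up stable-diffusion (diffusers) locally with Docker, CUDA 10.2, RTX 2060"
The CUDA 10.2 installed last time is too old, so this post upgrades to CUDA 11.7 and fills in a few steps that were missed.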
2.0 Uninstall
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^libnvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
sudo apt-get remove --purge '^cudnn-.*'
sudo apt-get remove --purge '^libcudnn7-.*'
sudo apt-get remove --purge '^libcudnn7*'
Uninstall the old NVIDIA driver, CUDA, and cuDNN packages.
2.1 Check
dpkg -l | grep nvidia
dpkg -l | grep cuda
dpkg -l | grep cudnn
Check that the uninstall succeeded.
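The three grep commands above can be folded into one count. This is a minimal sketch, not part of the original steps: `count_installed` is a hypothetical helper that reads `dpkg -l` output on stdin and counts the lines in the installed state (`ii`) whose package name matches a pattern.

```shell
# Hypothetical helper: count packages dpkg still reports as installed ("ii")
# whose name matches the given extended regular expression.
# It reads `dpkg -l` output on stdin, so it also works on saved output.
count_installed() {
    local pattern="$1"
    awk -v pat="$pattern" '$1 == "ii" && $2 ~ pat { n++ } END { print n + 0 }'
}

# Usage on a real machine (assumes dpkg is available):
#   dpkg -l | count_installed 'nvidia|cuda|cudnn'
```

A result of 0 suggests the purge removed everything that matches the pattern.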
3.0 CUDA
https://developer.nvidia.com/cuda-10.2-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=deblocal
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
Install CUDA.
3.1 rmmod & lsof cuda
https://comzyh.com/blog/archives/967/
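sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia_uvm
sudo rmmod nvidia
Reboot the server, or unload the kernel modules manually with rmmod.
sudo lsof /dev/nvidia*
Find the processes still holding the NVIDIA devices open, then reload CUDA.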
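When rmmod reports that a module is in use, the PID column of the lsof output shows what has to be stopped first. A small sketch, assuming the default lsof column layout (COMMAND, PID, USER, ...); `pids_from_lsof` is a hypothetical name:

```shell
# Hypothetical helper: print the unique PIDs from default `lsof` output.
# Skips the header line and takes the second column (PID).
pids_from_lsof() {
    awk 'NR > 1 { print $2 }' | sort -un
}

# Usage on a real machine:
#   sudo lsof /dev/nvidia* | pids_from_lsof
```

`lsof -t` prints PIDs only and would serve the same purpose; the helper is just convenient when the full listing has already been saved.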
3.2 nvcc
export PATH=/usr/local/cuda-11.7/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Put nvcc on PATH and the CUDA libraries on the dynamic linker path.
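If these exports go into a shell startup file, every new shell prepends the same directory again. The following sketch avoids the duplicates; `prepend_path` is a hypothetical helper, not something the CUDA packages provide.

```shell
# Hypothetical helper: prepend a directory to a colon-separated list
# unless the list already contains it. Prints the resulting list.
prepend_path() {
    local dir="$1" list="$2"
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;               # already present
        *)          printf '%s\n' "$dir${list:+:$list}" ;;  # add in front
    esac
}

# Usage:
#   export PATH=$(prepend_path /usr/local/cuda-11.7/bin "$PATH")
#   export LD_LIBRARY_PATH=$(prepend_path /usr/local/cuda/lib64 "$LD_LIBRARY_PATH")
```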
3.3 Check
nvidia-smi
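nvidia-smi reports the driver and the highest CUDA version that driver supports, not the installed toolkit. To confirm the toolkit itself, the release can be read from `nvcc --version`. A sketch, assuming the usual "release X.Y" wording in nvcc's output; `nvcc_release` is a hypothetical name:

```shell
# Hypothetical helper: extract the toolkit release (for example 11.7)
# from `nvcc --version` output read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Usage on a real machine (assumes nvcc is on PATH):
#   nvcc --version | nvcc_release
```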
4.0 cudnn
sudo dpkg -i cudnn-local-repo-ubuntu1804-8.9.0.131_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-*/cudnn-local-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get install libcudnn8=8.9.0.131-1+cuda11.8
sudo apt-get install libcudnn8-dev=8.9.0.131-1+cuda11.8
sudo apt-get install libcudnn8-samples=8.9.0.131-1+cuda11.8
Install cuDNN.
cp -r /usr/src/cudnn_samples_v8/ $HOME
cd ~/cudnn_samples_v8/mnistCUDNN
make clean && make
./mnistCUDNN
Test cuDNN.
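The mnistCUDNN sample ends with a "Test passed!" line when cuDNN works. For scripted installs, that line can be checked automatically. A minimal sketch, assuming the sample keeps that exact wording; `cudnn_sample_passed` is a hypothetical name:

```shell
# Hypothetical helper: succeed only if the sample output read on stdin
# contains the "Test passed!" line.
cudnn_sample_passed() {
    grep -q 'Test passed!'
}

# Usage on a real machine, from the mnistCUDNN directory:
#   ./mnistCUDNN | cudnn_sample_passed && echo "cuDNN OK"
```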
4.1 Check
5.0 Install libnvidia-container
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container.list | \
         sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
         sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Install libnvidia-container.
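The first command builds the repository path from /etc/os-release; on Ubuntu 18.04 the result is `ubuntu18.04`. The sketch below isolates that step so the value can be inspected before curl uses it. `distribution_from` is a hypothetical helper that takes the os-release file as an argument.

```shell
# Hypothetical helper: print ID followed by VERSION_ID from an os-release file.
# The file is sourced in a subshell so its variables do not leak into the caller.
distribution_from() {
    ( . "$1" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Usage on a real machine:
#   distribution=$(distribution_from /etc/os-release)
#   echo "$distribution"
```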
6.0 Docker deployment
sudo docker build -t diffusers/cuda/v4:11.7-cudnn8-runtime-ubuntu18.04 .
sudo docker run --rm --runtime=nvidia --gpus all diffusers/cuda/v4:11.7-cudnn8-runtime-ubuntu18.04 nvidia-smi
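The build and run commands must name the same image, so it helps to compose the reference once and reuse it. A sketch under that assumption; `image_ref` is a hypothetical helper and the version label passed to it is only an example.

```shell
# Hypothetical helper: compose the image reference used in this post
# from a version label (for example v4) and a CUDA version (for example 11.7).
image_ref() {
    local version="$1" cuda="$2"
    printf 'diffusers/cuda/%s:%s-cudnn8-runtime-ubuntu18.04\n' "$version" "$cuda"
}

# Usage on a real machine (assumes Docker and the NVIDIA runtime are set up):
#   IMAGE=$(image_ref v4 11.7)
#   sudo docker build -t "$IMAGE" .
#   sudo docker run --rm --runtime=nvidia --gpus all "$IMAGE" nvidia-smi
```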
From: https://www.cnblogs.com/chenkuang/p/17333447.html