[First published here] Setting up a deep learning environment on an Ubuntu 22.04 server: installing CUDA and cuDNN, with flexible CUDA version switching

Tags: blog, cuDNN, 22.04, csdn, cuda, https, article, net

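I. Preface

As of December 19, 2024, I could not find, on any search engine, a Chinese-language tutorial for setting up a CUDA environment on Ubuntu 22.04 in a server setting. Many existing installation guides are also outdated or contain errors and have sent a lot of people down the wrong path, so I am publishing this article to help. Writing it took nearly a week of trial, error, and tidying up, so it is full of small details; please read slowly and carefully!

Note! This article applies only to Ubuntu 22.04, set up in a server environment! If your setup does not match the conditions below at all, treat the advice here with caution: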

System: Ubuntu 22.04 (no graphical interface)
CPU: E5-2686 v4 (virtualized)
RAM: 32 GB
GPU: RTX 2080 Ti 22G 300A (dedicated)

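II. Preparation

1. Prepare the environment: reinstall a clean system

Cloud providers generally offer Ubuntu system images; choose the Ubuntu 22.04 image and reinstall the system.

If you are a beginner who has already turned the system into a mess (for example, you changed the kernel right away, or installed the BT Panel (宝塔)...), I also recommend reinstalling and starting from scratch. Don't feel that the work you have done is wasted; for a beginner, reinstalling and going back to square one is the best and simplest way to keep going. While preparing this article I reinstalled four different systems, more than ten times in total.

After the reinstall, please don't do anything extra; just keep following this tutorial.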

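2. Recommended working environment: VS Code over SSH

Here I recommend two tools that most of you already know: VS Code and Xshell.

The go-to tool for editing files, VS Code:

Everyone knows VS Code, but its ability to connect to a remote server over SSH is really great!!

For installation and usage details, see this article:

最香远程开发解决方案！手把手教你配置VS Code远程开发工具

With VS Code over SSH we can edit files and directories on the server with very little effort, which makes the later steps easier.

For an SSH client, pick whichever you like, such as Bitvise SSH Client or Xshell, but I don't recommend using the VS Code SSH terminal directly! It does not support some systems and tends to run into problems.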

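3. Switch the apt sources

Note! This section applies only to Ubuntu 22.04! Only Ubuntu 22.04! Only Ubuntu 22.04! Important things get said three times!

If you don't yet know how to edit files, finish the previous section first!

Write the following into /etc/apt/sources.list:

# Source repositories are commented out by default; uncomment them if needed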
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ jammy-security main restricted universe multiverse
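
Before overwriting /etc/apt/sources.list, it may be worth keeping a copy of the original so the change can be undone. The following is only a minimal sketch: backup_file and the SUDO variable are names made up for this illustration, not part of any tool.

```shell
#!/bin/sh
# backup_file: copy a file to "<file>.bak.<timestamp>" and print the backup path.
# SUDO is a hypothetical helper variable: leave it empty for files you own,
# or set SUDO=sudo for root-owned files such as /etc/apt/sources.list.
backup_file() {
    src=$1
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    dst="$src.bak.$(date +%Y%m%d%H%M%S)"
    ${SUDO:-} cp -p -- "$src" "$dst" && echo "$dst"
}

# Example (not executed here):
#   SUDO=sudo backup_file /etc/apt/sources.list
```

If the new sources cause trouble, copying the printed backup path back over /etc/apt/sources.list restores the previous state.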

Then run this command to refresh the package list:

sudo apt update

Then move on to the next step.

4. Install dependencies: gcc, g++, make

Run the following command to install the dependencies:

sudo apt install build-essential gcc g++ make -y
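
A quick way to confirm the toolchain is really on the PATH afterwards is sketched below. The helper name missing_tools is an assumption made for this example; it relies only on the standard `command -v` builtin.

```shell
#!/bin/sh
# missing_tools: print each named command that is NOT found on PATH,
# one per line; prints nothing when everything is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (not executed here):
#   missing_tools gcc g++ make
```

Empty output means all three tools were found; any name that is printed still needs to be installed.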

Once the installation finishes, move on to the next step.

5. Disable the default nouveau driver

Edit /etc/modprobe.d/blacklist.conf and append the following at the end:

blacklist nouveau
options nouveau modeset=0
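
If you would rather make this edit from the shell than in an editor, something like the sketch below could work. append_once and the SUDO variable are hypothetical names for this example; the point is that running it twice does not duplicate the lines.

```shell
#!/bin/sh
# append_once: append a line to a file unless an identical line is already there.
# SUDO is a hypothetical helper variable; set SUDO=sudo for root-owned files.
append_once() {
    file=$1
    line=$2
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" | ${SUDO:-} tee -a "$file" >/dev/null
    fi
}

# Example (not executed here):
#   SUDO=sudo append_once /etc/modprobe.d/blacklist.conf "blacklist nouveau"
#   SUDO=sudo append_once /etc/modprobe.d/blacklist.conf "options nouveau modeset=0"
```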

Update the system's initramfs image:

sudo update-initramfs -u

Reboot:

reboot

Use the following command to check whether nouveau has been disabled; no output means it worked.

lsmod | grep nouveau

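With that done, we can finally start installing CUDA!

III. Installing CUDA and cuDNN

You may be asking: we haven't installed the NVIDIA driver yet, so why are we installing CUDA already?

That is what I thought at first too. It turns out that the CUDA runfile installer installs the NVIDIA driver itself, and if you install the NVIDIA driver first and CUDA afterwards, the driver stops working again once CUDA is installed qwq (computer black magic; I can't explain why).

1. Install CUDA

Search Bing for the CUDA version you need, for example "cuda11.8". It's that simple.

Then pick the runfile that matches your machine and Ubuntu 22.04; I won't go into more detail here.

Be sure to choose runfile (local)! The deb option also works, but I recommend the runfile.

Run the command: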

wget https://developer.download.nvidia.com/compute/cuda/...
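
Since the installer is a large download, it can be worth comparing it against a checksum before running it. The sketch below assumes you have an MD5 value for the file (the download pages have typically linked a checksum list); verify_md5 is a name made up for this example and it uses md5sum from GNU coreutils.

```shell
#!/bin/sh
# verify_md5: compare a file's MD5 digest with an expected hex string.
# Returns 0 on match, 1 otherwise. Uses md5sum from GNU coreutils.
verify_md5() {
    file=$1
    expected=$2
    actual=$(md5sum "$file" | awk '{print $1}') || return 1
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Example (not executed here; the checksum is a placeholder you supply):
#   verify_md5 cuda_xxx_xxx.xxx_linux.run "<md5-from-nvidia>"
```

A mismatch usually just means the download was cut short, in which case fetching the file again is the simplest fix.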

Then run the installer:

sudo sh cuda_xxx_xxx.xxx_linux.run

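The installer takes a long time to unpack before it starts; please be patient.

Once the installer's interface appears, leave the options as they are and go straight to Install (if you want to install the Kernel, see this article: Centos7系统wget 的安装与使用详细教程。_centos7安装wget-CSDN博客. It is written for CentOS, but the directory layout should be broadly the same as on Ubuntu, so it can serve as a reference).

Installing CUDA also installs the NVIDIA driver automatically. If you need the latest driver, see further below. For how to make the nvcc -V command work (that is, adding CUDA to the environment variables), see "Flexible CUDA version switching" in the Extras section; I suggest taking it one step at a time. If you don't need flexible CUDA version switching, you can configure the environment directly by following this article: ubuntu22.04 cuda和cudnn安装和配置 - raiuny - 博客园

You can check whether the NVIDIA driver was installed successfully with: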

nvidia-smi
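
For use in scripts, the same check can be narrowed down to just the driver version. This is a rough sketch; driver_version is a hypothetical helper name, while --query-gpu and --format are regular nvidia-smi options.

```shell
#!/bin/sh
# driver_version: print the NVIDIA driver version reported by nvidia-smi,
# or return 1 with a message if nvidia-smi is not on PATH.
driver_version() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; the driver is probably not installed" >&2
        return 1
    fi
    # One line is printed per GPU; keep only the first.
    nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
}
```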

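Once that is done, we move on to the next step.

2. Install cuDNN

Find the build for your system at this link: cuDNN Archive | NVIDIA Developer

Don't copy the link from the page and wget it directly; that only downloads an index.html file. Start the download on your own machine first, copy the actual download link from there, and then wget that. Also, to avoid errors caused by an overly long file name, use the following command for the download:

wget -O cudnn-ubuntu2204.deb "<download link>"

Then install it with dpkg: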

sudo dpkg -i cudnn-ubuntu2204.deb

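After the installation you will see the message "The public cudnn-local-repo-ubuntu2204-8.9.7.29 GPG key does not appear to be installed". Just copy the command printed below that message, run it, and then run the install once more.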

sudo cp /var/cudnn-local-repo-ubuntu2204....
sudo dpkg -i cudnn-ubuntu2204.deb

cuDNN is now installed.
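
One way to double-check the result is to read the version out of the cuDNN header. The sketch below assumes the header lives at /usr/include/cudnn_version.h, which may differ on your system; cudnn_version is a name chosen for this example.

```shell
#!/bin/sh
# cudnn_version: read MAJOR.MINOR.PATCHLEVEL from a cudnn_version.h header.
# The default path is an assumption; pass the real location as $1 if it differs.
cudnn_version() {
    header=${1:-/usr/include/cudnn_version.h}
    [ -r "$header" ] || { echo "header not found: $header" >&2; return 1; }
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}
```

If the header cannot be found, the libraries may not be installed yet. As far as I know, this kind of .deb only registers a local apt repository, and the cuDNN packages themselves (for 8.9.x, libcudnn8 and libcudnn8-dev) are then installed from it with apt after a sudo apt update.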

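3. (Optional) Install the NVIDIA driver

If you want to stay on a newer NVIDIA driver version, follow these steps.

On the official website, find the driver that matches your GPU and system.

Then download and install it:

wget https://....
sudo sh NVIDIA-Linux-x86_64-xxx.xxx.run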

Check whether the installation succeeded:

nvidia-smi

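IV. Extras

1. Flexible CUDA version switching

This section is based on the article "创建多个cuda版本，可以自由切换" ("Creating multiple CUDA versions and switching freely between them"); thanks to its author!

Different model frameworks are built against different CUDA versions, so we may need to switch between CUDA versions often. This section takes that off your mind and makes switching easy!

Go to the /usr/bin directory, create a file named switch-cuda.sh, and put the following in it: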

#!/usr/bin/env bash
	
	# Copyright (c) 2018 Patrick Hohenecker
	#
	# Permission is hereby granted, free of charge, to any person obtaining a copy
	# of this software and associated documentation files (the "Software"), to deal
	# in the Software without restriction, including without limitation the rights
	# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
	# copies of the Software, and to permit persons to whom the Software is
	# furnished to do so, subject to the following conditions:
	#
	# The above copyright notice and this permission notice shall be included in all
	# copies or substantial portions of the Software.
	#
	# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
	# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
	# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
	# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
	# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
	# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
	# SOFTWARE.
	
	# author:   Patrick Hohenecker <mail@paho.at>
	# version:  2018.1
	# date:     May 15, 2018
	
	
	set -e
	
	
	# ensure that the script has been sourced rather than just executed
	if [[ "${BASH_SOURCE[0]}" = "${0}" ]]; then
	    echo "Please use 'source' to execute switch-cuda.sh!"
	    exit 1
	fi
	
	INSTALL_FOLDER="/usr/local"  # the location to look for CUDA installations at
	TARGET_VERSION=${1}          # the target CUDA version to switch to (if provided)
	
	# if no version to switch to has been provided, then just print all available CUDA installations
	if [[ -z ${TARGET_VERSION} ]]; then
	    echo "The following CUDA installations have been found (in '${INSTALL_FOLDER}'):"
	    ls -l "${INSTALL_FOLDER}" | grep -E -o "cuda-[0-9]+\\.[0-9]+$" | while read -r line; do
	        echo "* ${line}"
	    done
	    set +e
	    return
	# otherwise, check whether there is an installation of the requested CUDA version
	elif [[ ! -d "${INSTALL_FOLDER}/cuda-${TARGET_VERSION}" ]]; then
	    echo "No installation of CUDA ${TARGET_VERSION} has been found!"
	    set +e
	    return
	fi
	
	# the path of the installation to use
	cuda_path="${INSTALL_FOLDER}/cuda-${TARGET_VERSION}"
	
	# filter out those CUDA entries from the PATH that are not needed anymore
	path_elements=(${PATH//:/ })
	new_path="${cuda_path}/bin"
	for p in "${path_elements[@]}"; do
	    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
	        new_path="${new_path}:${p}"
	    fi
	done
	
	# filter out those CUDA entries from the LD_LIBRARY_PATH that are not needed anymore
	ld_path_elements=(${LD_LIBRARY_PATH//:/ })
	new_ld_path="${cuda_path}/lib64:${cuda_path}/extras/CUPTI/lib64"
	for p in "${ld_path_elements[@]}"; do
	    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
	        new_ld_path="${new_ld_path}:${p}"
	    fi
	done
	
	# update environment variables
	export CUDA_HOME="${cuda_path}"
	export CUDA_ROOT="${cuda_path}"
	export LD_LIBRARY_PATH="${new_ld_path}"
	export PATH="${new_path}"
	
	echo "Switched to CUDA ${TARGET_VERSION}."
	
	set +e
	return

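After that, we can switch CUDA versions with "source switch-cuda.sh <version>". The script checks that it is being sourced and exits with an error if it is executed directly, so it has to be run through source. The commands are:

# List the CUDA versions currently installed
source switch-cuda.sh
# Switch to a specific CUDA version, e.g. CUDA 11.8
source switch-cuda.sh 11.8

You should see that the CUDA version has been switched.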
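
The heart of the script is the filtering step: every PATH (and LD_LIBRARY_PATH) entry that points at an old CUDA installation is dropped before the new one is put in front. The sketch below isolates that idea in plain sh. It is not the script's own code, and strip_cuda_entries is an illustrative name.

```shell
#!/bin/sh
# strip_cuda_entries: print a colon-separated path list with every entry
# that starts with "<install_folder>/cuda" removed. This mirrors the
# filtering idea in switch-cuda.sh; the function name is illustrative.
# Note: install_folder is used inside a regular expression as-is.
strip_cuda_entries() {
    path_list=$1
    install_folder=${2:-/usr/local}
    printf '%s' "$path_list" | tr ':' '\n' |
        grep -v "^${install_folder}/cuda" |
        paste -s -d ':' -
}

# Example (not executed here):
#   strip_cuda_entries "$PATH"
```

After a switch, nvcc -V and echo "$CUDA_HOME" are quick ways to confirm which installation is active in the current shell.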

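That wraps up the main part of this article; the deep learning environment is now set up!

2. Using Anaconda to build virtual environments

Finally, I have to recommend an old favorite, Anaconda, which most of you have probably used. Different model frameworks need different versions of Python libraries, so you can use Anaconda to create a separate environment for each model.

Main commands:

# Create a new virtual environment
conda create -n env_name python=3.8

# Activate the virtual environment
conda activate env_name

# Leave the virtual environment
conda deactivate

# Delete the virtual environment
conda remove -n env_name --all

# List the installed environments
conda env list

# Install a package
conda install package_name

# Upgrade a package
conda update package_name

# Remove a package
conda remove package_name

# List the installed packages
conda list

# Search for available packages
conda search package_name

# Update conda and Anaconda themselves
conda update conda
conda update anaconda

# Export the virtual environment
conda env export > environment.yml

# Create a virtual environment from an environment file
conda env create -f environment.yml

# Run a health check on the environment (reports problems; needs a recent conda release)
conda doctor

# Show conda help
conda --help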
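
When these commands end up in a setup script, it helps to create an environment only if it does not exist yet. A minimal sketch, assuming the usual `conda env list` layout with the environment name in the first column; conda_env_exists is a hypothetical helper name.

```shell
#!/bin/sh
# conda_env_exists: succeed if "conda env list" shows an environment whose
# name (first column) equals $1. The function name is illustrative.
conda_env_exists() {
    conda env list | awk -v name="$1" '
        $1 == name { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example (not executed here):
#   conda_env_exists env_name || conda create -n env_name python=3.8
```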

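3. Passwordless automatic login

This section is based on the article 最香远程开发解决方案！手把手教你配置VS Code远程开发工具; thanks to its author!

Open a Windows cmd terminal and generate a key pair:
```
ssh-keygen -t rsa
```

Open the folder where the key was saved, copy the contents of id_rsa.pub, and append them to the end of the ~/.ssh/authorized_keys file on the cloud server.

Connect again; you should no longer be asked for a password.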
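
If the login still asks for a password, file permissions on the server are a common culprit. The sketch below, meant to be run on the server after uploading id_rsa.pub there, appends the key without duplicating it and sets the permissions sshd normally expects. install_pubkey is a name made up for this example.

```shell
#!/bin/sh
# install_pubkey: append a public key file to authorized_keys (skipping
# duplicates) and apply the permissions sshd normally expects.
# $1 = path to the uploaded id_rsa.pub, $2 = ssh dir (defaults to ~/.ssh).
install_pubkey() {
    pubkey=$1
    ssh_dir=${2:-$HOME/.ssh}
    [ -s "$pubkey" ] || { echo "empty or missing key: $pubkey" >&2; return 1; }
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    auth="$ssh_dir/authorized_keys"
    touch "$auth" && chmod 600 "$auth" || return 1
    # Strip carriage returns in case the file was saved with Windows line endings.
    tr -d '\r' < "$pubkey" | while IFS= read -r key || [ -n "$key" ]; do
        [ -n "$key" ] || continue
        grep -qxF -- "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
    done
}

# Example (not executed here):
#   install_pubkey ~/id_rsa.pub
```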

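That is everything in this article. Thank you for reading patiently; I hope it helps! If you have any questions, leave them in the comments!

V. Afterword

1. Reference links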

- https://blog.csdn.net/qq_32892383/article/details/141460197

- https://blog.csdn.net/qq_34972053/article/details/127689332

- https://forums.developer.nvidia.com/t/info-finished-with-code-256-error-install-of-driver-component-failed/107661/9

- https://kernel.ubuntu.com/mainline/v6.12/

- https://blog.csdn.net/bocai1215/article/details/126559197

- https://www.cnblogs.com/chua-n/p/13208414.html

- https://www.cnblogs.com/chua-n/p/13208398.html

- https://zhuanlan.zhihu.com/p/136234910

- https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=runfile_local

- https://blog.csdn.net/wjinjie/article/details/108997692

- https://blog.csdn.net/lishuaigell/article/details/124740342

- https://blog.csdn.net/weixin_43472800/article/details/128434810

- https://blog.csdn.net/onlyyoujojo/article/details/129024630

- https://www.nvidia.cn/drivers/lookup

- https://www.bilibili.com/video/BV1JH4y137sy/?spm_id_from=333.337.search-card.all.click&vd_source=228350a2a097728d450346c0a016177a

- https://www.bilibili.com/video/BV1oK411b7RP/?spm_id_from=333.337.search-card.all.click

- https://zhuanlan.zhihu.com/p/133323571

- https://blog.csdn.net/Long_xu/article/details/135039596

- https://blog.csdn.net/marleylee/article/details/70739131

- https://blog.csdn.net/qq_21095573/article/details/99736630

- https://developer.aliyun.com/mirror/

- https://zhuanlan.zhihu.com/p/251009600

- https://blog.csdn.net/Sihang_Xie/article/details/127347139

- https://blog.csdn.net/m0_63171455/article/details/139150054

- https://www.cnblogs.com/pprp/p/9430836.html

- https://blog.csdn.net/weixin_44009447/article/details/120034467

- https://blog.csdn.net/Ben__Ho/article/details/139202015

- https://www.bilibili.com/video/BV1v84y1D7Rw/?spm_id_from=333.337.search-card.all.click&vd_source=228350a2a097728d450346c0a016177a

- https://www.cnblogs.com/bile/p/12502739.html

- https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html

- https://blog.csdn.net/weixin_42301220/article/details/130078734

- https://zhuanlan.zhihu.com/p/581634820

- https://blog.csdn.net/qq_42055933/article/details/142030339

- https://www.cnblogs.com/raiuny/p/16963325.html

- https://blog.csdn.net/huiyoooo/article/details/128015155

- https://blog.csdn.net/cwjcw81/article/details/139604268

- https://www.modelscope.cn/models/Qwen/Qwen2.5-7B-Instruct/summary

- https://blog.csdn.net/hb_learing/article/details/115547461

- https://www.cnblogs.com/137point5/p/15000954.html

- https://blog.csdn.net/tiansyun/article/details/131453705

- https://blog.csdn.net/xuezhe5212/article/details/139087879

- https://blog.csdn.net/A15216110998/article/details/113402172

- https://forums.developer.nvidia.com/t/linux-6-7-3-545-29-06-550-40-07-error-modpost-gpl-incompatible-module-nvidia-ko-uses-gpl-only-symbol-rcu-read-lock/280908/6

- https://forums.developer.nvidia.com/t/nvidia-545-installation-is-getting-failed-in-my-ubuntu-20-04-and-i-have-rtx4000/297029/5

2. Special thanks

- 未音云 - https://cloud.whyyin.com/

Thanks to 未音云 for providing the cloud server service.

- 欢雨云 - https://cloud.rainly.net/

Thanks to 欢雨云 for providing the GPU server service.

From: https://blog.csdn.net/kiumk/article/details/144596473