First, check the graphics card model on your Linux machine:
# lspci | grep -i nvidia
04:00.0 VGA compatible controller: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GP102 HDMI Audio Controller (rev a1)
Check whether any process is using the GPU (if one is, stop it):
# lsof | grep nvidia
nvidia-mo 443 root cwd DIR 253,0 254 64 /
nvidia-mo 443 root rtd DIR 253,0 254 64 /
nvidia-mo 443 root txt unknown /proc/443/exe
The graphics driver can also be installed this way (recommended):
sudo yum install nvidia-detect
nvidia-detect -v
Probing for supported NVIDIA devices...
[10de:1b06] NVIDIA Corporation GP102 [GeForce GTX 1080 Ti]
This device requires the current 440.64
yum -y install kmod-nvidia
Error: nvidia-x11-drv-390xx conflicts with nvidia-x11-drv-460.39-1.el7_9.elrepo.x86_64
Error: nvidia-x11-drv-390xx conflicts with nvidia-x11-drv-libs-460.39-1.el7_9.elrepo.x86_64
Error: nvidia-x11-drv conflicts with nvidia-x11-drv-390xx-390.138-1.el7_8.elrepo.x86_64
You could try using --skip-broken to work around the problem
** Found 2 pre-existing rpmdb problem(s), 'yum check' output follows:
dnf-4.0.9.2-1.el7_6.noarch has missing requires of python2-dnf = ('0', '4.0.9.2', '1.el7_6')
orca-3.6.3-4.el7.x86_64 has missing requires of pyatspi
Remove the conflicting packages:
yum remove -y nvidia-x11-drv-390xx-390.138-1.el7_8.elrepo.x86_64
yum remove -y nvidia-x11-drv-460.39-1.el7_9.elrepo.x86_64
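When yum reports many conflicts, it can help to pull the package names out of the error text instead of copying them by hand. The following is a minimal sketch, not part of the original steps: the function name conflicting_packages is hypothetical, and it assumes every conflict line ends with the full package name after "conflicts with", as in the output above.

```shell
# Hypothetical helper: read yum error output on stdin and print the
# unique package names that appear after "conflicts with".
conflicting_packages() {
    # $NF is the last whitespace-separated field on each matching line.
    awk '/ conflicts with / { print $NF }' | LC_ALL=C sort -u
}
```

Saved yum output can be piped through it, for example `conflicting_packages < yum-error.log` (the file name is only an illustration). Review the list before passing any of it to yum remove.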
Uninstall the driver:
sudo yum remove kmod-nvidia
# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
# nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
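Alternatively, look up the matching driver on the NVIDIA download page and install it from the .run file:
http://www.nvidia.cn/Download/Find.aspx?lang=cn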
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/440.64/NVIDIA-Linux-x86_64-440.64.run
sudo chmod a+x NVIDIA-Linux-x86_64-440.64.run
./NVIDIA-Linux-x86_64-440.64.run
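The version that nvidia-detect reported earlier and the download URL above follow a visible pattern, so the two can be tied together. This is only a sketch: the function names required_version and nvidia_run_url are hypothetical, the parser matches only the "requires the current" wording shown above (other cards may be reported with different wording), and the URL pattern is simply copied from the wget line above, so it may not hold for every driver version.

```shell
# Hypothetical helper: read nvidia-detect output on stdin and print the
# driver version from the "This device requires the current X" line.
required_version() {
    sed -n 's/^This device requires the current \([0-9.]*\).*/\1/p'
}

# Hypothetical helper: build the installer URL for a given version,
# assuming the same URL layout as the wget command above.
nvidia_run_url() {
    version="$1"
    printf 'https://us.download.nvidia.com/XFree86/Linux-x86_64/%s/NVIDIA-Linux-x86_64-%s.run\n' \
        "$version" "$version"
}
```

A possible use is `wget "$(nvidia_run_url "$(nvidia-detect -v | required_version)")"`, after checking that the printed version matches what the download page lists for the card.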
# nvidia-smi
ERROR: An NVIDIA kernel module 'nvidia-drm' appears to already be loaded in your kernel.
# sudo systemctl isolate multi-user.target
# sudo modprobe -r nvidia-drm
modprobe: FATAL: Module nvidia_drm is in use.
sudo modprobe -r nvidia-modeset
# lsmod | grep nvidia.drm
nvidia_drm 43547 2
nvidia_modeset 1053327 1 nvidia_drm
drm_kms_helper 186531 1 nvidia_drm
drm 456166 5 drm_kms_helper,nvidia_drm
Run lsmod | grep nvidia.drm and look at the numbers to the right of the nvidia_drm module name. The first number is simply the size of the module; the second is the use count.
If the X11 server is running and using the nvidia driver, then the nvidia_drm kernel module will almost certainly be in use. So you will need, at the very least, to switch to a text console and shut down the X11 server. Usually this can be done by stopping whichever X display manager service you are using (which one depends on your desktop environment).
As the error message says, if you are running nvidia-persistenced, you will need to stop that too before you can unload the nvidia_drm module.
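The use count described above can be read out directly instead of by eye. A minimal sketch, assuming the usual lsmod column layout (module name, size, use count); the function name module_use_count is hypothetical, and it reads lsmod-formatted text on stdin so it can also be run against saved output.

```shell
# Hypothetical helper: print the use count (third column of lsmod output)
# for the module named in $1, or "not loaded" if it does not appear.
module_use_count() {
    awk -v mod="$1" '
        $1 == mod { print $3; found = 1 }
        END { if (!found) print "not loaded" }
    '
}
```

For example, `lsmod | module_use_count nvidia_drm` prints 2 for the listing above. Once the X server and any other users of the module are stopped and the count drops to 0, modprobe -r nvidia-drm should no longer report that the module is in use.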
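In this case an Xvnc server was still running (PID 17080 in top). kill expects a PID rather than a process name, so kill it by PID:
17080 root 20 0 519316 214832 47908 S 6.3 0.1 5421:48 Xvnc
kill -9 17080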
ps aux | grep nvidia
root 443 0.0 0.0 0 0 ? S 2020 0:00 [nvidia-modeset]
root 8197 0.0 0.0 112832 984 pts/0 S+ 22:01 0:00 grep --color=auto nvidia