nvidia-smi Failed to initialize NVML: Driver/library version mismatch
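Cause: the NVIDIA kernel module that is currently loaded does not match the version of the NVIDIA driver installed on the system. The fix is to unload the stale kernel modules so that the matching ones can be loaded.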
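To confirm the mismatch before unloading anything, the version of the loaded module can be compared with the version of the module on disk. The sketch below assumes that the on-disk module was updated together with the user-space libraries, and that the loaded module exposes its version in /sys/module/nvidia/version. The function names are made up for illustration.

```shell
#!/bin/sh
# Print the version of the NVIDIA kernel module that is loaded right now
# (prints nothing if the module is not loaded).
loaded_version() {
    cat /sys/module/nvidia/version 2>/dev/null
}

# Print the version of the module file that modprobe would load from disk.
installed_version() {
    modinfo -F version nvidia 2>/dev/null
}

# compare_versions LOADED INSTALLED: classify the pair of version strings.
compare_versions() {
    if [ -z "$1" ]; then
        echo "not loaded"
    elif [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

compare_versions "$(loaded_version)" "$(installed_version)"
```

A result of "mismatch" means that the running module is older or newer than the installed one, which is the situation the steps below deal with.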
Trying to unload the nvidia module directly fails:
sudo rmmod nvidia
rmmod: ERROR: Module nvidia is in use by: nvidia_modeset nvidia_uvm
First, check how the kernel modules depend on each other. The error message already shows that nvidia_modeset and nvidia_uvm depend on nvidia, so they have to be unloaded first.
lsmod | grep nvidia
nvidia_uvm 769582 0
nvidia_drm 43547 2
nvidia_modeset 1053327 1 nvidia_drm
nvidia 15764359 2 nvidia_modeset,nvidia_uvm
drm_kms_helper 186531 1 nvidia_drm
drm 456166 5 drm_kms_helper,nvidia_drm
ipmi_msghandler 56728 4 ipmi_ssif,ipmi_devintf,nvidia,ipmi_si
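The last column of the lsmod output lists the modules that use each module, so the unload order can be derived from it instead of being worked out by hand. The following is a minimal sketch, assuming the usual four-column lsmod layout; dependents_of and unload_order are hypothetical helper names.

```shell
#!/bin/sh
# dependents_of MODULE: read `lsmod` output on stdin and print the modules
# listed in MODULE's "Used by" column, one per line.
dependents_of() {
    awk -v mod="$1" '$1 == mod && NF >= 4 {
        n = split($4, deps, ",")
        for (i = 1; i <= n; i++) print deps[i]
    }'
}

# unload_order MODULE SNAPSHOT: print MODULE and everything that depends on
# it, dependents first. SNAPSHOT is the text of `lsmod`.
unload_order() {
    for dep in $(printf '%s\n' "$2" | dependents_of "$1"); do
        unload_order "$dep" "$2"
    done
    printf '%s\n' "$1"
}

# Print each module once, in an order that is safe for rmmod.
unload_order nvidia "$(lsmod)" | awk '!seen[$0]++'
```

For the lsmod output shown above, this gives nvidia_drm, nvidia_modeset, nvidia_uvm, nvidia.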
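List the processes that have the NVIDIA device files open:
sudo lsof -n -w /dev/nvidia*
Take note of these processes. If unloading fails later, remember to stop them.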
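The lsof output has one line per open file, so the same process usually shows up several times. A small filter can reduce it to one line per process. This is only a sketch: holders is a hypothetical helper, and it assumes the default lsof column layout with COMMAND first and PID second.

```shell
#!/bin/sh
# holders: read `lsof` output on stdin and print one "PID COMMAND" line per
# process, skipping the header row and repeated entries.
holders() {
    awk 'NR > 1 && !seen[$2]++ { print $2, $1 }'
}

# Only query the devices if they exist; otherwise the glob stays literal.
if ls /dev/nvidia* >/dev/null 2>&1; then
    sudo lsof -n -w /dev/nvidia* | holders
fi
```

The PIDs printed here are the ones to stop if rmmod later reports that a module is in use.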
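Now unload the modules, dependents first. If rmmod reports that a module is in use, as in the output below, stop the processes that still hold the devices (run sudo lsof /dev/nvidia* again to find them) and retry: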
sudo rmmod nvidia_drm
rmmod: ERROR: Module nvidia_drm is in use
sudo rmmod nvidia_modeset
rmmod: ERROR: Module nvidia_modeset is in use by: nvidia_drm
sudo rmmod nvidia_uvm
sudo rmmod nvidia
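The unload sequence can also be wrapped in a small helper that stops at the first failure, since the remaining modules cannot be removed while a dependent is still loaded. This is a sketch rather than a finished tool: unload_modules and the RMMOD variable are names invented for this example, and the module list is the one from the lsmod output above.

```shell
#!/bin/sh
# unload_modules CMD MODULE...: run CMD on each module in order and stop at
# the first failure.
unload_modules() {
    cmd=$1
    shift
    for mod in "$@"; do
        # $cmd is left unquoted on purpose so "sudo rmmod" splits into words.
        if ! $cmd "$mod"; then
            echo "could not unload $mod; stop the processes using it and retry" >&2
            return 1
        fi
    done
}

# Dry run by default: print the commands instead of running them.
# RMMOD is a hypothetical variable; set RMMOD="sudo rmmod" to unload for real.
unload_modules "${RMMOD:-echo rmmod}" nvidia_drm nvidia_modeset nvidia_uvm nvidia
```

Running the script as is only prints the four rmmod commands. Running it with RMMOD="sudo rmmod" in the environment performs the unload.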
lsmod | grep nvidia
This should print nothing. Then confirm that the correct driver loads:
nvidia-smi