Pitfalls when installing the NVIDIA driver

Note: some of the code and techniques in this post may be out of date. Last updated: 2 years ago.

As the title says.

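The driver on the server used to work fine. The CUDA Toolkit had previously been installed through conda, and it now needed to be installed system-wide by an administrator.
CUDA Toolkit download: https://developer.nvidia.com/cuda-toolkit-archive
Pick the file that matches your system. Note: prefer the runfile download where possible, because the runfile installer lets you choose which components to install and which to skip. The rpm package may install the whole bundle, GPU driver included. I installed from the rpm package (and carelessly grabbed the latest version), so the driver was upgraded and the GPU promptly stopped working.
There was no choice but to reinstall the driver.
Check the GPU driver version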

cat /proc/driver/nvidia/version
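
If only the version number is needed (for example, to look up the matching download), it can be pulled out of that file. The following is a minimal sketch, assuming the file contains an `NVRM version:` line in which the version appears as a dotted number such as 515.57. The function name `nvidia_driver_version` and its optional file argument are hypothetical and exist only for illustration.

```shell
# Print the driver version found on the "NVRM version:" line.
# $1 (optional, hypothetical override for testing) is the file to read;
# it defaults to /proc/driver/nvidia/version.
nvidia_driver_version() {
    awk '/^NVRM version:/ {
        for (i = 1; i <= NF; i++) {
            # Version tokens are assumed to look like 515.57 or 535.54.03.
            if ($i ~ /^[0-9]+\.[0-9]+(\.[0-9]+)?$/) { print $i; exit }
        }
    }' "${1:-/proc/driver/nvidia/version}"
}

# Usage:
#   nvidia_driver_version
```

If the driver module is not loaded, the file does not exist and awk reports an error rather than printing a version.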

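Download the driver: http://www.nvidia.com/Download/Find.aspx
Choose the driver version that matches your GPU. The version I used and the install commands are shown below; replace the kernel directory 3.10.0-1062.18.1.el7.x86_64 with your own.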

# NVIDIA-Linux-x86_64-515.57.run
chmod +x NVIDIA-Linux-x86_64-515.57.run
sudo ./NVIDIA-Linux-x86_64-515.57.run --kernel-source-path=/usr/src/kernels/3.10.0-1062.18.1.el7.x86_64 -k $(uname -r)
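
Since the kernel directory differs per machine, it may help to confirm that it exists before starting the installer. This is a rough sketch, assuming the kernel sources live under /usr/src/kernels/<release> as in the command above. The helper name `kernel_source_path` and its two optional arguments are hypothetical.

```shell
# Print the kernel source directory for a kernel release, or fail if missing.
# Both arguments are optional and exist mainly for testing:
#   $1  kernel release (default: the running kernel, from uname -r)
#   $2  base directory (default: /usr/src/kernels, an assumed layout)
kernel_source_path() {
    release="${1:-$(uname -r)}"
    base="${2:-/usr/src/kernels}"
    if [ -d "$base/$release" ]; then
        printf '%s\n' "$base/$release"
    else
        printf 'no kernel sources for %s under %s\n' "$release" "$base" >&2
        return 1
    fi
}

# Usage (runs the installer only when the directory was found):
#   src=$(kernel_source_path) &&
#     sudo ./NVIDIA-Linux-x86_64-515.57.run --kernel-source-path="$src" -k "$(uname -r)"
```

A failure here usually means the sources for the running kernel are not installed, or are installed for a different kernel release.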

I thought everything would go smoothly, and then it failed with an error.

Pitfall 1
ERROR: You appear to be running an X server; please exit X before            
installing. For further details, please see the section INSTALLING
THE NVIDIA DRIVER in the README available on the Linux driver
download page at www.nvidia.com.

After a long search online I found a method that works: https://unix.stackexchange.com/questions/25668/how-to-close-x-server-to-avoid-errors-while-updating-nvidia-driver

sudo init 3
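
On systemd-based systems (the el7 kernel string above suggests CentOS 7, though that is an assumption), SysV runlevels correspond to systemd targets, and runlevel 3 is the text-only multi-user target. The sketch below only spells out that mapping. The function name `runlevel_to_target` is hypothetical.

```shell
# Map a SysV runlevel number to the corresponding systemd target name.
runlevel_to_target() {
    case "$1" in
        0)     echo poweroff.target ;;
        1)     echo rescue.target ;;
        2|3|4) echo multi-user.target ;;
        5)     echo graphical.target ;;
        6)     echo reboot.target ;;
        *)     echo "unknown runlevel: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   sudo systemctl isolate "$(runlevel_to_target 3)"   # leave X before installing
#   sudo systemctl isolate "$(runlevel_to_target 5)"   # return to the desktop afterwards
```
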
Pitfall 2
ERROR: An NVIDIA kernel module 'nvidia-uvm' appears to already be loaded in your kernel.  This may be because it is
in use (for example, by an X server, a CUDA program, or the NVIDIA Persistence Daemon), but this may also happen if
your kernel was configured without support for module unloading. Please be sure to exit any programs that may be
using the GPU(s) before attempting to upgrade your driver. If no GPU-based programs are running, you know that your
kernel supports module unloading, and you still receive this message, then an error may have occurred that has
corrupted an NVIDIA kernel module's usage count, for which the simplest remedy is to reboot your computer.

For the fix, see https://www.cnblogs.com/1016391912pm/p/16494815.html

# List the processes that are using the GPU devices
lsof /dev/nvidia*
# Kill every process listed (replace xxx with each PID)
kill -9 xxx
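
When several processes hold the devices open, copying PIDs by hand gets tedious. The following is a small sketch, assuming the default `lsof` layout (one header line, PID in the second column). The helper name `pids_from_lsof` is hypothetical.

```shell
# Read `lsof` output on stdin and print each PID once, in numeric order.
# Assumes the default lsof layout: a header line, then the PID in column 2.
pids_from_lsof() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }' | sort -un
}

# Usage:
#   pids=$(lsof /dev/nvidia* | pids_from_lsof)
#   [ -n "$pids" ] && kill $pids       # try a plain SIGTERM first
#   [ -n "$pids" ] && kill -9 $pids    # force only what is still running
```

Trying a plain `kill` before `kill -9` gives each process a chance to exit cleanly. That ordering is a design choice of this sketch, not part of the original fix.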

After that, running the driver install command again finally got past this point.

References: https://www.cnblogs.com/gollong/p/12655424.html
https://www.cnblogs.com/shenggang/p/12133220.html
https://eipi10.cn/deep-learning/2019/11/28/centos_cuda_cudnn/


Unless otherwise stated, all posts on this blog are licensed under CC BY-SA 4.0. Please credit the source when reposting.