Setting Up a Deep Learning Environment on Ubuntu

On Arch Linux, a single command installs the GPU driver, the CUDA Toolkit, and cuDNN:
paru -S cudnn

Installing the GPU driver

First, press Ctrl + Alt + F1 to switch to a text console.

1. Remove the existing driver

sudo apt-get purge 'nvidia*'
sudo apt-get autoremove
sudo ./NVIDIA-Linux-x86_64-384.59.run --uninstall

2. Install dependencies

sudo apt-get install build-essential gcc-multilib dkms

3. Disable the nouveau driver

Edit /etc/modprobe.d/blacklist-nouveau.conf and add the following lines:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off

Turn off kernel mode setting for nouveau:
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
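
The blacklist entries listed in this step can also be generated from a script instead of being typed by hand. The following is a minimal sketch; the function name nouveau_blacklist is hypothetical, and the entries are exactly the five lines given above.

```shell
# Hypothetical helper: print the blacklist entries listed in this step.
nouveau_blacklist() {
    cat <<'EOF'
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
EOF
}

# Usage (needs root, so it is left commented out):
#   nouveau_blacklist | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
```

Printing to stdout and piping into sudo tee keeps the helper itself free of root privileges and makes the output easy to inspect first.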

4. Update the initramfs and reboot

sudo update-initramfs -u
sudo reboot
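
After the reboot it may be worth confirming that nouveau is really gone before continuing. The sketch below assumes lsmod is available; the helper name module_listed is hypothetical and simply looks for a module name in the first column of lsmod-style output.

```shell
# Hypothetical helper: succeed if the module named $1 appears in
# lsmod-style output read from stdin (the module name is the first column).
module_listed() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Only query the running system when lsmod exists.
if command -v lsmod >/dev/null 2>&1; then
    if lsmod | module_listed nouveau; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
fi
```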

5. Get the kernel source

sudo apt-get install linux-source
sudo apt-get install linux-headers-x.x.x-x-generic

Here x.x.x-x-generic can be replaced with $(uname -r), which gives linux-headers-$(uname -r).
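
The $(uname -r) substitution can be wrapped in a small function so the package name is easy to inspect before installing. This is only a sketch; headers_pkg is a hypothetical name, and the optional argument exists purely so the function can be tried with an arbitrary kernel release string.

```shell
# Hypothetical helper: build the headers package name for a kernel release.
# With no argument it uses the running kernel, as reported by uname -r.
headers_pkg() {
    printf 'linux-headers-%s\n' "${1:-$(uname -r)}"
}

# Usage (needs root and network access, so it is left commented out):
#   sudo apt-get install "$(headers_pkg)"
```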

6. Stop the X display manager

Stop whichever display manager is running:

sudo systemctl stop lightdm    # or: sudo service lightdm stop
sudo systemctl stop gdm
sudo systemctl stop kdm
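
Rather than trying all three services, the active display manager can often be looked up. The sketch below assumes a Debian or Ubuntu system that records the default display manager in /etc/X11/default-display-manager; that file may be absent on other setups, and the helper name default_dm is hypothetical.

```shell
# Hypothetical helper: print the display manager name recorded in the
# given file (default: /etc/X11/default-display-manager).
default_dm() {
    dm_file="${1:-/etc/X11/default-display-manager}"
    [ -r "$dm_file" ] || return 1
    # The file holds a path such as /usr/sbin/lightdm; keep the last component.
    basename "$(head -n 1 "$dm_file")"
}

# Usage (left commented out because it stops the graphical session):
#   sudo systemctl stop "$(default_dm)"
```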

7. Install the NVIDIA driver

Download the driver version that matches your GPU from https://www.nvidia.cn/Download/index.aspx?lang=cn

sudo chmod +x NVIDIA*.run
sudo ./NVIDIA-Linux-x86_64-384.59.run

8. Verify the GPU driver

nvidia-smi
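
For a scripted check, the driver version can be read from nvidia-smi and compared with the installer version used in this guide. The following is a sketch under two assumptions: GNU sort with the -V option is available, and the helper name version_ge is hypothetical.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
# Relies on the version sort (-V) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 384.59 is the installer version used in this guide.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$driver" 384.59; then
        echo "driver $driver is installed"
    else
        echo "driver $driver is older than expected"
    fi
fi
```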

Installing the CUDA Toolkit

1. Download

https://developer.nvidia.com/cuda-downloads
The Fedora build can also be used on Arch Linux.

2. Install

chmod u+x cudxxxxxxxxxxxx
sudo ./cudxxxxxxxxxxxx

3. Verify

nvcc -V
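
If nvcc is not found, the toolkit's bin directory is probably not on PATH yet. The sketch below assumes the toolkit is reachable under /usr/local/cuda (the path used later in this guide) and uses a hypothetical helper, nvcc_release, to pull the release number out of the nvcc -V output.

```shell
# Hypothetical helper: extract the release number (for example 11.6)
# from the output of nvcc -V read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\).*/\1/p'
}

# Assumption: the toolkit lives under /usr/local/cuda.
export PATH="/usr/local/cuda/bin:$PATH"

# Only call nvcc when it can actually be found.
if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA release: $(nvcc -V | nvcc_release)"
fi
```

To make the PATH change permanent, the export line can be added to a shell startup file such as ~/.bashrc.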

Installing cuDNN (CUDA acceleration library)

1. Download

https://developer.nvidia.com/rdp/cudnn-archive

2. Copy the cuDNN header files

sudo cp include/* /usr/local/cuda-11.6/include/

3. Copy the cuDNN libraries

sudo cp lib/* /usr/local/cuda-11.6/lib64/

4. Add execute permission

sudo chmod +x /usr/local/cuda-11.6/include/cudnn.h
sudo chmod +x /usr/local/cuda-11.6/lib64/libcudnn*

5. Verify

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
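
The grep above prints the raw #define lines. To get a single version string, the three macros can be combined; the sketch below assumes the header defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL as plain integers, and cudnn_version is a hypothetical helper name.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from a cudnn_version.h file.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}

# Only read the header when it is present on this machine.
header=/usr/local/cuda/include/cudnn_version.h
if [ -r "$header" ]; then
    cudnn_version "$header"
fi
```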

Switching CUDA versions

Change to the install directory

cd /usr/local

Remove the existing cuda symlink

sudo rm -rf cuda

Create the new cuda symlink

sudo ln -s cuda-9.0 cuda

Verify

nvcc -V
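
The steps above can be collected into one function that also guards against two easy mistakes: pointing the link at a toolkit that is not installed, and deleting a real directory named cuda. This is a sketch only; switch_cuda is a hypothetical name, and the optional second argument (the install prefix) exists so the function can be tried outside /usr/local.

```shell
# Hypothetical helper: point <prefix>/cuda at <prefix>/cuda-<version>.
# The prefix defaults to /usr/local, as in the steps above.
switch_cuda() {
    version="$1"
    prefix="${2:-/usr/local}"
    if [ ! -d "$prefix/cuda-$version" ]; then
        echo "no such toolkit: $prefix/cuda-$version" >&2
        return 1
    fi
    # Refuse to touch a real directory; only a symlink is replaced.
    if [ -e "$prefix/cuda" ] && [ ! -L "$prefix/cuda" ]; then
        echo "$prefix/cuda is not a symlink" >&2
        return 1
    fi
    # -n replaces the link itself instead of descending into its target.
    ln -sfn "cuda-$version" "$prefix/cuda"
}

# Usage (must run as root for /usr/local, so it is left commented out):
#   switch_cuda 9.0
```

Using ln -sfn replaces the link in one step, so there is no moment at which /usr/local/cuda is missing.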

References

https://wiki.archlinux.org/title/NVIDIA_(简体中文)
https://blog.csdn.net/qq_40907977/article/details/115305634
https://blog.csdn.net/public669/article/details/98470857
https://blog.csdn.net/bigconvience/article/details/8782668