Installing TensorFlow-gpu + CUDA 9.0 + cuDNN v7 on Ubuntu 16.04
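1. Preparation
Switch the Ubuntu 16.04 package sources to a mirror. I use the USTC (University of Science and Technology of China) mirror: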
cd /etc/apt/
sudo nano sources.list
Add the following sources at the top of the sources.list file:
deb http://mirrors.ustc.edu.cn/ubuntu/ xenial main restricted universe multiverse
deb http://mirrors.ustc.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
deb http://mirrors.ustc.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
deb http://mirrors.ustc.edu.cn/ubuntu/ xenial-proposed main restricted universe multiverse
deb http://mirrors.ustc.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
deb-src http://mirrors.ustc.edu.cn/ubuntu/ xenial main restricted universe multiverse
deb-src http://mirrors.ustc.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
deb-src http://mirrors.ustc.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
deb-src http://mirrors.ustc.edu.cn/ubuntu/ xenial-proposed main restricted universe multiverse
deb-src http://mirrors.ustc.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
Finally, refresh the package index and upgrade the installed packages:
sudo apt-get update
sudo apt-get upgrade
Point pip at the Tsinghua University mirror (https://mirrors.tuna.tsinghua.edu.cn/help/pypi/) by creating a ~/.config/pip/pip.conf file:
mkdir -p ~/.config/pip
cd ~/.config/pip
touch pip.conf
gedit pip.conf
Set its contents to:
[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
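As an alternative to editing the file by hand, the same configuration could be written by a short script. This is only a minimal sketch: the helper name write_pip_conf is hypothetical, and it assumes the per-user location ~/.config/pip/pip.conf described above unless another path is passed in.

```python
import configparser
from pathlib import Path

# Mirror URL taken from the configuration shown above.
MIRROR = "https://pypi.tuna.tsinghua.edu.cn/simple"


def write_pip_conf(path=None, index_url=MIRROR):
    """Write a pip.conf whose [global] index-url points at the mirror.

    write_pip_conf is a hypothetical helper name used for illustration.
    """
    # Default to the per-user config file location.
    if path is None:
        path = Path.home() / ".config" / "pip" / "pip.conf"
    path = Path(path)
    # Create ~/.config/pip (or the given parent directory) if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    config = configparser.ConfigParser()
    config["global"] = {"index-url": index_url}
    with path.open("w") as f:
        config.write(f)
    return path
```

Calling write_pip_conf() with no arguments writes the file as the current user, so no sudo is involved and the file stays owned by that user.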
2. Install the driver
System Settings -> Software & Updates -> Additional Drivers -> select the latest NVIDIA driver -> Apply Changes
3. Install CUDA
Download CUDA 9.0.
Once the download finishes, go to the directory where the file was saved and run:
sudo sh cuda_9.0.176_384.81_linux.run
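The installer starts with a license agreement that seems to go on for miles. Keep pressing Enter until you reach the end, then type accept. Answer the remaining prompts as follows: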
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
(y)es/(n)o/(q)uit: n
Install the CUDA 9.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-9.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 9.0 Samples?
(y)es/(n)o/(q)uit: y
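Answer n only when asked whether to install the graphics driver (it was already installed in step 2); answer y to everything else.
Configure the environment variables: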
gedit ~/.bashrc
Append these two lines at the end of the file:
export PATH=/usr/local/cuda-9.0/bin:$PATH
and
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
After saving, run:
source ~/.bashrc
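To confirm that the two exports took effect in the current shell, a small check along these lines may help. It is a sketch rather than an official tool: check_cuda_env is a hypothetical helper, and /usr/local/cuda-9.0 is assumed to be the toolkit location chosen during installation.

```python
import os
import shutil

# Assumption: the default toolkit location offered by the CUDA 9.0 installer.
CUDA_HOME = "/usr/local/cuda-9.0"


def check_cuda_env(environ=None, cuda_home=CUDA_HOME):
    """Report whether PATH and LD_LIBRARY_PATH contain the CUDA directories."""
    environ = os.environ if environ is None else environ
    path_dirs = environ.get("PATH", "").split(os.pathsep)
    lib_dirs = environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return {
        "bin_on_path": os.path.join(cuda_home, "bin") in path_dirs,
        "lib64_on_ld_library_path": os.path.join(cuda_home, "lib64") in lib_dirs,
    }


if __name__ == "__main__":
    print(check_cuda_env())
    # shutil.which returns None when nvcc cannot be found on PATH.
    print("nvcc:", shutil.which("nvcc"))
```

If either value is False, the shell probably has not re-read ~/.bashrc yet; opening a new terminal or running source ~/.bashrc again is the usual remedy.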
Build and run one of the samples bundled with CUDA:
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
If the toolkit and driver are working, the output lists the detected GPU and ends with Result = PASS.
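On a working installation, deviceQuery finishes with a "Result = PASS" line, so the check can also be scripted. The sketch below assumes the sample was built in the current directory; the function names are hypothetical.

```python
import subprocess


def device_query_passed(output):
    """Return True if the deviceQuery output contains its success marker."""
    return "Result = PASS" in output


def run_device_query(binary="./deviceQuery"):
    """Run the compiled sample and return everything it printed."""
    # Assumption: the binary sits in the current directory after `make`.
    completed = subprocess.run(
        [binary],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return completed.stdout
```

For example, device_query_passed(run_device_query()) can be evaluated from inside the deviceQuery directory.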
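4. Install cuDNN
Download it from: https://developer.nvidia.com/rdp/cudnn-download
Choose the cuDNN version that matches your CUDA version (cuDNN v7 for CUDA 9.0).
Extract the archive with tar, then copy the header and libraries into the CUDA directory: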
tar -xvf cudnn-9.0-linux-x64-v7.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
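To double-check which cuDNN version ended up in /usr/local/cuda/include, the version macros in cudnn.h can be read back. This is a minimal sketch assuming the cuDNN v7 header layout, where CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined directly in cudnn.h; the function name cudnn_version is hypothetical.

```python
import re


def cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 7".
        match = re.search(
            r"^\s*#define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)
```

A typical call would be cudnn_version(open("/usr/local/cuda/include/cudnn.h").read()); for this guide the first element should be 7.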
Before installing TensorFlow, the official TensorFlow installation guide says to install the libcupti-dev library first:
sudo apt-get install libcupti-dev
Install TensorFlow
sudo apt-get install python3-pip
pip3 install tensorflow-gpu
Start Python 3 and run a quick test:
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
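The hello-world test only proves that TensorFlow imports and runs; it does not show whether the GPU is actually being used. One way to check, sketched here under the assumption of a TensorFlow 1.x install, is to list the devices TensorFlow can see. The helper name gpu_device_names is hypothetical.

```python
def gpu_device_names():
    """Return the GPU device names TensorFlow can see, or None without TensorFlow."""
    try:
        # device_lib ships inside the TensorFlow package.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]


if __name__ == "__main__":
    print(gpu_device_names())
```

An empty list suggests that TensorFlow fell back to the CPU, which usually points at a driver, CUDA or cuDNN mismatch.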
5. Problems encountered and solutions
Problem 1:
Running sudo apt-get upgrade failed with this error:
dpkg: error processing package grub-efi-amd64-signed (--configure):
subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
grub-efi-amd64-signed
E: Sub-process /usr/bin/dpkg returned an error code (1)
Solution:
sudo apt-get purge 'grub*'
sudo apt-get install grub-efi
sudo apt-get autoremove
sudo update-grub
Problem 2:
Running import tensorflow failed with this error:
ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
Solution:
sudo ldconfig /usr/local/cuda/lib64
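After refreshing the linker cache, it may be worth checking that the CUDA runtime can now be resolved by the dynamic loader before retrying the import. The sketch below uses ctypes for that; can_load is a hypothetical helper, and the library name libcudart.so.9.0 is an assumption that should match the version named in your own error message.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Assumption: CUDA 9.0 runtime; substitute the name from your error message.
    print("libcudart.so.9.0 loadable:", can_load("libcudart.so.9.0"))
```

If this prints False even after running ldconfig, the TensorFlow build may expect a different CUDA version than the one installed.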