[caffe] Notes on setting up Caffe (GPU)

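The setup process has to be written down; otherwise I will fall into the same pitfalls, just as confused, on the next machine.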

Hardware and existing environment

  • OS: Ubuntu 16.04
  • GPU: NVIDIA GeForce 940MX
  • Python version: 3.5.2 (system default, no Anaconda)
  • OpenCV version: 3.4.1
  • protobuf version: 3.6.0 -> 3.3.0 (downgraded later)

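The GPU model can be checked with lspci | grep -i nvidia. My GPU driver was already installed, so I skipped that step; if you need it, follow a driver installation tutorial. The installed driver packages can be listed with dpkg --list | grep -i nvidia, and the OpenCV version with pkg-config --modversion opencv.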
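The three checks above can be bundled into one helper that skips any tool that is not installed. This is only a sketch: have_cmd and report_env are illustrative names, not part of any package.

```shell
# Return success if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print GPU, driver package and OpenCV information, skipping missing tools.
report_env() {
    if have_cmd lspci; then
        echo "== GPU =="
        lspci | grep -i nvidia || echo "no NVIDIA device listed"
    fi
    if have_cmd dpkg; then
        echo "== driver packages =="
        dpkg --list | grep -i nvidia || echo "no nvidia packages installed"
    fi
    if have_cmd pkg-config; then
        echo "== OpenCV =="
        pkg-config --modversion opencv || echo "opencv.pc not found"
    fi
}
```

Calling report_env after sourcing the snippet prints whichever sections apply to the machine.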

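Installing CUDA

To use the GPU, you first have to install CUDA, NVIDIA's parallel computing platform for its GPUs.
CUDA download page on the NVIDIA website
cuda-repo-ubuntu1604_10.0.130-1_amd64.deb

Install it following the instructions on the official site: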

sudo dpkg -i cuda-repo-ubuntu1604_10.0.130-1_amd64.deb 
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

Edit ~/.bashrc:

gedit ~/.bashrc

Append the following at the end (the version in the path must match the installed CUDA, here 10.0):

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Reload bashrc:

source ~/.bashrc
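The ${PATH:+:${PATH}} form in the export lines appends a colon and the old value only when the variable is already set and non-empty. That avoids a trailing colon, which the shell and the dynamic linker would treat as an empty entry meaning the current directory. A small sketch of the same expansion, with prepend_dir and /opt/example/bin as made-up names for illustration:

```shell
# prepend_dir DIR CURRENT
# Prints DIR, followed by ":CURRENT" only when CURRENT is non-empty.
prepend_dir() {
    dir=$1
    current=$2
    printf '%s\n' "${dir}${current:+:${current}}"
}

# Usage:
#   prepend_dir /opt/example/bin ""               prints /opt/example/bin
#   prepend_dir /opt/example/bin "/usr/bin:/bin"  prints /opt/example/bin:/usr/bin:/bin
```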

Check the CUDA version:

cat /usr/local/cuda/version.txt
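If the version is needed in a script (for example to pick the matching cuDNN download), it can be extracted from that file. A minimal sketch, assuming the file holds a line of the form "CUDA Version 10.0.130" as CUDA 10.0 writes it; cuda_version is a hypothetical helper name:

```shell
# Print only the version number from a CUDA version.txt file.
# Defaults to the standard location when no file is given.
cuda_version() {
    file=${1:-/usr/local/cuda/version.txt}
    sed -n 's/^CUDA Version \([0-9][0-9.]*\).*$/\1/p' "$file"
}
```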

Test a CUDA sample:

cd /usr/local/cuda-10.0/samples/5_Simulations/nbody/
make
sudo ./nbody


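Installing cuDNN

cuDNN is NVIDIA's acceleration library for deep neural networks (DNNs).

Download:

cuDNN download page
Note: you need to register or log in before downloading.
Since the CUDA version is 10.0, choose cuDNN v7.3.1 for CUDA 10.0 and download cuDNN v7.3.1 Library for Linux.

Extract: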

sudo tar -zxvf ./cudnn-10.0-linux-x64-v7.3.1.20.solitairetheme8

Copy the header file:

cd cuda/include
sudo cp cudnn.h /usr/local/cuda/include 

Copy the shared libraries, delete the existing library links, and create new symlinks so that they take effect:

cd ../lib64
sudo cp lib* /usr/local/cuda/lib64

cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.7

sudo ln -s libcudnn.so.7.3.1 libcudnn.so.7
sudo ln -s libcudnn.so.7 libcudnn.so

sudo ldconfig -v
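To confirm that the symlink chain ends at a real library file, the links can be resolved and checked. This is a sketch only; cudnn_link_target is a hypothetical helper, and readlink -f is the GNU coreutils form available on Ubuntu.

```shell
# Resolve libcudnn.so in the given directory and print the file it ends at.
# Fails if the chain is dangling or the link is missing.
cudnn_link_target() {
    libdir=${1:-/usr/local/cuda/lib64}
    target=$(readlink -f "$libdir/libcudnn.so") || return 1
    [ -f "$target" ] || return 1
    basename "$target"
}
```

With the links created as above, cudnn_link_target should print libcudnn.so.7.3.1.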

Check the cuDNN version:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2

The output shows #define CUDNN_MAJOR 7
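The grep above prints three separate #define lines. To get a single version string instead, the fields can be collected with awk. A sketch, assuming the cuDNN 7 header layout with CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL defined in cudnn.h; cudnn_version is an illustrative name:

```shell
# Print the cuDNN version as MAJOR.MINOR.PATCH from a cudnn.h header.
cudnn_version() {
    header=${1:-/usr/local/cuda/include/cudnn.h}
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$header"
}
```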

Installing Caffe

Caffe is a deep learning framework commonly used for video and image processing.

Download Caffe:

Caffe on GitHub

git clone https://github.com/BVLC/caffe.git
cd caffe

Copy the configuration template to create the config file:

cp Makefile.config.example Makefile.config
gedit Makefile.config
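The post does not list which settings were changed in the editor. For a GPU build with cuDNN against OpenCV 3, the usual candidates are the commented-out USE_CUDNN := 1 and OPENCV_VERSION := 3 lines of the template; treating those as the edits is an assumption. A sketch of uncommenting them from the shell, with enable_option as a hypothetical helper:

```shell
# enable_option FILE NAME
# Uncomment a line of the form "# NAME := VALUE" in FILE.
# Other lines, including other commented options, are left untouched.
enable_option() {
    file=$1
    name=$2
    sed "s/^#[[:space:]]*\(${name}[[:space:]]*:=.*\)\$/\1/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Usage (assumed settings, adjust to your build):
#   enable_option Makefile.config USE_CUDNN
#   enable_option Makefile.config OPENCV_VERSION
```

CPU_ONLY stays commented out, which is what a GPU build needs.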

Edit Makefile.config and Makefile, then build:

make clean
make all
make test
make runtest

1. If hdf5-related errors appear

fatal error: hdf5.h: No such file or directory

/usr/bin/ld: cannot find -lhdf5_hl
/usr/bin/ld: cannot find -lhdf5

(1) In the Makefile, change

##############################
# Derive include and lib directories
##############################
LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_hl hdf5

to:

##############################
# Derive include and lib directories
##############################
LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial

(2) Add the hdf5 paths to Makefile.config, changing

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

to

INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial
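On Ubuntu 16.04 the hdf5 packages ship the serial libraries as libhdf5_serial and libhdf5_serial_hl, which is why the plain -lhdf5 names are not found. The Makefile rename can also be scripted. A minimal sketch, assuming the library names sit at the end of a LIBRARIES line as shown above; use_serial_hdf5 is an illustrative helper name:

```shell
# use_serial_hdf5 MAKEFILE
# Rewrite "hdf5_hl hdf5" at the end of a line to the serial library names.
# Running it a second time changes nothing.
use_serial_hdf5() {
    makefile=$1
    sed 's/hdf5_hl hdf5[[:space:]]*$/hdf5_serial_hl hdf5_serial/' "$makefile" > "$makefile.tmp" &&
        mv "$makefile.tmp" "$makefile"
}
```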

2. If undefined reference errors for cv::imdecode… appear

Change the Makefile to:

##############################
# Derive include and lib directories
##############################
LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs opencv_videoio
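The extra libraries are needed because OpenCV 3 moved image reading and decoding out of highgui into imgcodecs, and video I/O into videoio. The following sketch picks the library list from the version string reported by pkg-config; opencv_libs is a made-up helper, and the list mirrors the one used in this post rather than anything Caffe generates itself.

```shell
# opencv_libs VERSION
# Print the OpenCV libraries to link for the given version string.
opencv_libs() {
    major=${1%%.*}
    libs="opencv_core opencv_highgui opencv_imgproc"
    if [ "$major" -ge 3 ]; then
        # imgcodecs and videoio were split out of highgui in OpenCV 3.
        libs="$libs opencv_imgcodecs opencv_videoio"
    fi
    printf '%s\n' "$libs"
}

# Usage:
#   opencv_libs "$(pkg-config --modversion opencv)"
```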

3. If nvcc fatal : Unsupported gpu architecture 'compute_20' appears

Set CUDA_ARCH in Makefile.config according to your CUDA version:

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
# For CUDA >= 9.0, comment the *_20 and *_21 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
                -gencode arch=compute_20,code=sm_21 \
                -gencode arch=compute_30,code=sm_30 \
                -gencode arch=compute_35,code=sm_35 \
                -gencode arch=compute_50,code=sm_50 \
                -gencode arch=compute_52,code=sm_52 \
                -gencode arch=compute_60,code=sm_60 \
                -gencode arch=compute_61,code=sm_61 \
                -gencode arch=compute_61,code=compute_61 

Since my CUDA version is 10.0, I changed it to

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 through *_61 lines for compatibility.
# For CUDA < 8.0, comment the *_60 and *_61 lines for compatibility.
# For CUDA >= 9.0, comment the *_20 and *_21 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
                -gencode arch=compute_35,code=sm_35 \
                -gencode arch=compute_50,code=sm_50 \
                -gencode arch=compute_52,code=sm_52 \
                -gencode arch=compute_60,code=sm_60 \
                -gencode arch=compute_61,code=sm_61 \
                -gencode arch=compute_61,code=compute_61  
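Each -gencode entry pairs a virtual architecture (compute_XX) with a real one (sm_XX), and the final code=compute_XX entry embeds PTX so that newer GPUs can still JIT-compile the kernels. If you would rather generate the list than edit it by hand, the pattern can be sketched as below. gencode_flags is a hypothetical helper, and the capability numbers passed to it are whatever your CUDA version and GPU support.

```shell
# gencode_flags CC...
# Print one -gencode flag per compute capability (e.g. 50 61),
# plus a PTX entry for the last capability listed.
gencode_flags() {
    flags=""
    last=""
    for cc in "$@"; do
        flags="$flags -gencode arch=compute_$cc,code=sm_$cc"
        last=$cc
    done
    if [ -n "$last" ]; then
        flags="$flags -gencode arch=compute_$last,code=compute_$last"
    fi
    # Strip the leading space left by the first concatenation.
    printf '%s\n' "${flags# }"
}
```

For example, gencode_flags 30 35 50 52 60 61 reproduces the seven entries of the CUDA 10.0 setting above on a single line.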

4. If error This file requires compiler and library support for the ISO C++ 2011 standard appears

If adding "set(CXX_STANDARD 11)" to CMakeLists.txt does not help, try downgrading protobuf.

wget https://github.com/google/protobuf/archive/v3.3.0.zip
unzip v3.3.0.zip
cd protobuf-3.3.0/
./autogen.sh
./configure
make
make check
sudo make install
sudo ldconfig
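Once the downgraded protobuf is installed, it is worth confirming which protoc the shell actually picks up before rebuilding Caffe, since an older 3.6.0 binary earlier on PATH would still win. A small sketch, relying on protoc --version printing a line such as "libprotoc 3.3.0"; protoc_version and protoc_is are illustrative names:

```shell
# Print the version number reported by the protoc found on PATH.
protoc_version() {
    protoc --version 2>/dev/null | awk '{ print $2 }'
}

# protoc_is VERSION
# Succeed only if the protoc on PATH reports exactly VERSION.
protoc_is() {
    [ "$(protoc_version)" = "$1" ]
}

# Usage:
#   protoc_is 3.3.0 || echo "unexpected protoc: $(protoc_version)"
```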

Reference blog post

Caffe test example: MNIST

cd caffe/data/mnist
./get_mnist.sh
cd ../../
./examples/mnist/create_mnist.sh
./examples/mnist/train_lenet.sh

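The cd ../../ above returns to the Caffe root directory, because recent versions of Caffe expect these scripts to be run from the root. Otherwise you get an error: ./create_mnist.sh: 17: ./create_mnist.sh: build/examples/mnist/convert_mnist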
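To avoid that error regardless of where the shell currently is, the example scripts can be launched through a wrapper that first changes into the Caffe root. This is a sketch; run_from_caffe_root is a hypothetical helper, and it identifies a checkout simply by the presence of examples/mnist/create_mnist.sh.

```shell
# run_from_caffe_root ROOT COMMAND [ARGS...]
# Run COMMAND from inside ROOT, refusing if ROOT does not look like a Caffe checkout.
run_from_caffe_root() {
    root=$1
    shift
    if [ ! -f "$root/examples/mnist/create_mnist.sh" ]; then
        echo "not a Caffe checkout: $root" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$root" && "$@" )
}

# Usage:
#   run_from_caffe_root ~/caffe ./examples/mnist/create_mnist.sh
#   run_from_caffe_root ~/caffe ./examples/mnist/train_lenet.sh
```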

Training process: [screenshot of the training log]

OK, Caffe is now configured.