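A brief look at installing different Caffe versions for py-faster-rcnn, and how to handle the cuDNN version each one requires
This is the most complete guide I know of to date. Following it should let you install and configure Caffe and Ross B. Girshick's code with little trouble. If you repost it, please credit the source.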
Copyright 飞翔的蜘蛛人
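Note 1: I am a beginner, so corrections to anything inaccurate in this article are welcome.
Note 2: Before reading this article, please make sure you are familiar with:
1) Basic Linux operations
2) Installing the NVIDIA driver and CUDA on Ubuntu; see my other blog post
Installation notes for the NVIDIA TESLA K40 driver and CUDA 7.5 on UBUNTU 14.04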
http://www.cnblogs.com/muchong/p/6093328.html
3) Installing cuDNN and Caffe; see YaoyaoLiu's post
A concise Caffe setup tutorial ( Ubuntu 14.04 / CUDA 7.5 / cuDNN 5.1 )
http://www.cnblogs.com/yaoyaoliu/p/5850993.html
and the official Caffe installation instructions
http://caffe.berkeleyvision.org/installation.html
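I. Caffe installation
If you are comfortable with items 1, 2, and 3 above, installing the latest version of Caffe should go smoothly; if you do run into problems, see the articles recommended above.
A problem those articles may not cover: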
error while loading shared libraries: libcudnn.so.5: cannot open shared object file: No such file or directory
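Cause: the directory holding the libraries of your own Caffe build has not been added to the dynamic linker configuration (/etc/ld.so.conf, or a file under /etc/ld.so.conf.d/).
Fix:
Go to the directory that holds the libraries of your Caffe build and add its path to the linker configuration, here through a new file under /etc/ld.so.conf.d/: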
tju@tju-System-Product-Name:~$ cd tju/caffe/build/lib/
tju@tju-System-Product-Name:~/tju/caffe/build/lib$ pwd
/home/tju/tju/caffe/build/lib
tju@tju-System-Product-Name:~/tju/caffe$ cd /etc/ld.so.conf.d/
tju@tju-System-Product-Name:/etc/ld.so.conf.d$ ls -l
总用量 20
-rw-r--r-- 1 root root 22 11月 24 18:17 cuda.conf
-rw-rw-r-- 1 root root 38 3月 24 2014 fakeroot-x86_64-linux-gnu.conf
lrwxrwxrwx 1 root root 41 11月 24 18:09 i386-linux-gnu_EGL.conf -> /etc/alternatives/i386-linux-gnu_egl_conf
lrwxrwxrwx 1 root root 40 11月 24 18:09 i386-linux-gnu_GL.conf -> /etc/alternatives/i386-linux-gnu_gl_conf
-rw-r--r-- 1 root root 44 8月 10 2009 libc.conf
-rw-r--r-- 1 root root 68 4月 12 2014 x86_64-linux-gnu.conf
lrwxrwxrwx 1 root root 43 11月 24 18:09 x86_64-linux-gnu_EGL.conf -> /etc/alternatives/x86_64-linux-gnu_egl_conf
lrwxrwxrwx 1 root root 42 11月 24 18:09 x86_64-linux-gnu_GL.conf -> /etc/alternatives/x86_64-linux-gnu_gl_conf
-rw-r--r-- 1 root root 56 5月 26 18:28 zz_i386-biarch-compat.conf
tju@tju-System-Product-Name:/etc/ld.so.conf.d$ sudo touch libcudnn.conf
tju@tju-System-Product-Name:/etc/ld.so.conf.d$ sudo gedit libcudnn.conf
Write /home/tju/tju/caffe/build/lib into the file, save it, then refresh the linker cache:
sudo ldconfig
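Whether the fix took effect can be checked without launching Caffe. The sketch below is only an illustration: the helper name can_load is hypothetical and not part of Caffe. It asks the dynamic loader to open a library by its soname through Python's ctypes module.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the shared library `soname`."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True
```

For example, can_load("libcudnn.so.5") should return True once the directory that actually contains that file is known to the loader and sudo ldconfig has been run.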
II. My hard-way (self-inflicted) installation of py-faster-rcnn
https://github.com/rbgirshick/py-faster-rcnn
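A few words up front:
1. Install the prerequisite dependencies from the terminal; avoid graphical tools such as the Synaptic package manager where possible.
2. Following the installation steps in the original README, first install the Python packages cython, python-opencv, and easydict; this avoids several build errors later.
Install commands:
sudo apt-get install cython
sudo apt-get install python-opencv
sudo pip install easydict
3. The path where py-faster-rcnn is stored must not contain Chinese (or other non-ASCII) characters; otherwise the build fails with UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position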
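The prerequisites above can be verified before starting the build. The following is a minimal sketch written for Python 3 (the post itself uses Python 2.7, where a try/except ImportError around each import would be needed instead). The helper name missing_modules is hypothetical; it reports which modules cannot be located for import.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

The import names differ from the package names: cython is imported as Cython and python-opencv as cv2, so a call such as missing_modules(["Cython", "cv2", "easydict"]) should return an empty list once all three are installed.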
Installing py-faster-rcnn
cd <directory where you want to keep the files>
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
cd py-faster-rcnn/lib/
make
Delete all files in caffe-fast-rcnn
Copy all files from a fresh, uncompiled copy of the latest Caffe into caffe-fast-rcnn
cd ../caffe-fast-rcnn/
cd python/
for req in $(cat requirements.txt); do sudo pip install $req; done
cd ..
cp Makefile.config.example Makefile.config
make all -j16
make test -j16
make runtest -j16
Ran without errors
make pycaffe
Ran without errors
Put the downloaded faster_rcnn_models into py-faster-rcnn/data
./tools/demo.py
Then the fun began: one error after another, until I hit one I could not solve
1.ImportError: No module named scipy
Fix: sudo pip install scipy
2.error: library dfftpack has Fortran sources but no Fortran compiler found
Fix: sudo apt-get install gfortran
http://blog.csdn.net/u010551621/article/details/46363853
3.Error parsing text-format caffe.NetParameter: 350:21: Message type "caffe.LayerParameter" has no field named "roi_pooling_param"
ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/tju/tju/py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt
Full error output
/usr/local/lib/python2.7/dist-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1124 23:36:53.414587 9519 _caffe.cpp:122] DEPRECATION WARNING - deprecated use of Python interface
W1124 23:36:53.414649 9519 _caffe.cpp:123] Use this instead (with the named "weights" parameter):
W1124 23:36:53.414660 9519 _caffe.cpp:125] Net('/home/tju/tju/py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt', 1, weights='/home/tju/tju/py-faster-rcnn/data/faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel')
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 350:21: Message type "caffe.LayerParameter" has no field named "roi_pooling_param".
F1124 23:36:53.437803 9519 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: /home/tju/tju/py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_alt_opt/faster_rcnn_test.pt
*** Check failure stack trace: ***
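Fix: Baidu turns up essentially nothing for this error, so I switched to Google. I found no solution in Chinese; one English thread discusses a related problem but does not say how to solve it.
Cause: wrong Caffe version. The roi_pooling_param field is declared in the caffe.proto of the caffe-fast-rcnn fork but not in upstream Caffe, so the upstream Caffe copied in above cannot parse the model's prototxt.
Basic idea, as suggested in that thread: update your build of Caffe and run upgrade_net.cpp (the file actually compiled below is tools/upgrade_net_proto_text.cpp)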
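One quick way to confirm this diagnosis is to look for the field in the caffe.proto that the build was made from. The sketch below is a plain text scan, not a real protobuf parser, and the helper name declares_field is hypothetical. It assumes proto2-style declarations of the form "optional Type name = N;".

```python
import re


def declares_field(proto_text, field_name):
    """Return True if `proto_text` declares a proto2 field named `field_name`."""
    pattern = (
        r"\b(?:optional|repeated|required)\s+[\w.]+\s+"
        + re.escape(field_name)
        + r"\s*="
    )
    return re.search(pattern, proto_text) is not None
```

Reading src/caffe/proto/caffe.proto into a string and calling declares_field(text, "roi_pooling_param") should give False for a tree that produces the error above.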
Problems encountered along the way, and how I dealt with them:
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$ gcc upgrade_net_proto_text.cpp
upgrade_net_proto_text.cpp:10:27: fatal error: caffe/caffe.hpp: 没有那个文件或目录
#include "caffe/caffe.hpp"
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$ sudo nautilus
Move caffe.hpp from py-faster-rcnn/caffe-fast-rcnn/include/caffe to /usr/include/
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe$ cd /usr/include/
tju@tju-System-Product-Name:/usr/include$ ls | grep caffe
caffe.hpp
The directory containing the headers that gcc needs must be added to its include search path: /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$ gcc upgrade_net_proto_text.cpp -I /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/ -o upgrade_net_proto_text
In file included from /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/common.hpp:19:0,
from /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/blob.hpp:8,
from /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/caffe.hpp:7,
from upgrade_net_proto_text.cpp:10:
/home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/util/device_alternate.hpp:34:23: fatal error: cublas_v2.h: 没有那个文件或目录
#include <cublas_v2.h>
^
compilation terminated.
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$
tju@tju-System-Product-Name:~$ cd /usr/local/cuda-7.5/targets/x86_64-linux/include/
tju@tju-System-Product-Name:/usr/local/cuda-7.5/targets/x86_64-linux/include$ ls | grep cublas.h
cublas.h
tju@tju-System-Product-Name:/usr/local/cuda-7.5/targets/x86_64-linux/include$ pwd
/usr/local/cuda-7.5/targets/x86_64-linux/include
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/tools$ gcc upgrade_net_proto_text.cpp -I /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/ -I /usr/local/cuda-7.5/targets/x86_64-linux/include -o upgrade_net_proto_text
In file included from /home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/caffe.hpp:7:0,
from upgrade_net_proto_text.cpp:10:
/home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/include/caffe/blob.hpp:9:34: fatal error: caffe/proto/caffe.pb.h: 没有那个文件或目录
#include "caffe/proto/caffe.pb.h"
^
compilation terminated.
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/src/caffe/proto$ sudo protoc caffe.proto --cpp_out=.
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/src/caffe/proto$ ls -l
总用量 1916
-rw-r--r-- 1 root root 1103165 11月 25 01:50 caffe.pb.cc
-rw-r--r-- 1 root root 794370 11月 25 01:50 caffe.pb.h
-rw-rw-r-- 1 tju tju 57711 11月 24 19:02 caffe.proto
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn/src/caffe/proto$ pwd
/home/tju/tju/py-faster-rcnn/caffe-fast-rcnn/src/caffe/proto
tju@tju-System-Product-Name:~/tju/py-faster-rcnn/caffe-fast-rcnn$ mkdir include/caffe/proto
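Each fatal error in the gcc transcripts above has the same shape: an #include that cannot be resolved against any of the -I directories. The sketch below is a simplified model of that lookup. It ignores the compiler's built-in system directories and the current-directory rule for quoted includes, and the helper name find_header is hypothetical.

```python
import os


def find_header(header, include_dirs):
    """Return the first directory in `include_dirs` that contains `header`, else None."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None
```

Calling it with "caffe/proto/caffe.pb.h" and the two -I directories used above would be expected to return None until the generated caffe.pb.h is placed under include/caffe/proto.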
root@tju-System-Product-Name:/home/tju/桌面# cd py-faster-rcnn/
root@tju-System-Product-Name:/home/tju/桌面/py-faster-rcnn# cd lib/
root@tju-System-Product-Name:/home/tju/桌面/py-faster-rcnn/lib# make
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe6 in position 10: ordinal not in range(128)
building 'utils.cython_bbox' extension
utils/bbox.c:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation.
#error Do not use this file, it is the result of a failed Cython compilation.
^
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
make: *** [all] 错误 1
Cause: the path contains Chinese characters (here 桌面); move the repository to a different location.
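The numbers in the UnicodeDecodeError can be reproduced from the path itself. The sketch below assumes Python 3 and that the path is stored as UTF-8, which is typical on Ubuntu but is an assumption here; the helper name first_non_ascii_byte is hypothetical. It returns the offset and value of the first byte that the ascii codec would reject.

```python
def first_non_ascii_byte(path):
    """Return (offset, byte) of the first non-ASCII byte in the UTF-8 form of `path`, or None."""
    data = path.encode("utf-8")
    for offset, byte in enumerate(data):
        if byte > 0x7F:
            return offset, byte
    return None
```

For /home/tju/桌面/py-faster-rcnn/lib, the prefix /home/tju/ is ten ASCII bytes and the first byte of 桌 in UTF-8 is 0xe6, which matches "byte 0xe6 in position 10" in the error above.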
src/caffe/layer_factory.cpp:
Replace the file
/home/tju/tju/software/caffe/src/caffe/layer_factory.cpp
/usr/local/cuda/include/cudnn.h
Replace the file
src/caffe/layers/cudnn_tanh_layer.cu
Replace the file
src/caffe/layers/cudnn_tanh_layer.cpp:16:45: error: ‘activ_desc_’ was not declared in this scope
cudnn::createActivationDescriptor<Dtype>(&activ_desc_, CUDNN_ACTIVATION_TANH);
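Older official versions of Caffe support cuDNN only up to a specific version and do not support the latest v5; to use v5, the following files have to be modified by hand: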
include/caffe/layers/cudnn_tanh_layer.hpp, src/caffe/layers/cudnn_tanh_layer.cpp, src/caffe/layers/cudnn_tanh_layer.cu
include/caffe/layers/cudnn_sigmoid_layer.hpp, src/caffe/layers/cudnn_sigmoid_layer.cpp, src/caffe/layers/cudnn_sigmoid_layer.cu
include/caffe/layers/cudnn_relu_layer.hpp, src/caffe/layers/cudnn_relu_layer.cpp, src/caffe/layers/cudnn_relu_layer.cu
src/caffe/layers/cudnn_conv_layer.cu
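Delete the original .cu and .cpp files that had been renamed with an old_ prefix
Compiling with gcc again then hit build problems I could not solve...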
Check failed: registry.count(type) == 0 (1 vs. 0) Layer type Convolution already registered.
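Fixing this means changing the code, and the amount of work is too large. I kept at it until 2:30 a.m. and got nowhere: every build produced a new batch of errors, so I gave up on this installation method.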
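The failing check insists that no creator is already registered under a layer type before a new one is added. The sketch below is a simplified Python analogue of that idea, not Caffe's actual C++ code, and the class and method names are hypothetical. It shows why registering the same type twice aborts; one plausible trigger, though not a confirmed diagnosis for this case, is two definitions of the same layer ending up in the build.

```python
class LayerRegistry:
    """Maps a layer type name to the function that creates it."""

    def __init__(self):
        self._creators = {}

    def add_creator(self, layer_type, creator):
        # Mirrors the intent of the failing check: a type may be registered only once.
        if layer_type in self._creators:
            raise ValueError("Layer type %s already registered." % layer_type)
        self._creators[layer_type] = creator

    def create(self, layer_type):
        # Look up the creator for this type and call it.
        return self._creators[layer_type]()
```

Registering "Convolution" a second time raises here, just as the C++ check aborts the process.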
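III. The simple, brute-force way to install py-faster-rcnn
The attempt above leads to this conclusion: py-faster-rcnn supports neither the latest Caffe nor the latest cuDNN. The latest Caffe does support the latest cuDNN, but every py-faster-rcnn error above came from a mismatch between Caffe and cuDNN versions. Someone more expert may be able to fix this properly, but my forced manual upgrade did not succeed, so here is a simple, brute-force, and fast way to install it. Use:
CUDA 7.5
cudnn-7.0-linux-x64-v4.0
the version of Caffe bundled with py-faster-rcnn
Then follow the py-faster-rcnn installation steps from section II, skipping the step that replaces the contents of caffe-fast-rcnn,
and everything runs successfully!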
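Before building, it can help to confirm which cuDNN version is actually installed. The sketch below assumes the header defines the CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL macros, as cuDNN headers of this era do; the helper name parse_cudnn_version is hypothetical.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if a macro is missing."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#\s*define\s+" + macro + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)
```

Reading /usr/local/cuda/include/cudnn.h into a string and passing it in should give a major version of 4 for the combination listed above.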
References:
http://blog.csdn.net/u012208159/article/details/47018095
http://blog.csdn.net/tmylzq187/article/details/51952847?locationNum=8
https://github.com/BVLC/caffe/issues/3947
http://blog.csdn.net/hpp24/article/details/52192682
http://www.oschina.net/question/565065_115133
http://www.cnblogs.com/bishopmoveon/p/4475036.html
http://blog.csdn.net/vbskj/article/details/52120475
http://blog.csdn.net/kkk584520/article/details/51163564