• [Caffe] The Most Complete Caffe Installation Walkthrough


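    Installing the GPU version of Caffe on Linux

    System environment: Ubuntu 14.04 LTS + NVIDIA Titan X

    1.1 (Optional) Installing the graphics driver (risky)

    To reinstall the driver, first uninstall the existing version:

    sudo apt-get remove --purge 'nvidia-*'
    sudo apt-get remove --purge 'cuda-*'
    

    Then install a relatively stable version:

    sudo apt-get install -y nvidia-352    # nvidia-361 is also fairly stable
    

    Alternatively, install another, newer driver version. Be careful, though: install it through apt-get or yum, not with the official .run file. Otherwise the machine can hang, and reinstalling the system is the only way out.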

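    1.2 Installing CUDA

    Note that the CUDA installation installs a graphics driver on its own, so step 1.1 can be skipped. Run the following command before and after installing CUDA to check whether a driver was installed along with it.

    dpkg -l | grep nvidia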
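The before/after check can be made explicit by saving two snapshots and comparing them. The following is a minimal sketch, not part of the original instructions; `snapshot_nvidia` and `new_packages` are hypothetical helper names, and the snapshot assumes the usual `dpkg -l` layout with the status in column 1 and the package name in column 2.

```shell
# Sketch: find packages that appeared between two snapshots.
# snapshot_nvidia and new_packages are hypothetical helper names.

# Print the names of installed packages whose name contains "nvidia", sorted.
snapshot_nvidia() {
    dpkg -l | awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }' | sort
}

# Print the lines that are in file $2 but not in file $1 (both files must be sorted).
new_packages() {
    comm -13 "$1" "$2"
}
```

Typical use would be to run `snapshot_nvidia > before.txt` before installing CUDA, `snapshot_nvidia > after.txt` afterwards, and then `new_packages before.txt after.txt` to list whatever the installer added.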
    
    When installing CUDA on Ubuntu, you can choose between the Runfile Installer and the Debian Installer. The Runfile Installer is only available as a Local Installer. The Debian Installer is available as both a Local Installer and a Network Installer. The Network Installer allows you to download only the files you need. The Local Installer is a standalone installer with a large initial download. In the case of the Debian installers, the instructions for the Local and Network variants are the same. For more details, refer to the Linux Installation Guide.
    

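    The above is the official guidance: we can either use the local installer (which requires downloading a package of more than 1 GB) or download only a small repository package and let the installation fetch the other required components automatically. The installation commands are: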

    sudo dpkg --install cuda-repo-<distro>-<version>.<architecture>.deb
    sudo apt-get update
    sudo apt-get install cuda
    sudo reboot
    

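    Add the environment variables

    vim ~/.bashrc
    # add the following two lines to ~/.bashrc:
    export PATH=/usr/local/cuda-7.5/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH
    # then reload the file in the current shell:
    source ~/.bashrc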
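If you prefer not to edit the file by hand, the two lines can be appended from the shell. This is a small sketch under the assumption that an exact-line match is enough to detect an earlier run; `append_once` is a hypothetical helper name, not something CUDA provides.

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line is already there.
# (hypothetical helper, shown only as an illustration)
append_once() {
    touch "$1"
    # -x: match the whole line, -F: treat LINE as a fixed string, -q: no output
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}
```

For example, `append_once ~/.bashrc 'export PATH=/usr/local/cuda-7.5/bin:$PATH'` can be run repeatedly without duplicating the line. The single quotes keep `$PATH` unexpanded so that it is evaluated each time the file is sourced.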
    

    Install and run a test sample

    cuda-install-samples-7.5.sh ~
    cd ~/NVIDIA_CUDA-7.5_Samples/5_Simulations/nbody
    make
    ./nbody
    

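    Check that the installation is complete

    dpkg -l | grep cuda
    dpkg -l | grep nvidia
    cd /usr/local/cuda/samples/1_Utilities/deviceQuery
    sudo make    # build deviceQuery first if the binary does not exist yet
    ./deviceQuery
    

    CUDA installation reference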

    1.3 Installing cuDNN

    Step 0: Install cuda from the standard repositories.
    Step 1: Register an nvidia developer account and download cudnn here (about 80 MB)
    Step 2: Check where your cuda installation is. For the installation from the repository it is /usr/lib/... and /usr/include. Otherwise, it will be /usr/local/cuda/.
    You can check it with:

    which nvcc 
    

    or

    ldconfig -p | grep cuda
    

    Step 3: Copy the files:

    cd folder/extracted/contents
    sudo cp -P include/cudnn.h /usr/include
    sudo cp -P lib64/libcudnn* /usr/lib/x86_64-linux-gnu/
    sudo chmod a+r /usr/lib/x86_64-linux-gnu/libcudnn*	
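After copying, it can be useful to confirm which cuDNN version the header describes. The sketch below assumes the header defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` on separate `#define` lines, as the cuDNN 5 header does; `cudnn_version` is a hypothetical helper name.

```shell
# cudnn_version HEADER: print MAJOR.MINOR.PATCHLEVEL read from the #define lines in HEADER.
# (hypothetical helper; assumes the three macros are defined with literal numbers)
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}
```

With the copy destination used above, the call would be `cudnn_version /usr/include/cudnn.h`.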
    

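    cuDNN installation reference

    1.4 Installing Anaconda Python

    Download the installer from the following address, then run it:

    chmod +x Anaconda2-4.1.1-Linux-x86_64.sh
    ./Anaconda2-4.1.1-Linux-x86_64.sh
    

    Then follow the prompts. The default installation path is fine. When the installation finishes, the environment variables are added automatically; they take effect after you close the terminal and open it again.
    Note: Anaconda's package management is convenient, but it sometimes clashes with the system's own package versions in various ways.
    If you want to use the system Python instead, it is best to work in a local Python virtual environment, which is easier to manage. See the reference on installing a local Python virtual environment.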

    2.1 Compiling the Caffe source

    git clone https://github.com/BVLC/caffe && cd caffe
    cp Makefile.config.example Makefile.config 
    vim Makefile.config 
    

    Uncomment or edit the following lines:

    USE_CUDNN := 1
    OPENCV_VERSION := 3
    # Point the Python include and lib paths at your Python installation
    ANACONDA_HOME := $(HOME)/pyenv/pycaffe
    PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
                     $(ANACONDA_HOME)/include/python2.7 \
                     $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
    USE_PKG_CONFIG := 1
    WITH_PYTHON_LAYER := 1
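The simple switches can also be uncommented without opening an editor. This is a sketch only, assuming the options appear in Makefile.config in the commented form `# KEY := value`; `enable_option` is a hypothetical helper name. The multi-line PYTHON_INCLUDE setting still needs to be edited by hand.

```shell
# enable_option FILE KEY: turn a line like "# KEY := 1" into "KEY := 1" in FILE.
# (hypothetical helper; leaves every other line untouched)
enable_option() {
    # Write to a temporary file and move it back, so the sketch does not rely on sed -i.
    sed "s/^#[[:space:]]*\($2[[:space:]]*:=\)/\1/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

For example, `enable_option Makefile.config USE_CUDNN` followed by `enable_option Makefile.config WITH_PYTHON_LAYER`. Reviewing the result with `grep -n ':=' Makefile.config` before building is a sensible precaution.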
    

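    Compile

    make all -j16
    make pycaffe
    

    If these two steps finish without errors, the build itself is done. For Python to find the compiled caffe module, however, you also need to run:

    cd ~/caffe/python    
    # requirements.txt lists commonly used Python packages; delete any entry that cannot be installed
    pip install -r requirements.txt 
    pwd > ~/pyenv/pycaffe/lib/python2.7/site-packages/caffe.pth    
    # this records the caffe path for Python; ~/pyenv/pycaffe is where my Python lives
    

    From any directory, start the Python whose site-packages contains caffe.pth. If import caffe raises no error, the installation succeeded.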
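Before trying the import, you can sanity-check the .pth file itself. Python's site module adds each directory listed in a .pth file to sys.path, so the first line should name a directory that contains the caffe package. The following is a minimal sketch; `check_pth` is a hypothetical helper name.

```shell
# check_pth FILE: succeed if the first line of FILE names a directory that contains caffe/.
# (hypothetical helper; it only inspects paths and does not start Python)
check_pth() {
    [ -r "$1" ] && [ -d "$(head -n 1 "$1")/caffe" ]
}
```

With the paths used above, `check_pth ~/pyenv/pycaffe/lib/python2.7/site-packages/caffe.pth && echo ok` should print ok if the file was written from the right directory.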

    2.2 Common problems during compilation or use

    (1) error: while loading shared libraries: libcudart.so.7.5: cannot open shared object file: No such file or directory
    Solution:

    sudo ldconfig /usr/local/cuda/lib64
    sudo make pycaffe -j16
    

    (2) error: Python.h: No such file or directory
    Solution: set the Python paths in Makefile.config to your own installation, for example:

    ANACONDA_HOME := /home/wangyuanjiang/anaconda2
    PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
    		 $(ANACONDA_HOME)/include/python2.7 \
    		 $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
    

    Reference for solution 2

    (3) error: after import caffe, Python reports "No module named google.protobuf.internal"
    Solution: pip install protobuf
    Reference for solution 3

    (4) error: Caffe with Anaconda2 Python: RuntimeError: module compiled against API version 0xa but this version of numpy is 0x9
    Solution 4:
    Download numpy-1.11.tar.gz, then install it:

    pip install numpy-1.11.tar.gz
    

    (5) error: ImportError: No module named skimage.io
    Solution 5:

    sudo apt-get install python-skimage
    

    Reference for solution 5

    (6) error: import caffe works only from inside the caffe/python directory under the Caffe root
    Solution: add that directory to PYTHONPATH:

    export PYTHONPATH=/usr/local/src/caffe/python:$PYTHONPATH
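If this line ends up in a startup file, every new shell prepends the directory again, and an empty PYTHONPATH leaves a trailing colon. The sketch below shows one way to avoid both; `prepend_path` is a hypothetical helper name, and the Caffe path is the one used above.

```shell
# prepend_path DIR LIST: print the colon-separated LIST with DIR in front,
# unless DIR is already one of its entries. (hypothetical helper)
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;            # already present: keep LIST as it is
        *)        printf '%s\n' "$1${2:+:$2}" ;;   # add a colon only when LIST is non-empty
    esac
}
```

Usage would look like `export PYTHONPATH=$(prepend_path /usr/local/src/caffe/python "$PYTHONPATH")`.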
    

    (7) error: python --version differs from sudo python --version
    This one is painful!
    You can give the absolute path of python when you need to run it with sudo.
    You can also change the permissions of your directory by typing:

    chmod -R 777 destdir
    

    (8) error: cublas_v2.h: No such file or directory
    ./include/caffe/util/device_alternate.hpp:34:23: fatal error: cublas_v2.h: No such file or directory
    Solution:
    At first I thought my OpenBLAS was not installed properly. It turned out that I had written the wrong CUDA path in Makefile.config: uncommenting the CUDA path as /usr/local/cuda would have been enough, but I had set it to /usr instead.

    (9) error: when running make runtest -j16, the shared libraries libhdf5_hl.so.10 and libhdf5.so.10 cannot be found
    Solution:

    cd /usr/lib/x86_64-linux-gnu/
    sudo cp libhdf5_hl.so.7 libhdf5_hl.so.10
    sudo cp libhdf5.so.7 libhdf5.so.10
    

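    (10) error: training MNIST once was very slow, taking about 30 minutes, at a point when make runtest had not yet run to completion. After letting make runtest finish, training became faster. The cause is still unresolved.
    (11) error: caffe/proto/caffe.pb.h: No such file or directory
    Solution:

    cp -rf build/src/caffe/proto include/caffe/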
    


  • Original article: https://www.cnblogs.com/fariver/p/7455433.html