• 3.24-Caffe-opencl build


    Neural networks with an AMD graphics card

    Preface

    References

    Problem with greentea
    Github-caffe-opencl
    TestSharedWeightsUpdate Test Failed
    opencl-caffe github page
    amd opencl-zone
    Setup clBLAS and OpenCL

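    Notes

    Install the proprietary graphics driver

    • Download it from the official website

    Install the AMD APP SDK

    • Download it from the official website
    • Run clinfo to check the result; if AMD APP is not listed, install the following packages
      • sudo apt-get install fglrx fglrx-core fglrx-amdcccle fglrx-dev fglrx-pxpress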
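The clinfo check above can be scripted. The following is a minimal sketch, not part of the original notes: the helper name has_amd_platform is hypothetical, and it assumes the AMD platform identifies itself in the clinfo output with the string "AMD Accelerated Parallel Processing" or "Advanced Micro Devices".

```shell
# Sketch: report whether an AMD OpenCL platform shows up in clinfo-style output.
# Assumption: the platform block contains one of the vendor strings below.
has_amd_platform() {
    # Reads text on stdin; succeeds if any line mentions an AMD vendor string.
    grep -qiE 'AMD Accelerated Parallel Processing|Advanced Micro Devices'
}

# Only runs the check when clinfo is actually installed.
if command -v clinfo >/dev/null 2>&1; then
    if clinfo 2>/dev/null | has_amd_platform; then
        echo "AMD OpenCL platform found"
    else
        echo "AMD OpenCL platform not found; consider installing the fglrx packages"
    fi
fi
```

Because the helper reads standard input, it can also be pointed at a saved copy of the clinfo output.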

    Install CUDA (do not install it!)

    • sudo apt-get install nvidia-cuda-toolkit

    Install boost

    • sudo apt-get install libboost-all-dev

    Install googletest

    • sudo apt-get install libgtest-dev

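    Use the ACML math library (AMD Core Math Library)

    Install gfortran first

    • sudo apt-get install gfortran
    • Check which instruction sets your CPU supports. This machine supports SSE2 and AVX. The library can only be used for acceleration if the CPU supports AVX or FMA4; otherwise it cannot be used
      • cat /proc/cpuinfo | grep flags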
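Reading the long flags line by eye is error-prone, so the AVX/FMA4 check can be narrowed to a yes/no answer per flag. This is a minimal sketch under the assumption of a Linux /proc/cpuinfo layout; the helper name has_cpu_flag is hypothetical.

```shell
# has_cpu_flag FLAG: reads /proc/cpuinfo-style text on stdin and succeeds
# if FLAG appears as a whole word on the first "flags" line.
has_cpu_flag() {
    grep -m1 '^flags' | tr ' \t' '\n\n' | grep -qx "$1"
}

# Report the two flags mentioned above for the current machine.
for flag in avx fma4; do
    if [ -r /proc/cpuinfo ] && has_cpu_flag "$flag" < /proc/cpuinfo; then
        echo "$flag: supported"
    else
        echo "$flag: not reported"
    fi
done
```

Matching whole words matters here: a plain grep for avx would also match avx2.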

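    Install ACML

    • sudo mv ./acml6.1.0 /opt
    • Builds normally link against the shared libraries, which requires setting the LD_LIBRARY_PATH environment variable
      • If you have an SMP machine and want to take best advantage of it, link against the
        gfortran OpenMP version of ACML like this:
      • This machine is not a multiprocessor machine, so I use gfortran64
      • Or is it? A multi-core processor should count as a multiprocessor too. I'll add both to the variable and see.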
    • echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/acml6.1.0/gfortran64/lib' >> ~/.bashrc
    • echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/acml6.1.0/gfortran64_mp/lib' >> ~/.bashrc
    • source ~/.bashrc
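      • The environment variables take effect immediately
    • The example programs are in /opt/acml6.1.0/gfortran64/examples
      • Build them with the GNUmakefile
      • Use the timing programs to plot the results
      • sudo apt-get install gnuplot
      • Use multiple threads when building
      • % make OMP_NUM_THREADS=2
      • Plot the timing graphs when building
      • % make plots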
      • You may need to edit the GNUmakefile to point at the correct location of your
        installed copy of the ACML, or to change compilers and flags.
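Running the echo ... >> ~/.bashrc commands more than once leaves duplicate export lines behind. A minimal sketch of an idempotent alternative follows; append_once is a hypothetical helper, not something shipped with ACML.

```shell
# append_once FILE LINE: add LINE to FILE unless an identical line is already there.
append_once() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Hypothetical usage with the ACML paths from these notes (left commented out
# so that running this sketch does not modify ~/.bashrc):
# append_once ~/.bashrc 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/acml6.1.0/gfortran64/lib'
# append_once ~/.bashrc 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/acml6.1.0/gfortran64_mp/lib'
```

The single quotes keep $LD_LIBRARY_PATH unexpanded, so it is evaluated each time the shell starts rather than when the line is written.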

    Install clBLAS (a software library containing BLAS functions written in OpenCL)

    • git clone
    • cd src && mkdir build && cd build
    • cmake ..
    • sudo make
      • collect2: error: ld returned 1 exit status
      • make[2]: *** [staging/test-correctness] Error 1
      • make[1]: *** [tests/CMakeFiles/test-correctness.dir/all] Error 2
      • make: *** [all] Error 2
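      • This is a bug; download the dev version of clBLAS instead
      • Still not solved... reading the forum threads
      • Check the gcc (4.8.4), CMake (2.8.12.2) and Make (3.8.1) versions
    • Check that each dependency is configured correctly
      • OpenCL configuration
      • No problems
      • boost
      • The environment variable was not set
      • export BOOST_ROOT=/usr/lib/x86_64-linux-gnu
      • gTest
      • Already installed
      • Both clACML and ACML need to be installed, not just clACML
      • Installed long ago
      • clBLAS
      • install cmake-gui
      • Check the cmake output
      • sudo apt-get install liblapack-dev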
    • Success!
    • Installed in /usr/local/lib64
      • export CLBLAS_ROOT=/usr/local/lib64
      • /usr/local/include
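The toolchain version check from the troubleshooting steps can be gathered in one place. This is only a sketch: check_tools is a hypothetical helper, and it assumes each tool prints its version on the first line when given --version, which holds for gcc, cmake and GNU make.

```shell
# check_tools TOOL...: print the first line of each tool's --version output
# and return non-zero if any tool is missing from PATH.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s: %s\n' "$tool" "$("$tool" --version 2>&1 | head -n 1)"
        else
            printf '%s: NOT FOUND\n' "$tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools gcc cmake make || echo "install the missing tools before building clBLAS"
```

The printed versions can then be compared against the ones recorded in these notes.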

    Install Caffe-opencl

    • sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
    • sudo apt-get install --no-install-recommends libboost-all-dev
    • sudo apt-get install libatlas-base-dev
    • sudo apt-get install python-dev
    • sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
    • compile opencl-caffe
      • Edit Makefile.config
    # Override BLAS, use clBLAS instead of ViennaclBLAS.
    USE_CLBLAS := 1
    # Custom clBLAS lib and include directories.
    CLBLAS_INCLUDE := /usr/local/include
    CLBLAS_LIB     := /usr/local/lib64
    ...
    
    # CUDA directory contains bin/ and lib/ directories that we need.
    # CUDA_DIR := /usr/local/cuda
    # On Ubuntu 14.04, if cuda tools are installed via
    # "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
    CUDA_DIR := /usr
    • make
    • make runtest
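      • It failed, and the cause was still the GREENTEA problem, so I decided to reinstall the ViennaCL library (/usr/local/include/)
      • Remember: always run make clean before rebuilding
    • One more test failure remained

      That's ok, it's not actually a fail, the epsilon-distance between "is" and "should" is just a bit bigger than the test setting appreciates.

      • This is only numerical error, so it is not a problem!
    • All done; ready to use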
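Before running make, it can help to confirm that the paths written into Makefile.config actually exist. The sketch below reads a variable back out of the file; get_make_var is a hypothetical helper and only handles simple single-line "NAME := value" or "NAME = value" assignments.

```shell
# get_make_var FILE NAME: print the value assigned to NAME in FILE.
# Commented-out assignments are ignored; the last active assignment wins.
get_make_var() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*:\{0,1\}=[[:space:]]*//p" "$1" | tail -n 1
}

# Check the clBLAS directories when a Makefile.config is present.
if [ -f Makefile.config ]; then
    for var in CLBLAS_INCLUDE CLBLAS_LIB; do
        dir=$(get_make_var Makefile.config "$var")
        if [ -n "$dir" ] && [ -d "$dir" ]; then
            echo "$var=$dir (exists)"
        else
            echo "$var=$dir (missing or unset)"
        fi
    done
fi
```

Run it from the caffe-opencl source directory, where Makefile.config lives.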

    MNIST example

    cd ./caffe-opencl
    data/mnist/get_mnist.sh
    examples/mnist/create_mnist.sh
    examples/mnist/train_lenet.sh
  • Original article: https://www.cnblogs.com/lizhensheng/p/11183667.html