• Setting up SSH (Single Stage Headless) on a server


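    Setting up SSH on a server

    Foreword: these are the steps that worked in my own server environment and are for reference only. Adapt them to your own system.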

    GitHub source: https://github.com/mahyarnajibi/SSH

    Reference write-ups: https://blog.csdn.net/qq_14845119/article/details/79105360

    https://blog.csdn.net/zziahgf/article/details/72900948

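    Preliminaries: install CUDA, cuDNN, and NCCL

    CUDA and cuDNN were already installed on the server (I used CUDA 9.0 and cuDNN 7.0), so I only had to install NCCL.

    Fetch it from GitHub and install: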

    git clone https://github.com/NVIDIA/nccl.git
    cd nccl 
    make clean && make PREFIX=$NCCL_ROOT_DIR install
    

    $NCCL_ROOT_DIR is the install prefix you choose. For example, my prefix is /home/lzm/data/nccl/install, so the command becomes:

    make clean && make PREFIX=/home/lzm/data/nccl/install install
    

    Wait for the NCCL installation to finish.
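
    Before moving on, it can help to confirm that the install prefix actually contains the header and library that the Caffe build needs later. The helper below is only a sketch: the function name check_nccl_install is hypothetical, and it assumes the usual layout of include/nccl.h and lib/libnccl.so* under the prefix.

```shell
# Hypothetical helper: report whether an NCCL install prefix looks complete.
# Usage: check_nccl_install /path/to/nccl/install
check_nccl_install() {
    local prefix="$1"
    # The Caffe build needs the header at compile time ...
    if [ ! -f "$prefix/include/nccl.h" ]; then
        echo "missing: $prefix/include/nccl.h"
        return 1
    fi
    # ... and the shared library at link time and run time.
    if ! ls "$prefix"/lib/libnccl.so* >/dev/null 2>&1; then
        echo "missing: $prefix/lib/libnccl.so*"
        return 1
    fi
    echo "ok: NCCL found under $prefix"
}
```

    With my prefix this would be called as check_nccl_install /home/lzm/data/nccl/install.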

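    Installing caffe-ssh

    1. Do everything inside a Python virtual environment created with conda, for example:

    # caffetest is the environment name; versions other than Python 2.7 seemed to cause errors
    conda create -n caffetest python=2.7 anaconda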
    conda activate caffetest
    

    2. Fetch the source from GitHub:

    git clone --recursive https://github.com/mahyarnajibi/SSH.git
    

    3. Enter the SSH directory and install the required Python modules:

    cd SSH 
    pip install -r requirements.txt
    

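    4. Create a file named env that holds temporary environment variables

    (1) Put the conda activation and the NCCL paths into the env file (replace the NCCL path with your own install prefix):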

    conda activate caffetest
    export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/home/lzm/data/caffe/caffe1.0_nccl/nccl/install/include
    export C_INCLUDE_PATH=$C_INCLUDE_PATH:/home/lzm/data/caffe/caffe1.0_nccl/nccl/install/include
    export LIBRARY_PATH=$LIBRARY_PATH:/home/lzm/data/caffe/caffe1.0_nccl/nccl/install/lib
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/lzm/data/caffe/caffe1.0_nccl/nccl/install/lib
    

    (2) Load the environment variables:

    source ./env
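
    Since the four export lines differ only in the NCCL prefix, the env file can also be generated. This is a sketch under assumptions: write_env_file is a hypothetical name, and the default output path ./env and environment name caffetest simply mirror the steps above.

```shell
# Hypothetical helper: write the env file for a given NCCL install prefix.
# Usage: write_env_file /path/to/nccl/install [output-file] [conda-env-name]
write_env_file() {
    local nccl_root="$1"
    local out="${2:-./env}"
    local conda_env="${3:-caffetest}"
    {
        printf 'conda activate %s\n' "$conda_env"
        # Header search paths for the C++ and C compilers.
        printf 'export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:%s/include\n' "$nccl_root"
        printf 'export C_INCLUDE_PATH=$C_INCLUDE_PATH:%s/include\n' "$nccl_root"
        # Library search paths for the linker and for the loader at run time.
        printf 'export LIBRARY_PATH=$LIBRARY_PATH:%s/lib\n' "$nccl_root"
        printf 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:%s/lib\n' "$nccl_root"
    } > "$out"
}
```

    The single-quoted format strings keep $CPLUS_INCLUDE_PATH and the other variables unexpanded in the file, so they are resolved only when the file is sourced.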
    

    5. Create the build configuration by copying Makefile.config.example to Makefile.config:

    cd caffe-ssh
    cp Makefile.config.example Makefile.config
    

    Edit Makefile.config

    (1) Point CUDA_DIR at your own CUDA directory:

    CUDA_DIR := /usr/local/cuda
    # change to
    CUDA_DIR := /usr/local/nvidia/cuda/9.0
    

    (2) Uncomment OPENCV_VERSION:

    #OPENCV_VERSION := 3
    # change to
    OPENCV_VERSION := 3
    

    (3) Add the HDF5 include and library paths:

    INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
    LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
    # change to
    INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
    LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/
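
    The three edits above can also be applied with sed rather than by hand. This is a rough sketch, not the author's method: patch_makefile_config is a hypothetical name, the HDF5 paths are the ones from this post, and it assumes the CUDA path contains no | or & characters.

```shell
# Hypothetical helper: apply the three Makefile.config edits described above.
# Usage: patch_makefile_config Makefile.config /usr/local/nvidia/cuda/9.0
patch_makefile_config() {
    local cfg="$1"
    local cuda_dir="$2"
    # Write to a temporary file first so the original survives a sed failure.
    sed \
        -e "s|^CUDA_DIR := .*|CUDA_DIR := ${cuda_dir}|" \
        -e 's|^# *OPENCV_VERSION := 3|OPENCV_VERSION := 3|' \
        -e 's|^\(INCLUDE_DIRS := \$(PYTHON_INCLUDE) /usr/local/include\)$|\1 /usr/include/hdf5/serial/|' \
        -e 's|^\(LIBRARY_DIRS := \$(PYTHON_LIB) /usr/local/lib /usr/lib\)$|\1 /usr/lib/x86_64-linux-gnu/hdf5/serial/|' \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}
```

    The INCLUDE_DIRS and LIBRARY_DIRS patterns are anchored at the end of the line, so running the helper a second time leaves an already patched file unchanged.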
    

    6. Install the missing packages

    conda install -c conda-forge readline=6.2
    conda install libgcc
    

    7. Compile

    make all -j32
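
    The -j32 value suits the server I used. On other machines the job count can be taken from the number of available cores; the sketch below assumes either nproc (GNU coreutils) or getconf is available, and build_jobs is a hypothetical name.

```shell
# Hypothetical helper: print the number of CPU cores to use for parallel make.
build_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        # Fallback for systems without nproc.
        getconf _NPROCESSORS_ONLN
    fi
}

# Example (run inside caffe-ssh):
#   make all -j"$(build_jobs)"
```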
    

    8. Build pycaffe to generate the Python interface

    make pycaffe
    

    9. Build in lib, which runs setup.py

    cd ../lib/
    make
    

    10. Download the models with the scripts in scripts

    cd ..
    bash scripts/download_ssh_model.sh
    bash scripts/download_imgnet_model.sh
    

    11. Run the demo

    python demo.py
    

    Result: [demo output image]

    Possible problems

    (1)

    Problem:

    Unsupported gpu architecture 'compute_20'

    Solution:

    https://askubuntu.com/questions/960238/nvcc-fatal-unsupported-gpu-architecture-compute-20

    That is, remove the two compute_20 lines from CUDA_ARCH in Makefile.config (CUDA 9 no longer supports that architecture). I changed:

    CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
            -gencode arch=compute_20,code=sm_21 \
            -gencode arch=compute_30,code=sm_30 \
            -gencode arch=compute_35,code=sm_35 \
            -gencode arch=compute_50,code=sm_50 \
            -gencode arch=compute_50,code=compute_50
    # change to
    CUDA_ARCH := -gencode arch=compute_50,code=sm_50 \
            -gencode arch=compute_52,code=sm_52 \
            -gencode arch=compute_60,code=sm_60 \
            -gencode arch=compute_62,code=sm_62 \
            -gencode arch=compute_61,code=compute_61
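
    Which architectures to list depends on your GPUs and CUDA toolkit, so the block may need to be regenerated for another machine. The helper below is a sketch for doing so: gencode_block is a hypothetical name, and the compute capabilities passed to it are assumed to be ones your nvcc supports. It emits one sm_XX entry per capability plus a final PTX entry for the last one, which is not exactly the list I used above.

```shell
# Hypothetical helper: print a CUDA_ARCH block for the given compute capabilities.
# Usage: gencode_block 50 52 60 61
gencode_block() {
    local first=1
    local cc
    local last
    [ "$#" -gt 0 ] || return 1
    for cc in "$@"; do
        if [ "$first" -eq 1 ]; then
            printf 'CUDA_ARCH := -gencode arch=compute_%s,code=sm_%s \\\n' "$cc" "$cc"
            first=0
        else
            printf '\t\t-gencode arch=compute_%s,code=sm_%s \\\n' "$cc" "$cc"
        fi
        last="$cc"
    done
    # The final entry embeds PTX for the last capability listed,
    # so GPUs newer than that can still JIT-compile the kernels.
    printf '\t\t-gencode arch=compute_%s,code=compute_%s\n' "$last" "$last"
}
```

    Every line except the last ends with a backslash, which Makefile.config needs for the assignment to continue across lines.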
    

    (2)

    Problem:

    awk: symbol lookup error: /home/lzm/.conda/envs/lzm2/lib/libreadline.so.6: undefined symbol: PC

    Solution:

    https://github.com/conda-forge/rpy2-feedstock/issues/1

    https://github.com/bioconda/bioconda-recipes/issues/5350

    That is, run:

    conda install -c conda-forge readline=6.2
    

    (3)

    Problem:

    ./include/caffe/util/hdf5.hpp:6:18: fatal error: hdf5.h: no such file or directory

    Solution:

    https://github.com/BVLC/caffe/issues/2690

    https://github.com/NVIDIA/DIGITS/issues/156

    That is, change these two lines in Makefile.config:

    INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
    LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
    # change to
    INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/
    LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/
    

    (4)

    Problem:

    ./include/caffe/util/nccl.hpp:5:18: fatal error: nccl.h: No such file or directory

    Solution:

    Create a file named env

    and put the paths of the NCCL install already on the server into it:

    export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/home/lzm/data/caffe/caffe1.0_nccl/nccl/install/include
    export C_INCLUDE_PATH=$C_INCLUDE_PATH:/home/lzm/data/caffe/caffe1.0_nccl/nccl/install/include
    export LIBRARY_PATH=$LIBRARY_PATH:/home/lzm/data/caffe/caffe1.0_nccl/nccl/install/lib
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/lzm/data/caffe/caffe1.0_nccl/nccl/install/lib
    

    Source it every time before use:

    source ./env
    

    (5)

    Problem:

    .build_release/lib/libcaffe.so: undefined reference to `cv::imdecode

    Solution: https://github.com/BVLC/caffe/issues/4621

    Uncomment OPENCV_VERSION := 3 in Makefile.config.

    (6)

    Problem:

    /caffe/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by caffe-ssh/python/caffe/_caffe.so)

    Solution: https://github.com/BVLC/caffe/issues/4953

    conda install libgcc
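
    To see whether a given libstdc++.so.6 provides the version tag from the error message, the file can be scanned for it. The helper below is a sketch: has_glibcxx is a hypothetical name, and it relies on the assumption that version tags are stored as NUL-terminated strings inside the library.

```shell
# Hypothetical helper: does this libstdc++ contain the given GLIBCXX version tag?
# Usage: has_glibcxx /path/to/libstdc++.so.6 GLIBCXX_3.4.21
has_glibcxx() {
    local lib="$1"
    local tag="$2"
    # Split the binary on NUL bytes so each embedded string becomes its own line,
    # then look for an exact, literal match of the tag.
    tr '\0' '\n' < "$lib" | grep -qxF "$tag"
}
```

    For example, has_glibcxx "$CONDA_PREFIX/lib/libstdc++.so.6" GLIBCXX_3.4.21 should succeed once the conda environment ships a new enough libstdc++ ($CONDA_PREFIX is set by conda activate).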
    

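    PS: I worked out the fixes above through a lot of searching and troubleshooting. Don't be put off by the hassle; make good use of search engines and things will fall into place.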

  • Original post: https://www.cnblogs.com/luzeming/p/10348342.html