• Installing the GPU driver and using TensorFlow-GPU on an Azure GPU-series virtual machine running Ubuntu 16.04.


    1. source activate python36
    2. source activate tensorflow-gpu
    3. pip install tensorflow-gpu (pip reported installing this version: tensorflow_gpu-1.12.0-cp36-cp36m-m)

    4. List the GPUs
    from tensorflow.python.client import device_lib

    def get_available_gpus():
        """
        Command to inspect the GPUs: nvidia-smi
        Check which process is using a GPU: ps aux | grep PID
        :return: number of GPUs
        """
        local_device_protos = device_lib.list_local_devices()
        gpus = [x.name for x in local_device_protos if x.device_type == 'GPU']
        # The environment above is Python 3.6, where print is a function.
        print("all: %s" % [x.name for x in local_device_protos])
        print("gpu: %s" % gpus)
        return len(gpus)


    get_available_gpus()

    This fails with ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory, so CUDA 9.0 needs to be installed.
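
The ImportError above comes from the dynamic loader, not from TensorFlow itself, so it can be reproduced without importing TensorFlow. The following is a minimal sketch, assuming a Linux system; the helper name can_load is hypothetical and only asks the loader whether it can open a given shared library.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the library is not on the loader's search path.
        return False


if __name__ == "__main__":
    # libcublas.so.9.0 is the library named in the ImportError.
    name = "libcublas.so.9.0"
    print(name, "loadable" if can_load(name) else "NOT loadable")
```

If this prints "NOT loadable", either the CUDA 9.0 toolkit is missing or its lib64 directory is not yet on the loader's search path.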

    5. Download CUDA 9.0 from https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux.
    The commands are:
    cd /opt
    wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run
    sudo sh cuda_9.0.176_384.81_linux-run

    Install location: /usr/local/cuda-9.0
    Installer output:
    Linux platform:

    /usr/local/cuda-#.#
    Do you accept the previously read EULA?
    accept/decline/quit: accept

    Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?
    (y)es/(n)o/(q)uit: n

    Install the CUDA 9.0 Toolkit?
    (y)es/(n)o/(q)uit: y

    Enter Toolkit Location
    [ default is /usr/local/cuda-9.0 ]:

    Do you want to install a symbolic link at /usr/local/cuda?
    (y)es/(n)o/(q)uit: y

    Install the CUDA 9.0 Samples?
    (y)es/(n)o/(q)uit: y

    Enter CUDA Samples Location
    [ default is /home/adai ]:

    Installing the CUDA Toolkit in /usr/local/cuda-9.0 ...
    Installing the CUDA Samples in /home/adai ...
    Copying samples to /home/adai/NVIDIA_CUDA-9.0_Samples now...
    Finished copying samples.

    ===========
    = Summary =
    ===========

    Driver: Not Selected
    Toolkit: Installed in /usr/local/cuda-9.0
    Samples: Installed in /home/adai

    Please make sure that
    - PATH includes /usr/local/cuda-9.0/bin
    - LD_LIBRARY_PATH includes /usr/local/cuda-9.0/lib64, or, add /usr/local/cuda-9.0/lib64 to /etc/ld.so.conf and run ldconfig as root

    To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.0/bin

    Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.0/doc/pdf for detailed information on setting up CUDA.

    ***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.0 functionality to work.
    To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

    Logfile is /tmp/cuda_install_32689.log
    Signal caught, cleaning up
    (tensorflow-gpu) root@adailearninggpu:/opt#
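
The installer warning states that CUDA 9.0 needs a driver of version 384.00 or later. A small sketch of that comparison follows; the function names parse_version and driver_is_sufficient are hypothetical, and the installed version string has to be supplied by the caller.

```python
def parse_version(text):
    """Turn a dotted version string such as '415.27' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(installed, minimum="384.00"):
    """Compare driver versions numerically, component by component."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # 415.27 is the driver version installed later in this post.
    print(driver_is_sufficient("415.27"))
```

On a machine with a working driver, the installed version can be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader and passed to driver_is_sufficient.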

    6. Run the GPU listing test from step 4 again. This time it reports:
    libnvidia-fatbinaryloader.so.415.27: cannot open shared object file: No such file or directory
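
The version suffix in this message is the useful part: NVIDIA's user-space libraries carry the driver version in their file name, so the message points at driver 415.27. As a rough sketch, the name and version can be pulled out of such a loader message with a regular expression; the helper missing_library is hypothetical and assumes the message has the "cannot open shared object file" form shown here.

```python
import re

# Matches e.g. "libfoo.so.1.2: cannot open shared object file"
_PATTERN = re.compile(
    r"(lib[\w\-+.]*?\.so)((?:\.\d+)*):\s*cannot open shared object file"
)


def missing_library(error_message):
    """Return (library_name, version) from a loader error, or None if no match."""
    match = _PATTERN.search(error_message)
    if match is None:
        return None
    # Drop the leading dot of the version suffix, e.g. ".415.27" -> "415.27".
    return match.group(1), match.group(2).lstrip(".")


if __name__ == "__main__":
    message = (
        "libnvidia-fatbinaryloader.so.415.27: cannot open shared object file: "
        "No such file or directory"
    )
    print(missing_library(message))
```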

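    7. Fix: the missing library is versioned 415.27, so install the matching NVIDIA driver 415.27. Download page: https://www.nvidia.com/content/DriverDownload-March2009/confirmation.php?url=/XFree86/Linux-x86_64/415.27/NVIDIA-Linux-x86_64-415.27.run&lang=us&type=TITAN
    Run: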
    cd /opt/
    wget http://us.download.nvidia.com/XFree86/Linux-x86_64/415.27/NVIDIA-Linux-x86_64-415.27.run
    chmod 777 NVIDIA-Linux-x86_64-415.27.run
    ./NVIDIA-Linux-x86_64-415.27.run
    If the installation fails, first remove the existing NVIDIA driver with sudo apt-get --purge remove nvidia-* and then retry.

    8. Edit /etc/profile and append the following lines at the end, then run: source /etc/profile
    export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}:/usr/lib/nvidia-415/
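
After sourcing /etc/profile, it can help to confirm that the current shell actually picked up both entries. Below is a minimal sketch; the helper missing_entries is hypothetical, and the two paths are the CUDA 9.0 locations used in this post.

```python
import os

CUDA_BIN = "/usr/local/cuda-9.0/bin"
CUDA_LIB = "/usr/local/cuda-9.0/lib64"


def missing_entries(env=None):
    """Return (variable, path) pairs for CUDA paths absent from the environment."""
    env = os.environ if env is None else env

    def entries(var):
        # Split the colon-separated list and ignore trailing slashes.
        return [p.rstrip("/") for p in env.get(var, "").split(":") if p]

    missing = []
    if CUDA_BIN not in entries("PATH"):
        missing.append(("PATH", CUDA_BIN))
    if CUDA_LIB not in entries("LD_LIBRARY_PATH"):
        missing.append(("LD_LIBRARY_PATH", CUDA_LIB))
    return missing


if __name__ == "__main__":
    for variable, path in missing_entries():
        print("%s is missing %s" % (variable, path))
```

An empty result means both variables contain the CUDA 9.0 directories. Shells started before the edit still need source /etc/profile or a fresh login.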

    9. Run the test from step 4 again. On success it lists the CPU and GPU devices.

  • Original post: https://www.cnblogs.com/songxingzhu/p/10401386.html