• TensorFlow fails under crontab and sudo: ImportError libcublas.so.9.0


    I recently had to reinstall Python for unrelated reasons, and it triggered a serious problem: TensorFlow stopped working.

    Take this small script, test.py:

    import tensorflow as tf
    print(tf.__version__)
    

    Running it with sudo gives:

    ubuntu@ubuntu:~/workspace$ sudo python test.py
    Traceback (most recent call last):
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
        from tensorflow.python.pywrap_tensorflow_internal import *
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
        _pywrap_tensorflow_internal = swig_import_helper()
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
        _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
      File "/usr/local/python3/lib/python3.6/imp.py", line 243, in load_module
        return load_dynamic(name, filename, file)
      File "/usr/local/python3/lib/python3.6/imp.py", line 343, in load_dynamic
        return _load(spec)
    ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "test.py", line 1, in <module>
        import tensorflow as tf
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
        from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
        from tensorflow.python import pywrap_tensorflow
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
        raise ImportError(msg)
    ImportError: Traceback (most recent call last):
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
        from tensorflow.python.pywrap_tensorflow_internal import *
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
        _pywrap_tensorflow_internal = swig_import_helper()
      File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
        _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
      File "/usr/local/python3/lib/python3.6/imp.py", line 243, in load_module
        return load_dynamic(name, filename, file)
      File "/usr/local/python3/lib/python3.6/imp.py", line 343, in load_dynamic
        return _load(spec)
    ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
    
    
    Failed to load the native TensorFlow runtime.
    
    See https://www.tensorflow.org/install/errors
    
    for some common reasons and solutions.  Include the entire stack trace
    above this error message when asking for help.
    

    Running the script directly with python works, but it fails under sudo and in crontab scheduled jobs.
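
    To compare the two environments, a small check like the one below can be run once as python and once as sudo python. It is only a sketch: can_load is a helper name I made up, and all it does is ask the dynamic loader (through ctypes) to open the library by name and report what it sees in LD_LIBRARY_PATH.

```python
import ctypes
import os


def can_load(libname):
    """Return (True, None) if the dynamic loader can open libname,
    otherwise (False, error message)."""
    try:
        ctypes.CDLL(libname)
        return True, None
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # LD_LIBRARY_PATH is frequently unset under sudo and cron.
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH"))
    print("libcublas.so.9.0 loadable:", can_load("libcublas.so.9.0"))
```

    If the two runs print different results, the difference is in the environment and not in the TensorFlow installation.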

    Use the find command to locate the library:

    find / -name libcublas.so.9.0
    

    It turns out to be in the directory where I installed CUDA: /usr/local/cuda-9.0/lib64

    After some googling, I found an issue that looks a lot like my situation: https://github.com/tensorflow/tensorflow/issues/15604

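    So the shared library was not being found by the dynamic linker. The fix is to add its directory either through an environment variable or through a linker configuration file. I had already set the environment variable, but it still did not work, most likely because sudo resets the environment by default and cron does not read the shell profile, so the variable never reaches those processes.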

    So I tried the configuration file instead:

    cat /etc/ld.so.conf
    
    include /etc/ld.so.conf.d/*.conf
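
    That include line is why a file dropped into /etc/ld.so.conf.d/ gets picked up. As a rough illustration, the sketch below collects the directories such a configuration names. It is a simplification, not a reimplementation of ldconfig: ld_conf_dirs is a made-up helper, and it assumes one directory per line and handles only the include directive.

```python
import glob
import os


def ld_conf_dirs(conf_path="/etc/ld.so.conf"):
    """Return the library directories listed in conf_path,
    following include lines (simplified parsing)."""
    dirs = []
    with open(conf_path) as fh:
        for raw in fh:
            line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
            if not line:
                continue
            parts = line.split(None, 1)
            if parts[0] == "include" and len(parts) == 2:
                pattern = parts[1]
                # Relative patterns are resolved against the including file.
                if not os.path.isabs(pattern):
                    pattern = os.path.join(os.path.dirname(conf_path), pattern)
                for included in sorted(glob.glob(pattern)):
                    dirs.extend(ld_conf_dirs(included))
            else:
                dirs.append(line)
    return dirs


if __name__ == "__main__":
    if os.path.exists("/etc/ld.so.conf"):
        # True once a file under ld.so.conf.d lists the CUDA directory.
        print("/usr/local/cuda-9.0/lib64" in ld_conf_dirs())
```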
    

    Then create a new configuration file:

    vi /etc/ld.so.conf.d/cuda.conf
    
    Add the following line:
    /usr/local/cuda-9.0/lib64
    

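    Then refresh the linker cache and check that the CUDA libraries show up. Rebuilding the cache needs root (sudo ldconfig); ldconfig -v | grep libcu lists what was found: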

    ubuntu@ubuntu:/usr/local/cuda-9.0/lib64$ ldconfig -v | grep libcu
    /sbin/ldconfig.real: Path `/usr/lib/nvidia-384' given more than once
    /sbin/ldconfig.real: Path `/usr/lib32/nvidia-384' given more than once
    /sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
    /sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
    /sbin/ldconfig.real: /lib/x86_64-linux-gnu/ld-2.23.so is the dynamic linker, ignoring
    
    	libcufft.so.9.0 -> libcufft.so.9.0.176
    	libcuinj64.so.9.0 -> libcuinj64.so.9.0.176
    	libcurand.so.9.0 -> libcurand.so.9.0.176
    	libcufftw.so.9.0 -> libcufftw.so.9.0.176
    	libcudart.so.9.0 -> libcudart.so.9.0.176
    	libcublas.so.9.0 -> libcublas.so.9.0.176
    	libcusparse.so.9.0 -> libcusparse.so.9.0.176
    	libcusolver.so.9.0 -> libcusolver.so.9.0.176
    	libcudnn.so.7 -> libcudnn.so.7.4.1
    	libcups.so.2 -> libcups.so.2
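
    The same check can be scripted. The sketch below parses the output of ldconfig -p (which prints the current cache) and returns the paths recorded for a given library name. cached_paths is a made-up helper, and the line format it expects is the usual "name (arch) => path" layout, which I am assuming here.

```python
import subprocess


def cached_paths(ldconfig_p_output, soname):
    """Return the paths that `ldconfig -p` output lists for soname."""
    paths = []
    for line in ldconfig_p_output.splitlines():
        # Entries look like: "\tlibfoo.so.1 (libc6,x86-64) => /path/libfoo.so.1"
        if "=>" not in line:
            continue
        left, right = line.split("=>", 1)
        fields = left.split()
        if fields and fields[0] == soname:
            paths.append(right.strip())
    return paths


if __name__ == "__main__":
    try:
        out = subprocess.run(
            ["ldconfig", "-p"], stdout=subprocess.PIPE, universal_newlines=True
        ).stdout
    except FileNotFoundError:
        out = ""  # ldconfig is not on PATH (it often lives in /sbin)
    print(cached_paths(out, "libcublas.so.9.0"))
```

    An empty list means the cache does not know about the library yet, so the new configuration file has not taken effect.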
    
    

    After that, sudo python test.py runs without problems.
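
    Since crontab was failing in the same way, it is worth confirming the fix there too. Below is a minimal sketch of a script that could be scheduled from cron with its output redirected to a log file. The name try_import and the idea of logging the result are my own assumptions, not part of the original setup.

```python
import importlib
import sys


def try_import(module_name):
    """Return (True, version) if the module imports, else (False, error text)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    ok, detail = try_import("tensorflow")
    # Print the interpreter too: cron may pick a different python than the shell.
    print("python:", sys.executable)
    print("tensorflow import ok:", ok)
    print("detail:", detail)
```

    If the log still shows the libcublas.so.9.0 error, the cron job is probably running before the cache was rebuilt or with a different interpreter.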

  • Original article: https://www.cnblogs.com/xing901022/p/10211407.html