I recently had to reinstall Python, and that triggered a serious problem: TensorFlow stopped working.
Take the following script, test.py:
import tensorflow as tf
print(tf.__version__)
Running it gives:
ubuntu@ubuntu:~/workspace$ sudo python test.py
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/usr/local/python3/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/usr/local/python3/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 1, in <module>
import tensorflow as tf
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/ubuntu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/usr/local/python3/lib/python3.6/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/usr/local/python3/lib/python3.6/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
Running the script directly with python works, but it fails when run with sudo or as a crontab scheduled job.
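A quick way to compare the two situations is to have the process report which library path it actually sees. The following is a minimal sketch, assuming the CUDA directory was exported through LD_LIBRARY_PATH (the post only says "environment variable"); the helper name find_in_ld_library_path is made up for illustration.

```python
import os


def find_in_ld_library_path(lib_name, env=None):
    """Return the LD_LIBRARY_PATH directories that contain lib_name.

    env defaults to the environment of the current process, so the
    result differs depending on how the script was started.
    """
    env = os.environ if env is None else env
    # LD_LIBRARY_PATH is a colon-separated list; skip empty entries.
    dirs = [d for d in env.get("LD_LIBRARY_PATH", "").split(":") if d]
    return [d for d in dirs if os.path.exists(os.path.join(d, lib_name))]


if __name__ == "__main__":
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH"))
    print("libcublas.so.9.0 found in:",
          find_in_ld_library_path("libcublas.so.9.0"))
```

Running this once with python and once with sudo python should show whether the variable survives; sudo typically resets the environment, and cron starts jobs with a minimal one.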
Use find to locate the missing file:
find / -name libcublas.so.9.0
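It turns out the file is in my CUDA install directory: /usr/local/cuda-9.0/lib64
After some googling, I found an issue that looks very similar to mine: https://github.com/tensorflow/tensorflow/issues/15604
The cause is that the dynamic linker cannot find the shared library. The fix is to add its directory either through an environment variable or through a linker config file. I had already set the environment variable, but that still did not help under sudo or cron, most likely because neither of them passes the user's shell environment on to the process.
So I tried the config file instead: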
cat /etc/ld.so.conf
include /etc/ld.so.conf.d/*.conf
Then create a new config file:
vi /etc/ld.so.conf.d/cuda.conf
Add the following line to it:
/usr/local/cuda-9.0/lib64
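Then run ldconfig -v | grep libcu again to check that the CUDA libraries are picked up (rewriting the cache itself requires root, i.e. sudo ldconfig):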
ubuntu@ubuntu:/usr/local/cuda-9.0/lib64$ ldconfig -v | grep libcu
/sbin/ldconfig.real: Path `/usr/lib/nvidia-384' given more than once
/sbin/ldconfig.real: Path `/usr/lib32/nvidia-384' given more than once
/sbin/ldconfig.real: Path `/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: Path `/usr/lib/x86_64-linux-gnu' given more than once
/sbin/ldconfig.real: /lib/x86_64-linux-gnu/ld-2.23.so is the dynamic linker, ignoring
libcufft.so.9.0 -> libcufft.so.9.0.176
libcuinj64.so.9.0 -> libcuinj64.so.9.0.176
libcurand.so.9.0 -> libcurand.so.9.0.176
libcufftw.so.9.0 -> libcufftw.so.9.0.176
libcudart.so.9.0 -> libcudart.so.9.0.176
libcublas.so.9.0 -> libcublas.so.9.0.176
libcusparse.so.9.0 -> libcusparse.so.9.0.176
libcusolver.so.9.0 -> libcusolver.so.9.0.176
libcudnn.so.7 -> libcudnn.so.7.4.1
libcups.so.2 -> libcups.so.2
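The grep above also matches unrelated libraries such as libcups. As a rough sketch, the same check can be scripted against ldconfig -p, which prints the contents of the linker cache. The function names parse_ldconfig_cache and read_ldconfig_cache are hypothetical, and the parser assumes the usual "name (arch) => path" line format.

```python
import subprocess


def parse_ldconfig_cache(text):
    """Map library name -> resolved path from `ldconfig -p` style output."""
    libs = {}
    for line in text.splitlines():
        if "=>" not in line:
            continue  # skip the header line ("N libs found in cache ...")
        left, path = line.split("=>", 1)
        name = left.split("(")[0].strip()
        # Keep the first entry for a name, which is the one listed first
        # in the cache.
        libs.setdefault(name, path.strip())
    return libs


def read_ldconfig_cache():
    """Run `ldconfig -p` and return the parsed cache."""
    out = subprocess.run(["ldconfig", "-p"], stdout=subprocess.PIPE,
                         universal_newlines=True, check=True).stdout
    return parse_ldconfig_cache(out)
```

For example, read_ldconfig_cache().get("libcublas.so.9.0") should return a path under /usr/local/cuda-9.0/lib64 once the cache has been rebuilt, and None otherwise.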
After that, running sudo python test.py again works without errors.
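Since the failure only showed up under sudo and cron, it can be handy to have a small check that runs in those contexts without importing TensorFlow at all. Below is a minimal sketch using ctypes from the standard library; the helper name can_load and the list of libraries to probe are my own choices, taken from the ldconfig output above.

```python
import ctypes


def can_load(lib_name):
    """Return True if the dynamic linker can resolve and load lib_name."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the library (or one of its dependencies) cannot
        # be opened.
        return False


if __name__ == "__main__":
    for name in ("libcublas.so.9.0", "libcudart.so.9.0", "libcudnn.so.7"):
        print(name, "OK" if can_load(name) else "FAILED TO LOAD")
```

Running it with sudo python, or from a crontab entry with the output redirected to a file, shows whether the linker configuration is in effect there too.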