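If anything here infringes on your rights, contact me and I will take it down right away. I drew on too many sources to contact each author and wait for replies, so thank you to all the authors whose work I learned from.
I recently decided to study deep learning from both the theoretical and the experimental side, so I first set up a Theano environment. Later I came across an answer on Zhihu and surveyed the main deep learning frameworks. I did not read their source code and the survey is not deep; it is only a rough overview to help me pick a framework.
1. How to choose a deep learning framework?
References: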
1. https://github.com/zer0n/deepframeworks/blob/master/README.md
2. http://blog.csdn.net/qiexingqieying/article/details/51734347
3. https://www.zhihu.com/question/41907061
4. http://www.open-open.com/news/view/1069a70
5. http://www.kuqin.com/shuoit/20151124/349098.html
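Blog 2 summarizes the frameworks as follows:
Library | Language | Speed | Flexibility | Documentation | Suitable models | Platforms | Getting started |
Caffe | C++/CUDA | Fast | Average | Comprehensive | CNN | All systems | Medium |
TensorFlow | C++/CUDA/Python | Medium | Good | Medium | CNN/RNN | Linux, OSX | Hard |
MXNet | C++/CUDA | Fast | Good | Comprehensive | CNN | All systems | Medium |
Torch | C/Lua/CUDA | Fast | Good | Comprehensive | CNN/RNN | Linux, OSX | Medium |
Theano | Python/C++/CUDA | Medium | Good | Medium | CNN/RNN | Linux, OSX | Easy |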
(1)Caffe
(2)TensorFlow
(3)MXNet
1.Extract AlexNet or VGG features? Use Caffe
2.Fine tune AlexNet for new classes? Use Caffe
3.Image caption with finetuning?
-> Need pretrained models (Caffe, Torch, Lasagne)
-> Need RNNs (Torch or Lasagne)
-> Use Torch or Lasagne
-> Need pretrained model (Caffe, Torch, Lasagne)
-> Need funny loss function
-> If loss function exists in Caffe: Use Caffe
-> If you want to write your own loss: Use Torch
5.Object Detection?
-> Need pretrained model (Torch, Caffe, Lasagne)
-> Need lots of custom imperative code (NOT Lasagne)
-> Use Caffe + Python or Torch
6.Language modeling with new RNN structure?
-> Need easy recurrent nets (NOT Caffe, Torch)
-> No need for pretrained models
-> Use Theano or TensorFlow
7.Implement BatchNorm?
-> Don’t want to derive gradient? Theano or TensorFlow
-> Implement efficient backward pass? Use Torch
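Finally, JJ gave his own, rather personal, preferences:
The first part introduced the five frameworks, covering some concepts and their basic strengths and weaknesses. My own use case is training on text, which will probably need RNN models, and I am more comfortable with Python than with C++ or Lua. That basically narrows the choice down to Theano and TensorFlow. 杜客's Zhihu answer on choosing between TensorFlow and Theano mostly covers applications in the image domain, but item 6 above, "Language modeling with new RNN structure", also points to these two frameworks.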
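The selection reasoning above can be sketched as a filter over the comparison table from blog 2. This is only an illustration: `FRAMEWORKS` and `candidates` are hypothetical names, and the entries simply restate the language and model columns of that table.

```python
# Language and model columns restated from the comparison table (blog 2).
# FRAMEWORKS and candidates() are illustrative names, not part of any library.
FRAMEWORKS = {
    "Caffe": {"languages": {"c++", "cuda"}, "models": {"CNN"}},
    "TensorFlow": {"languages": {"c++", "cuda", "python"}, "models": {"CNN", "RNN"}},
    "MXNet": {"languages": {"c++", "cuda"}, "models": {"CNN"}},
    "Torch": {"languages": {"c", "lua", "cuda"}, "models": {"CNN", "RNN"}},
    "Theano": {"languages": {"python", "c++", "cuda"}, "models": {"CNN", "RNN"}},
}


def candidates(language, model):
    """Return the frameworks whose table row lists both the language and the model."""
    return sorted(
        name
        for name, row in FRAMEWORKS.items()
        if language in row["languages"] and model in row["models"]
    )


if __name__ == "__main__":
    # Requirements from the text: Python as the working language, RNN support.
    print(candidates("python", "RNN"))
```

With the table as given, asking for Python plus RNN support leaves TensorFlow and Theano, which matches the shortlist reached in the text.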
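Which of the two, then? Jia Yangqing, the author of Caffe, said: "Both are Python-based symbolic computation libraries. TensorFlow clearly has better support, and Google can invest more manpower than universities. Theano's main developers are all at Google now, so engineering resources will likely lean toward TF in the future." Zhihu user Zhang Hao said: "1. It depends on what application you are working on. 2. It depends on which framework offers you the most resources related to your problem. For example, for language-related work with experiments on small datasets, I think Theano is good; there are plenty of related resources online (such as implementations and models from other related papers). For vision work, Theano has far fewer resources than Caffe and Torch, so Caffe and Torch may be better choices. TF is also good. Google has been promoting it heavily lately, and as more people use it, resources will probably keep growing over the next year or two." Since I am only learning for now, I decided to use Theano, but I still spent some time today installing TensorFlow.
2. Installing TensorFlow
Ubuntu 14.04 + CUDA 7.5 + cuDNN v4 + TensorFlow
Following the official tutorial at https://www.tensorflow.org is basically enough. The site sometimes fails to load from my university network, so a copy hosted elsewhere can also serve as a reference.
I chose the pip install method.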
$ sudo apt-get install python-pip python-dev
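I think I had installed these tools before, but I ran the command again to be safe. Then pick the commands that match your setup and run them.
# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7
# Requires CUDA toolkit 7.5 and CuDNN v4. For other versions, see "Install from sources" below.
$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0-cp27-none-linux_x86_64.whl
# Python 2
$ sudo pip install --upgrade $TF_BINARY_URL
This failed with an error. The Common Problems section of the tutorial says: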
...
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed
Solution: Download the wheel manually via curl or wget, and pip install locally.
So I downloaded the wheel with wget and installed it locally.
wget https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0-cp27-none-linux_x86_64.whl
sudo pip install tensorflow-0.9.0-cp27-none-linux_x86_64.whl
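When installing a wheel by hand, it helps to confirm that its file name matches the local interpreter. The sketch below splits the name according to the PEP 427 layout (distribution, version, Python tag, ABI tag, platform tag), assuming no optional build tag is present. `parse_wheel_name` and `matches_interpreter` are hypothetical helper names, and the interpreter check is deliberately rough.

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into its PEP 427 fields (no build tag assumed)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel name layout: %s" % filename)
    keys = ("distribution", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, parts))


def matches_interpreter(python_tag, version_info=sys.version_info):
    """Rough check that a CPython tag such as 'cp27' matches the interpreter."""
    return python_tag == "cp%d%d" % (version_info[0], version_info[1])


if __name__ == "__main__":
    info = parse_wheel_name("tensorflow-0.9.0-cp27-none-linux_x86_64.whl")
    print(info)
    print("matches this interpreter:", matches_interpreter(info["python_tag"]))
```

For the wheel used here the Python tag is `cp27`, so it is meant for CPython 2.7 on 64-bit Linux.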
Next, test TensorFlow.
Open a terminal and type the following:
$ python
...
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print(sess.run(a + b))
42
>>>
That works. Next, find the directory where TensorFlow is installed:
$ python -c 'import os; import inspect; import tensorflow; print(os.path.dirname(inspect.getfile(tensorflow)))'
The result is as follows:
/usr/local/lib/python2.7/dist-packages/tensorflow
Test run:
$ python -m tensorflow.models.image.mnist.convolutional
An error occurred:
lvxia@kde:~$ python -m tensorflow.models.image.mnist.convolutional
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting data/train-images-idx3-ubyte.gz
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/models/image/mnist/convolutional.py", line 316, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/models/image/mnist/convolutional.py", line 128, in main
    train_data = extract_data(train_data_filename, 60000)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/models/image/mnist/convolutional.py", line 75, in extract_data
    buf = bytestream.read(IMAGE_SIZE * IMAGE_SIZE * num_images)
  File "/usr/lib/python2.7/gzip.py", line 261, in read
    self._read(readsize)
  File "/usr/lib/python2.7/gzip.py", line 308, in _read
    self._read_eof()
  File "/usr/lib/python2.7/gzip.py", line 347, in _read_eof
    hex(self.crc)))
IOError: CRC check failed 0xe1d362ba != 0x90dd462eL
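Following the fix described in https://github.com/tensorflow/tensorflow/issues/1319, go to the directory that contains convolutional.py, change the file's permissions, and then change the value of WORK_DIRECTORY from data to /usr/local/lib/python2.7/dist-packages/tensorflow/models/image/mnist/data.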
sudo chmod 777 convolutional.py
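A CRC failure like the one in the traceback means the decompressed data does not match the checksum stored in the archive, which typically points to a damaged or incomplete file. As a rough check before rerunning, the sketch below reads a gzip file to the end and reports whether it decompresses cleanly. `gzip_is_intact` is a hypothetical helper name, and the path in the usage line is the relative one shown in the log, so adjust it to wherever the archive actually lives.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole archive decompresses and passes its CRC check."""
    try:
        with gzip.open(path, "rb") as stream:
            # Read to the end so the gzip module reaches the CRC trailer.
            while stream.read(chunk_size):
                pass
        return True
    except (IOError, OSError, EOFError, zlib.error):
        # Covers a missing file, a truncated stream, and a checksum mismatch.
        return False


if __name__ == "__main__":
    print(gzip_is_intact("data/train-images-idx3-ubyte.gz"))
```

If the check fails, deleting the cached archive so that it is downloaded again is a reasonable next step.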
Run it again:
python -m tensorflow.models.image.mnist.convolutional
Another error appears:
E tensorflow/stream_executor/cuda/cuda_dnn.cc:286] Loaded cudnn library: 5005 but source was compiled against 4007. If using a binary install, upgrade your cudnn library to match. If building from sources, make sure the library loaded matches the version you specified during compile configuration.
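The message shows a cuDNN version mismatch: the library loaded at runtime reports 5005, while the TensorFlow binary was compiled against 4007.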
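The two numbers in the error are easier to read once decoded. The sketch below assumes the encoding used by the `CUDNN_VERSION` macro in cudnn.h for these releases, namely major * 1000 + minor * 100 + patchlevel. `decode_cudnn_version` and `same_major` are hypothetical helper names.

```python
def decode_cudnn_version(value):
    """Split cuDNN's integer version into (major, minor, patchlevel).

    Assumes value = major * 1000 + minor * 100 + patchlevel.
    """
    major, rest = divmod(value, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def same_major(loaded, compiled):
    """Compare only the major version of two encoded cuDNN versions."""
    return decode_cudnn_version(loaded)[0] == decode_cudnn_version(compiled)[0]


if __name__ == "__main__":
    for label, value in (("loaded", 5005), ("compiled against", 4007)):
        print(label, "%d.%d.%d" % decode_cudnn_version(value))
    print("same major version:", same_major(5005, 4007))
```

Under that assumption, 5005 is cuDNN 5.0.5 and 4007 is cuDNN 4.0.7, so the installed library is a major version ahead of what the binary expects.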
The official site also says: "Download cuDNN v4 (v5 is currently a release candidate and is only supported when installing TensorFlow from sources)." So I downloaded cuDNN v4.
tar xvzf cudnn-7.0-linux-x64-v4.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda-7.5/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda-7.5/lib64
sudo chmod a+r /usr/local/cuda-7.5/include/cudnn.h /usr/local/cuda-7.5/lib64/libcudnn*
I no longer remember how I set things up earlier, but my /usr/local folder contains two CUDA directories, cuda and cuda-7.5. Here I put the files under cuda-7.5.
After that, rerunning the test command works without problems.
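With both /usr/local/cuda and /usr/local/cuda-7.5 present, it is worth checking whether they are really the same directory. On many installs the unversioned path is a symbolic link to the versioned one, but that is an assumption about this machine. The sketch below resolves both paths and also looks for cudnn.h under the versioned one. `resolve_cuda_dirs` is a hypothetical helper name, and its default arguments are the two paths mentioned in the text.

```python
import os


def resolve_cuda_dirs(generic="/usr/local/cuda", versioned="/usr/local/cuda-7.5"):
    """Report where both CUDA paths really point and whether they coincide."""
    real_generic = os.path.realpath(generic)
    real_versioned = os.path.realpath(versioned)
    return {
        "generic": real_generic,
        "versioned": real_versioned,
        "same_directory": real_generic == real_versioned,
        "has_cudnn_header": os.path.isfile(
            os.path.join(real_versioned, "include", "cudnn.h")
        ),
    }


if __name__ == "__main__":
    for key, value in sorted(resolve_cuda_dirs().items()):
        print("%s: %s" % (key, value))
```

If the two paths resolve to different directories, the cuDNN files may need to go into whichever one is on the library search path.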
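In the middle of all this, somewhat dazed, I also tried installing from source, following "Install from sources" on the official site. The basic steps are the same as above, and together with a couple of blog posts they were enough. For the few problems I ran into, a Google search basically turned up a solution.
One blog post also explains the directory structure of the TensorFlow source code.
Here are a few small problems and their fixes: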
(1)OSError - Errno 13 Permission denied
chown -R user-id:group-id /path/to/the/directory
(2)AttributeError: type object 'NewBase' has no attribute 'is_abstract'
sudo pip install six --upgrade --target="/Library/Python/2.7/site-packages/"
(3) ./configure is located in the tensorflow directory; the install-from-sources method uses it for configuration.
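For problem (1), it can help to see which paths are actually blocked before changing ownership of a whole tree. The sketch below walks a directory and lists the entries the current user cannot write to. `unwritable_paths` is a hypothetical helper name, and the directory in the usage line is only a placeholder.

```python
import os


def unwritable_paths(directory):
    """List entries under `directory` that the current user cannot write to."""
    blocked = []
    for root, dirs, files in os.walk(directory):
        for name in dirs + files:
            path = os.path.join(root, name)
            if not os.access(path, os.W_OK):
                blocked.append(path)
    return blocked


if __name__ == "__main__":
    # Placeholder: replace "." with the directory named in the error message.
    for path in unwritable_paths("."):
        print(path)
```

An empty result suggests the Errno 13 comes from somewhere else, for example the parent directory itself.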