Machine Learning Environment Setup Series, Part 4: Theano

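The main reason I decided to write this environment setup series was the trouble I had configuring Theano. Getting GPU and cuDNN acceleration to work took a great deal of effort, because Theano 1.0.0 changed its configuration significantly, and most of the guides online are too old to solve the problems of the new version.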

1. Install Theano with Anaconda
```
conda install theano
```
2. Environment configuration
```
echo "[global]
device = cuda
floatX = float32" > ~/.theanorc
```
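The official GPU instructions set floatX to float32, on the grounds that float64 runs slower than float32. I have not benchmarked this myself and simply followed the official advice.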

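The device setting is the biggest pitfall. Most guides online say to use device = gpu, which is wrong for newer versions of Theano; the new backend expects device = cuda, as in the file above. This mistake cost me a lot of time.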
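To catch the old device name early, the configuration text can be checked before Theano is ever imported. The following is only a minimal sketch using Python's standard `configparser`; the helper name `check_device` and the sample text are made up for illustration and are not part of Theano.

```python
import configparser


def check_device(theanorc_text):
    """Return a warning if [global] device uses the old 'gpu' naming, else None.

    Hypothetical helper for illustration; it only inspects the text and
    does not import Theano.
    """
    parser = configparser.ConfigParser()
    parser.read_string(theanorc_text)
    # Theano falls back to the CPU when no device is configured.
    device = parser.get("global", "device", fallback="cpu")
    if device.startswith("gpu"):
        # Old names were gpu, gpu0, gpu1, ...; the new ones are cuda, cuda0, ...
        suggestion = "cuda" + device[len("gpu"):]
        return "device = %s is the old backend's name; try device = %s" % (
            device, suggestion)
    return None


# Sample configuration text containing the outdated setting.
sample = "[global]\ndevice = gpu0\nfloatX = float32\n"
print(check_device(sample))
```

To check a real file, read the contents of `~/.theanorc` and pass them in as a string instead of `sample`.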

3. cuDNN acceleration

The following also has to be added to .theanorc
```
[dnn]
enabled = True
include_path = /usr/local/cuda/include
library_path = /usr/local/cuda/lib64
```
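Because the `echo ... >` command in step 2 overwrites the file, it may be more convenient to generate both sections in one go. The sketch below renders the same settings with Python's standard `configparser`. The function name `render_theanorc` and its `cuda_root` parameter are assumptions for illustration; the paths simply mirror the ones shown above and need adjusting if CUDA is installed elsewhere.

```python
import configparser
import io


def render_theanorc(cuda_root="/usr/local/cuda"):
    """Return the text of a .theanorc with the [global] and [dnn] sections.

    Hypothetical helper; cuda_root is assumed to be the CUDA install prefix.
    """
    parser = configparser.ConfigParser()
    # Keep option names as written (floatX), instead of lowercasing them.
    parser.optionxform = str
    parser["global"] = {"device": "cuda", "floatX": "float32"}
    parser["dnn"] = {
        "enabled": "True",
        "include_path": cuda_root + "/include",
        "library_path": cuda_root + "/lib64",
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


# Print the result for review; redirect it to ~/.theanorc once it looks right.
print(render_theanorc())
```

Printing rather than writing directly leaves the existing file untouched until the output has been checked.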
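4. Errors encountered

Running Theano produced an error saying that the header version and the library version did not match. The cause was that the conda environment had cuDNN 7.2.1 installed while the system-wide cuDNN was 7.3.1, and the two conflicted. I tried all sorts of fixes without success; what finally solved it was removing cuDNN from the environment.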
```
# -n takes the environment name; the package to remove comes after it
conda remove -n your_env_name cudnn
```
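A mismatch like this can be confirmed by comparing the version macros in each copy of `cudnn.h`. The sketch below assumes the cuDNN 7 header layout, where `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` are defined in `cudnn.h`. The helper names are made up for illustration, and the sample text stands in for a real header.

```python
import re


def cudnn_header_version(header_text):
    """Extract (major, minor, patch) from the text of a cuDNN 7-style cudnn.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+%s\s+(\d+)" % name,
                          header_text, re.MULTILINE)
        if match is None:
            raise ValueError("%s not found in header text" % name)
        parts.append(int(match.group(1)))
    return tuple(parts)


def as_cudnn_int(version):
    """Combine (major, minor, patch) into the single number cuDNN reports."""
    major, minor, patch = version
    return major * 1000 + minor * 100 + patch


# Stand-in for the contents of a real cudnn.h (here: version 7.3.1).
sample = (
    "#define CUDNN_MAJOR 7\n"
    "#define CUDNN_MINOR 3\n"
    "#define CUDNN_PATCHLEVEL 1\n"
)
version = cudnn_header_version(sample)
print(version, as_cudnn_int(version))  # (7, 3, 1) 7301
```

Running the first function on the header under the configured include_path and on the copy inside the conda environment, then comparing the two tuples, shows whether they disagree. The combined number is the form Theano prints at startup, e.g. 7301 for 7.3.1.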
5. Theano requires cuDNN 5 or later. If a cuDNN newer than 7.0.0 is installed, Theano prints a warning, which can be ignored.

6. The following code tests whether Theano runs on the GPU with cuDNN acceleration
```
from theano import function, config, shared, tensor
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, tensor.Elemwise) and
              ('Gpu' not in type(x.op).__name__)
              for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
```
Run it with
```
python your_script.py
```
Output like the following means the configuration succeeded
```
/home/<username>/anaconda3/envs/<env_name>/lib/python2.7/site-packages/theano/gpuarray/dnn.py:184: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to a version >= v5 and <= v7.
warnings.warn("Your cuDNN version is more recent than "
Using cuDNN version 7301 on context None
Mapped name None to device cuda:<GPU model> (0000:04:00.0)
[GpuElemwise{exp,no_inplace}(<GpuArrayType<None>(float32, vector)>), HostFromGpu(gpuarray)(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.270748 seconds
Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761
1.62323296]
Used the gpu
```

What is written in this post took me about 7 days to work out; I hope it helps.
Original post: https://www.cnblogs.com/jaww/p/9846184.html