There are all kinds of guides online for installing Theano on 64-bit Windows 7; I tried many of them and ran into all sorts of problems. Since I had never dealt with MinGW and the like before, the installation was fairly painful, but after repeated attempts I finally succeeded with the procedure below.
The process itself is actually simple. First, the prerequisites:
- Windows 10 (either 32-bit or 64-bit works; just make sure you download the matching installers)
- Visual Studio 2010 (it does not have to be VS2010, I just happened to have it; the VS compiler should only be needed when configuring GPU support)
- Anaconda (download it from the official site; the download link appears after the page loads for a moment. I chose it because it bundles Python together with the required NumPy and SciPy libraries and a number of others, which saves installing them separately. The version does not matter much; if you want Python 3.4, download the corresponding Anaconda3. This tutorial uses Anaconda with Python 2.7; the installation steps are the same either way.)
Installation steps:
Step 1: Uninstall previous versions.
Uninstall any standalone Python installations you have. I had installed Python 2.7 directly when learning Python, so I removed it first, because Anaconda ships its own Python.
Step 2: Install Anaconda.
This part is trivial. My install directory is D:\Anaconda2. One very important point: the installation path must not contain any spaces! Learned that the hard way.
Step 3: Install MinGW.
Other tutorials tell you to add D:\Anaconda2\MinGW\bin;D:\Anaconda2\MinGW\x86_64-w64-mingw32\lib; to the path environment variable, but you will find there is no MinGW directory under D:\Anaconda2 at all. The easiest approach is therefore to install it with a conda command; there is no need to download mingw-setup.exe or anything like it yourself.
How to install:
- Open CMD (the Windows command prompt, not the Python interpreter; conda is a Windows command, so typing it inside Python only produces a syntax error).
- Type conda install mingw libpython and press Enter. A progress bar appears and the install finishes after a short wait. The D:\Anaconda2\MinGW directory now exists (a quick way to confirm this is sketched right after this list).
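If you want to double-check that the conda package really created the directory, here is a minimal sketch (it assumes the D:\Anaconda2 install path used in this tutorial; adjust it to your own):

```python
# Sketch: confirm that `conda install mingw libpython` created the MinGW directory.
# The path below assumes the D:\Anaconda2 install location from this tutorial.
import os

mingw_dir = r'D:\Anaconda2\MinGW'
print(os.path.isdir(mingw_dir))   # expect True after the install
print(os.listdir(mingw_dir))      # should include bin and x86_64-w64-mingw32, among others
```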
Step 4: Configure environment variables.
- Edit the path variable under user variables (create it if it does not exist, though it usually does) and append D:\Anaconda2;D:\Anaconda2\Scripts; to the end. Do not drop the semicolons. My Anaconda install directory is D:\Anaconda2; fill in your own installation path here.
- Still under user variables, create a new variable named pythonpath with the value D:\Anaconda2\Lib\site-packages\theano; . This points at the directory where Theano will be installed. We have not installed it yet, but that is fine; set the variable now and move on.
- Open CMD and note the path shown in the prompt; mine is C:\Users\Administrator>. Go to that directory, create a new text file named .theanorc.txt (note the two dots), edit it, and put in the following:
```
[global]
openmp=False
[blas]
ldflags=
[gcc]
cxxflags=-ID:\Anaconda2\MinGW
```
The path in the cxxflags line is the path of your own Anaconda installation; make absolutely sure it is correct, otherwise MinGW will not be found. (A quick way to verify the file is picked up is sketched below.)
- It is best to restart the computer at this point.
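Once Theano is installed (Step 5), you can verify that .theanorc.txt is actually being read. A minimal sketch, assuming the configuration shown above:

```python
# Sketch: confirm that Theano picked up the settings from .theanorc.txt.
# Run this after Step 5; the expected values assume the config file shown above.
import theano

print(theano.config.openmp)         # expect False, as set in [global]
print(theano.config.blas.ldflags)   # expect an empty string, as set in [blas]
print(theano.config.gcc.cxxflags)   # expect -ID:\Anaconda2\MinGW, as set in [gcc]
```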
Step 5: Install Theano.
There is no need to download a zip archive by hand; installing with a single command is the simplest way.
- Open CMD the same way as for the MinGW install; do not enter Python.
- Type pip install theano and press Enter, and a pleasant little progress bar runs through. The package is small, so it installs quickly.
- In my case the pip command itself was not recognized and failed with:
  Unable to create process using '""
  For now I worked around it with python -m pip install theano instead.
- In CMD, type python to enter the Python interpreter, then type import theano and press Enter; this takes a while.
- Next type theano.test(). A long stream of output follows; as long as there is no error, the installation succeeded. In my case it was still printing after quite a while, so I interrupted it with Ctrl+C. (I actually found that some error messages do not matter: Theano still works, including theano.function(). So if you keep getting errors no matter how you configure things, you can ignore them for now and simply run a small program, for example the convolution test code; a minimal smoke test is sketched right after this list.)
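As a much lighter-weight check than the full test suite, you can compile and run a tiny symbolic function. This is only a sketch, not part of the official tests:

```python
# Minimal smoke test (sketch): build and run a trivial Theano function.
import numpy
import theano
import theano.tensor as T

x = T.dmatrix('x')
y = T.dmatrix('y')
f = theano.function([x, y], x + y)   # compiling this exercises the g++ set up above

a = numpy.ones((2, 2))
print(f(a, a))                       # expect a 2x2 matrix of 2.0
```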
Step 6: Using the GPU
My machine has an AMD graphics card, which CUDA obviously does not support, so using the GPU is not an option here. (The sketch below shows how to confirm which device Theano is actually using.)
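If you are unsure which device Theano ended up on, the following sketch prints the relevant configuration values:

```python
# Sketch: show which device and float precision Theano is configured to use.
import theano

print(theano.config.device)   # 'cpu' here, since CUDA is not available on an AMD card
print(theano.config.floatX)   # 'float64' by default unless overridden in .theanorc.txt
```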
Step 7: The Keras deep learning framework
- Open CMD the same way as for the MinGW install; do not enter Python.
- Type pip install keras and press Enter, and again a pleasant progress bar runs through.
- The pip command was again not recognized, so I used python -m pip install keras instead.
Note: the Anaconda Prompt does recognize pip, so both of the pip installs above can also be run there, with exactly the same result. (A quick import check is sketched below.)
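To confirm that Keras is installed and sitting on top of Theano, a minimal sketch (the exact version string will differ on your machine):

```python
# Sketch: on this setup, importing keras should print "Using Theano backend."
import keras

print(keras.__version__)   # a 1.x release at the time of writing; yours may differ
```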
Step 8: Small examples
1. Theano test
```python
from __future__ import print_function
"""
Created on Tue Aug 16 14:05:45 2016

@author: Administrator
"""

"""
This tutorial introduces logistic regression using Theano and stochastic
gradient descent.

Logistic regression is a probabilistic, linear classifier. It is parametrized
by a weight matrix :math:`W` and a bias vector :math:`b`. Classification is
done by projecting data points onto a set of hyperplanes, the distance to
which is used to determine a class membership probability.

Mathematically, this can be written as:

.. math::
  P(Y=i|x, W,b) &= softmax_i(W x + b) \\
                &= \frac {e^{W_i x + b_i}} {\sum_j e^{W_j x + b_j}}


The output of the model or prediction is then done by taking the argmax of
the vector whose i'th element is P(Y=i|x).

.. math::

  y_{pred} = argmax_i P(Y=i|x,W,b)


This tutorial presents a stochastic gradient descent optimization method
suitable for large datasets.


References:

    - textbooks: "Pattern Recognition and Machine Learning" -
                 Christopher M. Bishop, section 4.3.2

"""


__docformat__ = 'restructedtext en'

import six.moves.cPickle as pickle
import gzip
import os
import sys
import timeit

import numpy

import theano
import theano.tensor as T


class LogisticRegression(object):
    """Multi-class Logistic Regression Class

    The logistic regression is fully described by a weight matrix :math:`W`
    and bias vector :math:`b`. Classification is done by projecting data
    points onto a set of hyperplanes, the distance to which is used to
    determine a class membership probability.
    """

    def __init__(self, input, n_in, n_out):
        """ Initialize the parameters of the logistic regression

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
                      architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
                     which the datapoints lie

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
                      which the labels lie

        """
        # start-snippet-1
        # initialize with 0 the weights W as a matrix of shape (n_in, n_out)
        self.W = theano.shared(
            value=numpy.zeros(
                (n_in, n_out),
                dtype=theano.config.floatX
            ),
            name='W',
            borrow=True
        )
        # initialize the biases b as a vector of n_out 0s
        self.b = theano.shared(
            value=numpy.zeros(
                (n_out,),
                dtype=theano.config.floatX
            ),
            name='b',
            borrow=True
        )

        # symbolic expression for computing the matrix of class-membership
        # probabilities
        # Where:
        # W is a matrix where column-k represent the separation hyperplane for
        # class-k
        # x is a matrix where row-j represents input training sample-j
        # b is a vector where element-k represent the free parameter of
        # hyperplane-k
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)

        # symbolic description of how to compute prediction as class whose
        # probability is maximal
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)
        # end-snippet-1

        # parameters of the model
        self.params = [self.W, self.b]

        # keep track of model input
        self.input = input

    def negative_log_likelihood(self, y):
        """Return the mean of the negative log-likelihood of the prediction
        of this model under a given target distribution.

        .. math::

            \frac{1}{|\mathcal{D}|} \mathcal{L} (\theta=\{W,b\}, \mathcal{D}) =
            \frac{1}{|\mathcal{D}|} \sum_{i=0}^{|\mathcal{D}|}
                \log(P(Y=y^{(i)}|x^{(i)}, W,b)) \\
            \ell (\theta=\{W,b\}, \mathcal{D})

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label

        Note: we use the mean instead of the sum so that
              the learning rate is less dependent on the batch size
        """
        # start-snippet-2
        # y.shape[0] is (symbolically) the number of rows in y, i.e.,
        # number of examples (call it n) in the minibatch
        # T.arange(y.shape[0]) is a symbolic vector which will contain
        # [0,1,2,... n-1] T.log(self.p_y_given_x) is a matrix of
        # Log-Probabilities (call it LP) with one row per example and
        # one column per class LP[T.arange(y.shape[0]),y] is a vector
        # v containing [LP[0,y[0]], LP[1,y[1]], LP[2,y[2]], ...,
        # LP[n-1,y[n-1]]] and T.mean(LP[T.arange(y.shape[0]),y]) is
        # the mean (across minibatch examples) of the elements in v,
        # i.e., the mean log-likelihood across the minibatch.
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
        # end-snippet-2

    def errors(self, y):
        """Return a float representing the number of errors in the minibatch
        over the total number of examples of the minibatch ; zero one
        loss over the size of the minibatch

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label
        """

        # check if y has same dimension of y_pred
        if y.ndim != self.y_pred.ndim:
            raise TypeError(
                'y should have the same shape as self.y_pred',
                ('y', y.type, 'y_pred', self.y_pred.type)
            )
        # check if y is of the correct datatype
        if y.dtype.startswith('int'):
            # the T.neq operator returns a vector of 0s and 1s, where 1
            # represents a mistake in prediction
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()


def load_data(dataset):
    ''' Loads the dataset

    :type dataset: string
    :param dataset: the path to the dataset (here MNIST)
    '''

    #############
    # LOAD DATA #
    #############

    # Download the MNIST dataset if it is not present
    data_dir, data_file = os.path.split(dataset)
    if data_dir == "" and not os.path.isfile(dataset):
        # Check if dataset is in the data directory.
        new_path = os.path.join(
            os.path.split(__file__)[0],
            "..",
            "data",
            dataset
        )
        if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
            dataset = new_path

    if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
        from six.moves import urllib
        origin = (
            'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
        )
        print('Downloading data from %s' % origin)
        urllib.request.urlretrieve(origin, dataset)

    print('... loading data')

    # Load the dataset
    with gzip.open(dataset, 'rb') as f:
        try:
            train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
        except:
            train_set, valid_set, test_set = pickle.load(f)
    # train_set, valid_set, test_set format: tuple(input, target)
    # input is a numpy.ndarray of 2 dimensions (a matrix)
    # where each row corresponds to an example. target is a
    # numpy.ndarray of 1 dimension (vector) that has the same length as
    # the number of rows in the input. It should give the target
    # to the example with the same index in the input.

    def shared_dataset(data_xy, borrow=True):
        """ Function that loads the dataset into shared variables

        The reason we store our dataset in shared variables is to allow
        Theano to copy it into the GPU memory (when code is run on GPU).
        Since copying data into the GPU is slow, copying a minibatch everytime
        is needed (the default behaviour if the data is not in a shared
        variable) would lead to a large decrease in performance.
        """
        data_x, data_y = data_xy
        shared_x = theano.shared(numpy.asarray(data_x,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        shared_y = theano.shared(numpy.asarray(data_y,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        # When storing data on the GPU it has to be stored as floats
        # therefore we will store the labels as ``floatX`` as well
        # (``shared_y`` does exactly that). But during our computations
        # we need them as ints (we use labels as index, and if they are
        # floats it doesn't make sense) therefore instead of returning
        # ``shared_y`` we will have to cast it to int. This little hack
        # lets ous get around this issue
        return shared_x, T.cast(shared_y, 'int32')

    test_set_x, test_set_y = shared_dataset(test_set)
    valid_set_x, valid_set_y = shared_dataset(valid_set)
    train_set_x, train_set_y = shared_dataset(train_set)

    rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
            (test_set_x, test_set_y)]
    return rval


def sgd_optimization_mnist(learning_rate=0.13, n_epochs=1000,
                           dataset='mnist.pkl.gz',
                           batch_size=600):
    """
    Demonstrate stochastic gradient descent optimization of a log-linear
    model

    This is demonstrated on MNIST.

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: the path of the MNIST dataset file from
                 http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz

    """
    datasets = load_data(dataset)

    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # compute number of minibatches for training, validation and testing
    n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] // batch_size
    n_test_batches = test_set_x.get_value(borrow=True).shape[0] // batch_size

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print('... building the model')

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch

    # generate symbolic variables for input (x and y represent a
    # minibatch)
    x = T.matrix('x')  # data, presented as rasterized images
    y = T.ivector('y')  # labels, presented as 1D vector of [int] labels

    # construct the logistic regression class
    # Each MNIST image has size 28*28
    classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)

    # the cost we minimize during training is the negative log likelihood of
    # the model in symbolic format
    cost = classifier.negative_log_likelihood(y)

    # compiling a Theano function that computes the mistakes that are made by
    # the model on a minibatch
    test_model = theano.function(
        inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    validate_model = theano.function(
        inputs=[index],
        outputs=classifier.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    # compute the gradient of cost with respect to theta = (W,b)
    g_W = T.grad(cost=cost, wrt=classifier.W)
    g_b = T.grad(cost=cost, wrt=classifier.b)

    # start-snippet-3
    # specify how to update the parameters of the model as a list of
    # (variable, update expression) pairs.
    updates = [(classifier.W, classifier.W - learning_rate * g_W),
               (classifier.b, classifier.b - learning_rate * g_b)]

    # compiling a Theano function `train_model` that returns the cost, but in
    # the same time updates the parameter of the model based on the rules
    # defined in `updates`
    train_model = theano.function(
        inputs=[index],
        outputs=cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
    # end-snippet-3

    ###############
    # TRAIN MODEL #
    ###############
    print('... training the model')
    # early-stopping parameters
    patience = 5000  # look as this many examples regardless
    patience_increase = 2  # wait this much longer when a new best is
                           # found
    improvement_threshold = 0.995  # a relative improvement of this much is
                                   # considered significant
    validation_frequency = min(n_train_batches, patience // 2)
                                  # go through this many
                                  # minibatche before checking the network
                                  # on the validation set; in this case we
                                  # check every epoch

    best_validation_loss = numpy.inf
    test_score = 0.
    start_time = timeit.default_timer()

    done_looping = False
    epoch = 0
    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in range(n_train_batches):

            minibatch_avg_cost = train_model(minibatch_index)
            # iteration number
            iter = (epoch - 1) * n_train_batches + minibatch_index

            if (iter + 1) % validation_frequency == 0:
                # compute zero-one loss on validation set
                validation_losses = [validate_model(i)
                                     for i in range(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)

                print(
                    'epoch %i, minibatch %i/%i, validation error %f %%' %
                    (
                        epoch,
                        minibatch_index + 1,
                        n_train_batches,
                        this_validation_loss * 100.
                    )
                )

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:
                    # improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss *  \
                       improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    best_validation_loss = this_validation_loss
                    # test it on the test set

                    test_losses = [test_model(i)
                                   for i in range(n_test_batches)]
                    test_score = numpy.mean(test_losses)

                    print(
                        (
                            '     epoch %i, minibatch %i/%i, test error of'
                            ' best model %f %%'
                        ) %
                        (
                            epoch,
                            minibatch_index + 1,
                            n_train_batches,
                            test_score * 100.
                        )
                    )

                    # save the best model
                    with open('best_model.pkl', 'wb') as f:
                        pickle.dump(classifier, f)

            if patience <= iter:
                done_looping = True
                break

    end_time = timeit.default_timer()
    print(
        (
            'Optimization complete with best validation score of %f %%,'
            'with test performance %f %%'
        )
        % (best_validation_loss * 100., test_score * 100.)
    )
    print('The code run for %d epochs, with %f epochs/sec' % (
        epoch, 1. * epoch / (end_time - start_time)))
    print(('The code for file ' +
           os.path.split(__file__)[1] +
           ' ran for %.1fs' % ((end_time - start_time))), file=sys.stderr)


def predict():
    """
    An example of how to load a trained model and use it
    to predict labels.
    """

    # load the saved model
    classifier = pickle.load(open('best_model.pkl'))

    # compile a predictor function
    predict_model = theano.function(
        inputs=[classifier.input],
        outputs=classifier.y_pred)

    # We can test it on some examples from test test
    dataset = 'mnist.pkl.gz'
    datasets = load_data(dataset)
    test_set_x, test_set_y = datasets[2]
    test_set_x = test_set_x.get_value()

    predicted_values = predict_model(test_set_x[:10])
    print("Predicted values for the first 10 examples in test set:")
    print(predicted_values)


if __name__ == '__main__':
    sgd_optimization_mnist()
```
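The listing above only calls sgd_optimization_mnist() when run as a script. If you also want to exercise the predict() helper after training, one possible driver is sketched below; it assumes the listing has been saved as logistic_sgd.py in the current folder (a file name of my choosing, not fixed by the original post):

```python
# Sketch: train briefly, then reload best_model.pkl and predict on the test set.
from logistic_sgd import sgd_optimization_mnist, predict

sgd_optimization_mnist(n_epochs=5)   # a few epochs are enough for a quick smoke test
predict()                            # prints predictions for the first 10 test images
```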
2. Keras test
```python
'''Trains a simple convnet on the MNIST dataset.
Gets to 99.25% test accuracy after 12 epochs
(there is still a lot of margin for parameter tuning).
16 seconds per epoch on a GRID K520 GPU.
'''

from __future__ import print_function
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils

batch_size = 128
nb_classes = 10
nb_epoch = 12

# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()

model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                        border_mode='valid',
                        input_shape=(1, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
```
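Note that this example reshapes the input to (1, img_rows, img_cols), i.e. channels first, which matches the Theano image dimension ordering, and uses Keras 1.x names such as Convolution2D, border_mode and nb_epoch (renamed in later Keras releases). A small sketch to confirm the backend and ordering before running it (these helpers exist in Keras 1.x; they differ in newer versions):

```python
# Sketch: check the backend and image dimension ordering the example above assumes.
from keras import backend as K

print(K.backend())              # expect 'theano' on this setup
print(K.image_dim_ordering())   # 'th' means (channels, rows, cols), as used above
```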
References:
1. 小白Windows7/10 64Bit安装Theano并实现GPU加速(没有MinGw等,详细步骤)
2. https://bitbucket.org/pypa/distlib/issues/47/exe-launcher-fails-if-there-is-a-space-in
3. http://stackoverflow.com/questions/24627525/fatal-error-in-launcher-unable-to-create-process-using-c-program-files-x86/26428562#26428562
4. Theano installation tutorial
5. Installation of Theano on Windows
6. http://stackoverflow.com/questions/33687103/how-to-install-theano-on-anaconda-python-2-7-x64-on-windows?noredirect=1&lq=1
7. Keras official tutorial
8. Theano official tutorial