A First Look at Theano
To figure out what Theano actually is, let's run a simple experiment: use it to write an addition function.
>>> import theano.tensor as T
>>> from theano import function
>>> x = T.dscalar('x')
>>> y = T.dscalar('y')
>>> z = x + y
>>> f = function([x, y], z)
Then call it like this:
>>> f(2, 3)
array(5.0)
>>> f(16.3, 12.1)
array(28.4)
In Theano, every symbol must be declared with a type. In the code above, T.dscalar is one such type: the d stands for double (64-bit float), and scalar means a 0-dimensional value.
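The same naming scheme covers other dtypes and dimensionalities. A quick sketch (these constructors all live in theano.tensor; the selection here is illustrative, not exhaustive):

>>> import theano.tensor as T
>>> v = T.dvector('v')   # 1-D array of float64
>>> m = T.dmatrix('m')   # 2-D array of float64
>>> i = T.iscalar('i')   # 0-D int32
>>> g = T.fmatrix('g')   # 2-D array of float32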
dscalar itself is not a class, so x and y are not instances of dscalar; they are instances of TensorVariable. Their type attribute, however, is T.dscalar, so type-wise x and y do belong to dscalar:
>>> type(x)              # so what exactly is the difference between these two?
<class 'theano.tensor.basic.TensorVariable'>
>>> x.type
TensorType(float64, scalar)
>>> T.dscalar
TensorType(float64, scalar)
>>> x.type is T.dscalar
True
The same computation can also be done with the eval method. eval is not as flexible as function, but it works well enough and saves you the trouble of importing function:
>>> import theano.tensor as T
>>> x = T.dscalar('x')
>>> y = T.dscalar('y')
>>> z = x + y
>>> z.eval({x : 16.3, y : 12.1})
array(28.4)
eval is short for evaluate, i.e. it computes the value of the expression.
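One practical detail: the first call to eval is slow, because Theano compiles the expression graph behind the scenes; the compiled function is then cached on the variable, so later calls are fast. A small illustration (the inputs here are made up for demonstration):

>>> z.eval({x : 1.0, y : 2.0})   # first call: compiles the graph (slow)
array(3.0)
>>> z.eval({x : 4.0, y : 5.0})   # later calls reuse the cached compiled function
array(9.0)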
More Examples
Let's try something a bit more complex:
>>> x = T.dmatrix('x')
>>> s = 1 / (1 + T.exp(-x))
>>> logistic = function([x], s)
>>> logistic([[0, 1], [-1, -2]])
array([[ 0.5       ,  0.73105858],
       [ 0.26894142,  0.11920292]])
This function squashes every input into the interval (0, 1), with most outputs landing close to either 0 or 1, which is why it is called the logistic (or sigmoid) function.
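As an aside, the logistic function can also be computed via the identity s(x) = (1 + tanh(x/2)) / 2, so the following sketch should produce the same values (a standard equivalence, not something we rely on later):

>>> s2 = (1 + T.tanh(x / 2)) / 2
>>> logistic2 = function([x], s2)
>>> logistic2([[0, 1], [-1, -2]])
array([[ 0.5       ,  0.73105858],
       [ 0.26894142,  0.11920292]])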
Getting Down to Business
import cPickle, gzip, numpy   # cPickle: the C implementation of Python's pickle (serialization) module
import theano                 # needed below for theano.shared
import theano.tensor as T     # needed below for T.cast

# gzip handles GNU zip (.gz) compressed files
# Load the dataset
f = gzip.open('mnist.pkl.gz', 'rb')
# cPickle.load returns a single pickled object: here, a tuple of three
# distinct datasets, which we unpack directly
train_set, valid_set, test_set = cPickle.load(f)
f.close()
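For this particular mnist.pkl.gz (the one distributed with the deeplearning.net tutorials), each set is a pair (x, y): x is a 2-D numpy array with one flattened 28x28 image per row, and y is a vector of integer labels. A quick sanity check, assuming that layout:

train_x, train_y = train_set
print train_x.shape   # expected: (50000, 784) -- one flattened 28x28 image per row
print train_y.shape   # expected: (50000,)     -- one integer label per image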
def shared_dataset(data_xy):
    """ Function that loads the dataset into shared variables

    The reason we store our dataset in shared variables is to allow
    Theano to copy it into the GPU memory (when code is run on GPU).
    Since copying data into the GPU is slow, copying a minibatch every time
    it is needed (the default behaviour if the data is not in a shared
    variable) would lead to a large decrease in performance.
    """
    data_x, data_y = data_xy
    # numpy.asarray ("as array") converts the input to a numpy ndarray
    shared_x = theano.shared(numpy.asarray(data_x, dtype=theano.config.floatX))
    shared_y = theano.shared(numpy.asarray(data_y, dtype=theano.config.floatX))
    # When storing data on the GPU it has to be stored as floats,
    # therefore we will store the labels as ``floatX`` as well
    # (``shared_y`` does exactly that). But during our computations
    # we need them as ints (we use labels as indices, and if they are
    # floats it doesn't make sense), therefore instead of returning
    # ``shared_y`` we will have to cast it to int. This little hack
    # lets us get around this issue.
    return shared_x, T.cast(shared_y, 'int32')

test_set_x, test_set_y = shared_dataset(test_set)
valid_set_x, valid_set_y = shared_dataset(valid_set)
train_set_x, train_set_y = shared_dataset(train_set)

batch_size = 500    # size of the minibatch

# accessing the third minibatch of the training set
data  = train_set_x[2 * batch_size: 3 * batch_size]
label = train_set_y[2 * batch_size: 3 * batch_size]
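Given these shared variables, a natural next step is to index minibatches by number rather than hard-coding the slice. A minimal sketch (the loop and n_train_batches are illustrative, not from the original text; get_value pulls the raw numpy array back out of the shared variable):

n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size

for index in xrange(n_train_batches):
    # each slice is itself a symbolic Theano variable until evaluated
    data  = train_set_x[index * batch_size: (index + 1) * batch_size]
    label = train_set_y[index * batch_size: (index + 1) * batch_size]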