• Convolutional Neural Networks


    LeNet-5 is a convolutional neural network for handwritten character recognition. To get familiar with Theano's convolutional network toolkit, I worked through its implementation.

    I. Preparation

    1. Windows 7

    2. Install Enthought Canopy

    3. Install Theano (a quick check of the install follows this list):

    -> easy_install pip

    -> pip install theano

    4. Code and data
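
    To confirm that Theano imports cleanly, a one-line check from the command prompt is enough (theano.__version__ is the package's version attribute):

    -> python -c "import theano; print theano.__version__"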

    II. Model and Code

    1. Model

    1). The input image is 32x32 and the local sliding window (filter) is 5x5. Since the borders are not padded, the window has 28x28 valid positions, so the C1 layer is 28x28. Six different C1 feature maps are used, and the weights are shared within each feature map.

    2). S2 is a subsampling layer that maps each 2x2 neighborhood of C1 to a single value. In the original LeNet-5 this is a weighted average of the four inputs; here max-pooling is used instead, taking the maximum of the four values. After subsampling, S2 is 14x14.

    3). By the same reasoning used for C1, the C3 layer is 10x10, except that C3 now consists of 16 feature maps of size 10x10. If S2 had only one feature map, going from S2 to C3 would be exactly like going from the input to C1. Since S2 has several maps, we simply combine them in a fixed order. The combination rule is as follows:

    For example, every unit of C3's feature map 0 connects to feature maps 0, 1, and 2 of S2, i.e. three 5x5 neighborhoods in total. Within each C3 feature map the weights are shared.

    4). S4 subsamples C3 with the same method as S2, yielding 16 feature maps of size 5x5. (A sketch verifying this shape arithmetic follows below.)
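
    As a quick sanity check of the shape arithmetic in 1)-4), here is a minimal NumPy sketch (illustration only; valid_conv_size and max_pool_2x2 are throwaway helpers, not part of the model code below):

    import numpy

    def valid_conv_size(n, f):
        """Output side length of an n x n input convolved with an f x f
        filter and no border padding ("valid" convolution)."""
        return n - f + 1

    def max_pool_2x2(fmap):
        """Max over non-overlapping 2x2 blocks, as S2 and S4 do here."""
        h, w = fmap.shape
        return fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

    c1 = valid_conv_size(32, 5)                         # 28
    s2 = max_pool_2x2(numpy.zeros((c1, c1))).shape[0]   # 14
    c3 = valid_conv_size(s2, 5)                         # 10
    s4 = max_pool_2x2(numpy.zeros((c3, c3))).shape[0]   # 5
    print c1, s2, c3, s4                                # 28 14 10 5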

    2. ConvPoolLayer

    class ConvPoolLayer(object):
        """ 卷积层 """
        
        def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
            """
            rng
                type: numpy.random.RandomState
                desc: random number generator used to initialize the weights
            input
                type: theano.tensor.dtensor4
                desc: image data
            filter_shape
                type: tuple or list of length 4
                desc: (number of filters, number of input feature maps,
                      filter height, filter width)
            image_shape
                type: tuple or list of length 4
                desc: (batch size, number of input feature maps,
                      image height, image width)
            poolsize
                type: tuple or list of length 2
                desc: downsampling factor (#rows, #cols)
            """
    
            assert image_shape[1] == filter_shape[1]
            self.input = input
    
            # number of inputs to each unit = num input feature maps * filter height * filter width
            fan_in = numpy.prod(filter_shape[1:])
    
            # number of outputs from each unit = (num output feature maps * filter height * filter width) / pooling size
            fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) / numpy.prod(poolsize))
                       
            # initialize the weights with values drawn uniformly from [-W_bound, W_bound]
            W_bound = numpy.sqrt(6. / (fan_in + fan_out))
            self.W = theano.shared(
                numpy.asarray(
                    rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                    dtype=theano.config.floatX
                ),
                borrow=True
            )
    
            # initialize the biases -- one bias per output feature map
            b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
            self.b = theano.shared(value=b_values, borrow=True)
            
            # convolve the input feature maps with the filters
            conv_out = conv.conv2d(
                input=input,
                filters=self.W,
                filter_shape=filter_shape,
                image_shape=image_shape
            )
    
            # downsample each convolved feature map with max-pooling
            pooled_out = downsample.max_pool_2d(
                input=conv_out,
                ds=poolsize,
                ignore_border=True
            )
    
            # add the bias (broadcast over batch and spatial dimensions) and apply tanh
            self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
    
            # store the parameters of this layer
            self.params = [self.W, self.b]
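
    A minimal usage sketch (assuming the imports from the complete listing in section 5; the shapes mirror the first layer built there):

    rng = numpy.random.RandomState(23455)
    x = T.tensor4('x')                    # a batch of single-channel images
    layer = ConvPoolLayer(
        rng,
        input=x,
        image_shape=(500, 1, 28, 28),     # 500 MNIST images per batch
        filter_shape=(20, 1, 5, 5),       # 20 filters of size 5x5
        poolsize=(2, 2)
    )
    # layer.output has shape (500, 20, 12, 12):
    # (28 - 5 + 1) / 2 = 12 after "valid" convolution and 2x2 max-pooling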

    3. HiddenLayer

    class HiddenLayer(object):
        """ 隐层 """
        
        def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                     activation=T.tanh):
            """
            rng
                type: numpy.random.RandomState
                desc: random number generator used to initialize the weights
            input
                type: theano.tensor.dmatrix
                desc: input data
            n_in
                type: int
                desc: size of the input
            n_out
                type: int
                desc: number of hidden units
            activation
                type: theano.Op or function
                desc: non-linearity applied to the hidden units
            """
            
            self.input = input
    
            # initialize the weights if none were given
            if W is None:
                W_values = numpy.asarray(
                    rng.uniform(
                        low=-numpy.sqrt(6. / (n_in + n_out)),
                        high=numpy.sqrt(6. / (n_in + n_out)),
                        size=(n_in, n_out)
                    ),
                    dtype=theano.config.floatX
                )
                if activation == theano.tensor.nnet.sigmoid:
                    W_values *= 4
                W = theano.shared(value=W_values, name='W', borrow=True)
            
            # initialize the biases if none were given
            if b is None:
                b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
                b = theano.shared(value=b_values, name='b', borrow=True)
    
            self.W = W
            self.b = b
    
            # linear output, then the activation
            lin_output = T.dot(input, self.W) + self.b
            self.output = (
                lin_output if activation is None
                else activation(lin_output)
            )
            
            self.params = [self.W, self.b]
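
    The +/- sqrt(6/(n_in+n_out)) range above is the Glorot & Bengio (2010) initialization heuristic for tanh units; the 4x factor applied for sigmoid comes from the same paper. A standalone usage sketch with the sizes used later in section 5 (imports as in the complete listing; the seed is arbitrary):

    rng = numpy.random.RandomState(1234)
    h_in = T.matrix('h_in')
    hidden = HiddenLayer(rng, input=h_in, n_in=800, n_out=500,
                         activation=T.tanh)
    # the initial weights are uniform in +/- sqrt(6/(800+500)) ~ +/- 0.068
    print numpy.sqrt(6. / (800 + 500))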

    4. LogisticRegression

    class LogisticRegression(object):
        """ 逻辑回归 """
        
        def __init__(self, input, n_in, n_out):
            """
            input
                type: theano.tensor.TensorType
                desc: input data
            n_in
                type: int
                desc: number of input units
            n_out
                type: int
                desc: number of output units
            """
            
            self.W = theano.shared(
                value=numpy.zeros(
                    (n_in, n_out),
                    dtype=theano.config.floatX
                ),
                name='W',
                borrow=True
            )
            
            self.b = theano.shared(
                value=numpy.zeros(
                    (n_out,),
                    dtype=theano.config.floatX
                ),
                name='b',
                borrow=True
            )
    
            # class-membership probabilities (softmax)
            self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
            
            # predicted class: the index of the largest probability
            self.y_pred = T.argmax(self.p_y_given_x, axis=1)
           
            self.params = [self.W, self.b]
    
        def negative_log_likelihood(self, y):
            """ 返回该模型的对于某一指定变量的最小负的对数似然函数值 """
            return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
    
        def errors(self, y):
            # check that y has the same dimensionality as y_pred
            if y.ndim != self.y_pred.ndim:
                raise TypeError(
                    'y should have the same shape as self.y_pred',
                    ('y', y.type, 'y_pred', self.y_pred.type)
                )
            # check that y contains integer labels
            if y.dtype.startswith('int'):
                return T.mean(T.neq(self.y_pred, y))
            else:
                raise NotImplementedError()
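
    The indexing trick in negative_log_likelihood is easiest to see in plain NumPy: row i of the log-probability matrix contributes its entry at column y[i], i.e. the log-probability assigned to the correct class (a toy example, not from the original post):

    p = numpy.array([[0.7, 0.2, 0.1],   # class probabilities, one row per sample
                     [0.1, 0.8, 0.1]])
    y = numpy.array([0, 1])             # correct class of each sample
    log_p = numpy.log(p)
    print -numpy.mean(log_p[numpy.arange(2), y])   # ~0.290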

    5. Complete code

    # -*- coding: utf-8 -*-
    import numpy
    import cPickle
    import gzip
    import os
    import sys
    import time
    
    import theano
    import theano.tensor as T
    from theano.tensor.signal import downsample
    from theano.tensor.nnet import conv
    
    class ConvPoolLayer(object):
        """ 卷积层 """
        
        def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
            """
            rng
                type: numpy.random.RandomState
                desc: random number generator used to initialize the weights
            input
                type: theano.tensor.dtensor4
                desc: preprocessed image data
            filter_shape
                type: tuple or list of length 4
                desc: (number of filters, number of input feature maps,
                      filter height, filter width)
            image_shape
                type: tuple or list of length 4
                desc: (batch size, number of input feature maps,
                      image height, image width)
            poolsize
                type: tuple or list of length 2
                desc: downsampling factor (#rows, #cols)
            """
    
            assert image_shape[1] == filter_shape[1]
            self.input = input
    
            # number of inputs to each unit = num input feature maps * filter height * filter width
            fan_in = numpy.prod(filter_shape[1:])
    
            # number of outputs from each unit = (num output feature maps * filter height * filter width) / pooling size
            fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) / numpy.prod(poolsize))
                       
            # initialize the weights with values drawn uniformly from [-W_bound, W_bound]
            W_bound = numpy.sqrt(6. / (fan_in + fan_out))
            self.W = theano.shared(
                numpy.asarray(
                    rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                    dtype=theano.config.floatX
                ),
                borrow=True
            )
    
            # initialize the biases -- one bias per output feature map
            b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
            self.b = theano.shared(value=b_values, borrow=True)
            
            # convolve the input feature maps with the filters
            conv_out = conv.conv2d(
                input=input,
                filters=self.W,
                filter_shape=filter_shape,
                image_shape=image_shape
            )
    
            # downsample each convolved feature map with max-pooling
            pooled_out = downsample.max_pool_2d(
                input=conv_out,
                ds=poolsize,
                ignore_border=True
            )
    
            # add the bias (broadcast over batch and spatial dimensions) and apply tanh
            self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
    
            # store the parameters of this layer
            self.params = [self.W, self.b]
    
                    
    class HiddenLayer(object):
        """ 隐层 """
        
        def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                     activation=T.tanh):
            """
            rng
                类型:numpy.random.RandomState
                描述:用于初始化权值的随机数产生器
            input
                类型:theano.tensor.dmatrix
                描述:输入数据
            n_in
                类型:int
                描述:输入数据的大小
            n_out
                类型:int
                描述:神经元数量
            activation
                类型:theano.Op或函数
                描述:非线性,用于隐层的激活
            """
            
            self.input = input
    
            # initialize the weights if none were given
            if W is None:
                W_values = numpy.asarray(
                    rng.uniform(
                        low=-numpy.sqrt(6. / (n_in + n_out)),
                        high=numpy.sqrt(6. / (n_in + n_out)),
                        size=(n_in, n_out)
                    ),
                    dtype=theano.config.floatX
                )
                if activation == theano.tensor.nnet.sigmoid:
                    W_values *= 4
                W = theano.shared(value=W_values, name='W', borrow=True)
            
            # initialize the biases if none were given
            if b is None:
                b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
                b = theano.shared(value=b_values, name='b', borrow=True)
    
            self.W = W
            self.b = b
    
            # linear output, then the activation
            lin_output = T.dot(input, self.W) + self.b
            self.output = (
                lin_output if activation is None
                else activation(lin_output)
            )
            
            self.params = [self.W, self.b]
            
    class LogisticRegression(object):
        """ 逻辑回归 """
        
        def __init__(self, input, n_in, n_out):
            """
            input
                type: theano.tensor.TensorType
                desc: input data
            n_in
                type: int
                desc: number of input units
            n_out
                type: int
                desc: number of output units
            """
            
            self.W = theano.shared(
                value=numpy.zeros(
                    (n_in, n_out),
                    dtype=theano.config.floatX
                ),
                name='W',
                borrow=True
            )
            
            self.b = theano.shared(
                value=numpy.zeros(
                    (n_out,),
                    dtype=theano.config.floatX
                ),
                name='b',
                borrow=True
            )
    
            self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
            
            self.y_pred = T.argmax(self.p_y_given_x, axis=1)
           
            self.params = [self.W, self.b]
    
        def negative_log_likelihood(self, y):
            """ 返回该模型的对于某一指定变量的最小负的对数似然函数值 """
            return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
    
        def errors(self, y):
            # check that y has the same dimensionality as y_pred
            if y.ndim != self.y_pred.ndim:
                raise TypeError(
                    'y should have the same shape as self.y_pred',
                    ('y', y.type, 'y_pred', self.y_pred.type)
                )
            # check that y contains integer labels
            if y.dtype.startswith('int'):
                return T.mean(T.neq(self.y_pred, y))
            else:
                raise NotImplementedError()
    
    
    def load_data(dataset):
        data_dir, data_file = os.path.split(dataset)
        if data_dir == "" and not os.path.isfile(dataset):
            # Check if dataset is in the data directory.
            new_path = os.path.join(
                os.path.split(__file__)[0],
                "data",
                dataset
            )
            if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
                dataset = new_path
    
        if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
            import urllib
            origin = (
                'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
            )
            print 'Downloading data from %s' % origin
            urllib.urlretrieve(origin, dataset)
    
        print '... loading data'
    
        # Load the dataset
        f = gzip.open(dataset, 'rb')
        train_set, valid_set, test_set = cPickle.load(f)
        f.close()
    
        def shared_dataset(data_xy, borrow=True):
            """ Function that loads the dataset into shared variables """
            data_x, data_y = data_xy
            shared_x = theano.shared(numpy.asarray(data_x,
                                                   dtype=theano.config.floatX),
                                     borrow=borrow)
            shared_y = theano.shared(numpy.asarray(data_y,
                                                   dtype=theano.config.floatX),
                                     borrow=borrow)
            return shared_x, T.cast(shared_y, 'int32')
    
        test_set_x, test_set_y = shared_dataset(test_set)
        valid_set_x, valid_set_y = shared_dataset(valid_set)
        train_set_x, train_set_y = shared_dataset(train_set)
    
        rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
                (test_set_x, test_set_y)]
        return rval
        
    def evaluate_lenet5(learning_rate=0.1, n_epochs=200,
                        dataset='mnist.pkl.gz',
                        nkerns=[20, 50], batch_size=500):
        """ Demonstrates lenet on MNIST dataset
    
        :type learning_rate: float
        :param learning_rate: learning rate used (factor for the stochastic
                              gradient)
    
        :type n_epochs: int
        :param n_epochs: maximal number of epochs to run the optimizer
    
        :type dataset: string
        :param dataset: path to the dataset used for training /testing (MNIST here)
    
        :type nkerns: list of ints
        :param nkerns: number of kernels on each layer
        """
    
        rng = numpy.random.RandomState(23455)
    
        datasets = load_data(dataset)
    
        train_set_x, train_set_y = datasets[0]
        valid_set_x, valid_set_y = datasets[1]
        test_set_x, test_set_y = datasets[2]
    
        # compute number of minibatches for training, validation and testing
        n_train_batches = train_set_x.get_value(borrow=True).shape[0]
        n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
        n_test_batches = test_set_x.get_value(borrow=True).shape[0]
        n_train_batches /= batch_size
        n_valid_batches /= batch_size
        n_test_batches /= batch_size
    
        # allocate symbolic variables for the data
        index = T.lscalar()  # index to a [mini]batch
    
        # start-snippet-1
        x = T.matrix('x')   # the data is presented as rasterized images
        y = T.ivector('y')  # the labels are presented as 1D vector of
                            # [int] labels
    
        ######################
        # BUILD ACTUAL MODEL #
        ######################
        print '... building the model'
    
        # Reshape matrix of rasterized images of shape (batch_size, 28 * 28)
        # to a 4D tensor, compatible with our LeNetConvPoolLayer
        # (28, 28) is the size of MNIST images.
        layer0_input = x.reshape((batch_size, 1, 28, 28))
    
        # Construct the first convolutional pooling layer:
        # filtering reduces the image size to (28-5+1 , 28-5+1) = (24, 24)
        # maxpooling reduces this further to (24/2, 24/2) = (12, 12)
        # 4D output tensor is thus of shape (batch_size, nkerns[0], 12, 12)
        layer0 = ConvPoolLayer(
            rng,
            input=layer0_input,
            image_shape=(batch_size, 1, 28, 28),
            filter_shape=(nkerns[0], 1, 5, 5),
            poolsize=(2, 2)
        )
    
        # Construct the second convolutional pooling layer
        # filtering reduces the image size to (12-5+1, 12-5+1) = (8, 8)
        # maxpooling reduces this further to (8/2, 8/2) = (4, 4)
        # 4D output tensor is thus of shape (batch_size, nkerns[1], 4, 4)
        layer1 = ConvPoolLayer(
            rng,
            input=layer0.output,
            image_shape=(batch_size, nkerns[0], 12, 12),
            filter_shape=(nkerns[1], nkerns[0], 5, 5),
            poolsize=(2, 2)
        )
    
        # the HiddenLayer being fully-connected, it operates on 2D matrices of
        # shape (batch_size, num_pixels) (i.e matrix of rasterized images).
        # This will generate a matrix of shape (batch_size, nkerns[1] * 4 * 4),
        # or (500, 50 * 4 * 4) = (500, 800) with the default values.
        layer2_input = layer1.output.flatten(2)
    
        # construct a fully-connected tanh hidden layer
        layer2 = HiddenLayer(
            rng,
            input=layer2_input,
            n_in=nkerns[1] * 4 * 4,
            n_out=500,
            activation=T.tanh
        )
    
        # classify the values of the fully-connected hidden layer
        layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)
    
        # the cost we minimize during training is the NLL of the model
        cost = layer3.negative_log_likelihood(y)
    
        # create a function to compute the mistakes that are made by the model
        test_model = theano.function(
            [index],
            layer3.errors(y),
            givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )
    
        validate_model = theano.function(
            [index],
            layer3.errors(y),
            givens={
                x: valid_set_x[index * batch_size: (index + 1) * batch_size],
                y: valid_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )
    
        # create a list of all model parameters to be fit by gradient descent
        params = layer3.params + layer2.params + layer1.params + layer0.params
    
        # create a list of gradients for all model parameters
        grads = T.grad(cost, params)
    
        # train_model is a function that updates the model parameters by
        # SGD Since this model has many parameters, it would be tedious to
        # manually create an update rule for each model parameter. We thus
        # create the updates list by automatically looping over all
        # (params[i], grads[i]) pairs.
        updates = [
            (param_i, param_i - learning_rate * grad_i)
            for param_i, grad_i in zip(params, grads)
        ]
    
        train_model = theano.function(
            [index],
            cost,
            updates=updates,
            givens={
                x: train_set_x[index * batch_size: (index + 1) * batch_size],
                y: train_set_y[index * batch_size: (index + 1) * batch_size]
            }
        )
        # end-snippet-1
    
        ###############
        # TRAIN MODEL #
        ###############
        print '... training'
        # early-stopping parameters
        patience = 10000  # look at this many examples regardless
        patience_increase = 2  # wait this much longer when a new best is
                               # found
        improvement_threshold = 0.995  # a relative improvement of this much is
                                       # considered significant
        validation_frequency = min(n_train_batches, patience / 2)
                                      # go through this many
                                      # minibatches before checking the network
                                      # on the validation set; in this case we
                                      # check every epoch
    
        best_validation_loss = numpy.inf
        best_iter = 0
        test_score = 0.
        start_time = time.clock()
    
        epoch = 0
        done_looping = False
    
        while (epoch < n_epochs) and (not done_looping):
            epoch = epoch + 1
            for minibatch_index in xrange(n_train_batches):
    
                iter = (epoch - 1) * n_train_batches + minibatch_index
    
                if iter % 100 == 0:
                    print 'training @ iter = ', iter
                cost_ij = train_model(minibatch_index)
    
                if (iter + 1) % validation_frequency == 0:
    
                    # compute zero-one loss on validation set
                    validation_losses = [validate_model(i) for i
                                         in xrange(n_valid_batches)]
                    this_validation_loss = numpy.mean(validation_losses)
                    print('epoch %i, minibatch %i/%i, validation error %f %%' %
                          (epoch, minibatch_index + 1, n_train_batches,
                           this_validation_loss * 100.))
    
                    # if we got the best validation score until now
                    if this_validation_loss < best_validation_loss:
    
                        # improve patience if the loss improvement is good enough
                        if this_validation_loss < best_validation_loss * \
                           improvement_threshold:
                            patience = max(patience, iter * patience_increase)
    
                        # save best validation score and iteration number
                        best_validation_loss = this_validation_loss
                        best_iter = iter
    
                        # test it on the test set
                        test_losses = [
                            test_model(i)
                            for i in xrange(n_test_batches)
                        ]
                        test_score = numpy.mean(test_losses)
                        print(('     epoch %i, minibatch %i/%i, test error of '
                               'best model %f %%') %
                              (epoch, minibatch_index + 1, n_train_batches,
                               test_score * 100.))
    
                if patience <= iter:
                    done_looping = True
                    break
    
        end_time = time.clock()
        print('Optimization complete.')
        print('Best validation score of %f %% obtained at iteration %i, '
              'with test performance %f %%' %
              (best_validation_loss * 100., best_iter + 1, test_score * 100.))
        print >> sys.stderr, ('The code for file ' +
                              os.path.split(__file__)[1] +
                              ' ran for %.2fm' % ((end_time - start_time) / 60.))
    
    if __name__ == '__main__':
        evaluate_lenet5()
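
    To sanity-check the whole pipeline before committing to a full run, a shorter invocation is enough (hypothetical settings; the defaults above train for up to 200 epochs):

    evaluate_lenet5(n_epochs=5, batch_size=500)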

  • Original post: https://www.cnblogs.com/ilovebcz/p/4086013.html