• Deep learning: 47 (A Simple Look at Stochastic Pooling)


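      In a CNN, convolution is followed by a step called pooling. At ICLR 2013, Zeiler proposed another pooling method besides the most common ones (mean-pooling and max-pooling), called stochastic pooling. His paper also describes a probability weighted pooling method, which performs slightly worse.

      Stochastic pooling is very simple: pick one element of the pooling region at random, with probability proportional to its value, so larger elements are more likely to be chosen. Max-pooling, by contrast, always takes the single largest element.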

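      Suppose the elements of a pooling region in the feature map are as follows:

          0    1.1  2.5
          0.9  2.0  1.0
          0    1.5  1.0

      The region is 3*3, and the sum of its elements is sum = 0+1.1+2.5+0.9+2.0+1.0+0+1.5+1.0 = 10.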

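      Dividing every element of the grid by sum gives:

          0     0.11  0.25
          0.09  0.2   0.1
          0     0.15  0.1

      Each entry is the probability of the corresponding position. Now we just pick one position at random according to these probabilities. One way is to treat them as a multinomial distribution over 9 outcomes and sample from it; Theano provides a multinomial() function that does this directly. Alternatively, you can sample with a uniform distribution on [0, 1] yourself: split the unit interval into 9 sub-intervals according to the 9 probabilities (the larger the probability, the longer its interval, and each interval corresponds to one position), then draw a random number and see which interval it falls in.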
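
      The interval trick described above can be sketched in a few lines of NumPy. This is only an illustration, not code from the paper or from pylearn2: the function name sample_position is made up here, the 3*3 values are assumed to be laid out row by row, and the seed is arbitrary.

```python
import numpy as np

# Pooling region from the example (row-major layout is an assumption).
region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])

def sample_position(region, u):
    """Return the (row, col) whose probability interval contains u in [0, 1)."""
    probs = region.ravel() / region.sum()   # normalize values to probabilities
    edges = np.cumsum(probs)                # right edge of each sub-interval
    # First interval whose right edge is strictly greater than u;
    # zero-probability positions have empty intervals and are skipped.
    idx = int(np.searchsorted(edges, u, side="right"))
    idx = min(idx, probs.size - 1)          # guard against rounding when u is near 1
    return np.unravel_index(idx, region.shape)

rng = np.random.default_rng(0)              # arbitrary seed
row, col = sample_position(region, rng.uniform())
pooled = region[row, col]                   # the stochastic pooling output
```

      For instance, u = 0.8 falls in the interval [0.75, 0.90), which belongs to the element 1.5 in the last row.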

      For example, if the sampled matrix is:

          0  0  0
          0  0  0
          0  1  0

      then the pooling value is 1.5.

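      At test time, inference for a network trained with stochastic pooling is also simple: take the probability-weighted average over the region. For the example above:

         0*0+1.1*0.11+2.5*0.25+0.9*0.09+2.0*0.2+1.0*0.1+0*0+1.5*0.15+1.0*0.1 = 1.652, so pooling this small region gives 1.652.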
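
      To double-check the arithmetic, here is a small NumPy sketch of the probability-weighted average for one region. The helper name prob_weighted_pool is hypothetical, and returning 0 for an all-zero region is my own choice for the sketch.

```python
import numpy as np

region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])

def prob_weighted_pool(region):
    """Weight each element by its own probability and sum."""
    total = region.sum()
    if total == 0:                 # all-zero region: nothing to weight
        return 0.0
    probs = region / total         # same probabilities used for sampling
    return float((region * probs).sum())

value = prob_weighted_pool(region)
print(round(value, 3))  # 1.652
```

      Written out, this is just the sum of squared elements divided by the sum of elements: 16.52 / 10.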

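      During backpropagation, the gradient is passed only to the position that was selected and recorded in the forward pass; all other positions get 0. This is very similar to backpropagation through max-pooling.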
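
      A rough single-region sketch of that forward/backward pairing is shown here. The function names are hypothetical, the region is assumed to have a positive sum, and a real layer would do this for every pooling window at once.

```python
import numpy as np

def stochastic_pool_forward(region, rng):
    """Sample one element; return its value and flat index for the backward pass."""
    probs = region.ravel() / region.sum()       # assumes region.sum() > 0
    idx = int(rng.choice(region.size, p=probs))
    return region.flat[idx], idx

def stochastic_pool_backward(grad_out, idx, shape):
    """Route the incoming gradient to the sampled position, zero elsewhere."""
    grad_in = np.zeros(shape)
    grad_in.flat[idx] = grad_out
    return grad_in

region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])
rng = np.random.default_rng(0)                   # arbitrary seed
value, idx = stochastic_pool_forward(region, rng)
grad = stochastic_pool_backward(2.0, idx, region.shape)
```

      The only state that has to survive from the forward pass is idx, exactly as with the argmax in max-pooling.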

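      Advantages of stochastic pooling:

      The method is simple;

      It generalizes better;

      It can be used in convolutional layers (the paper compares it with Dropout and DropConnect and says those two are not well suited to convolutional layers. Personally I don't think the comparison means much, since they act on different structures in the network);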

      As for why stochastic pooling works well, the authors say the method is also a form of model averaging; I did not really follow that part.
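
      One small thing that can be checked numerically, for a single region only: the average of many stochastic pooling samples approaches the probability-weighted value. The sketch below shows just that; it is my own illustration of the expectation and not an argument taken from the paper. The sample count and seed are arbitrary.

```python
import numpy as np

region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])
values = region.ravel()
probs = values / values.sum()

rng = np.random.default_rng(0)                      # arbitrary seed
samples = rng.choice(values, size=100_000, p=probs) # many stochastic pooling draws
mc_mean = samples.mean()                            # average over the draws
weighted = float((values * probs).sum())            # probability-weighted pooling

print(abs(mc_mean - weighted) < 0.02)
```

      So each draw behaves like one randomly chosen pooling pattern, and the weighted average is what you get when you average over the draws.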

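      For code covering the forward pass and the inference pass of stochastic pooling, see the listing below (the bp pass is not included, so the positions selected during pooling are not saved).

      Source: pylearn2/stochastic_pool.py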

    """
    An implementation of stochastic max-pooling, based on
    
    Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
    Matthew D. Zeiler, Rob Fergus, ICLR 2013
    """
    
    __authors__ = "Mehdi Mirza"
    __copyright__ = "Copyright 2010-2012, Universite de Montreal"
    __credits__ = ["Mehdi Mirza", "Ian Goodfellow"]
    __license__ = "3-clause BSD"
    __maintainer__ = "Mehdi Mirza"
    __email__ = "mirzamom@iro"
    
    import numpy
    import theano
    from theano import tensor
    from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams
    from theano.gof.op import get_debug_values
    
    def stochastic_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None):
        """
        Stochastic max pooling for training as defined in:
    
        Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
        Matthew D. Zeiler, Rob Fergus
    
        bc01: minibatch in format (batch size, channels, rows, cols),
            IMPORTANT: All values should be positive
        pool_shape: shape of the pool region (rows, cols)
        pool_stride: strides between pooling regions (row stride, col stride)
        image_shape: avoid doing some of the arithmetic in theano
        rng: theano random stream
        """
        r, c = image_shape
        pr, pc = pool_shape
        rs, cs = pool_stride
    
        batch = bc01.shape[0] # number of examples in the minibatch
        channel = bc01.shape[1] # number of channels
    
        if rng is None:
            rng = RandomStreams(2022)
    
        # Compute index in pooled space of last needed pool
        # (needed = each input pixel must appear in at least one pool)
        def last_pool(im_shp, p_shp, p_strd):
            rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd))
            assert p_strd * rval + p_shp >= im_shp
            assert p_strd * (rval - 1) + p_shp < im_shp
            return rval # number of stride steps needed during pooling
    
        # Compute starting row of the last pool
        last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0] # starting row of the last pool
        # Compute number of rows needed in image for all indexes to work out
        required_r = last_pool_r + pr # image height needed for the pools above to fit
    
        last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1]
        required_c = last_pool_c + pc
    
        # final result shape
        res_r = int(numpy.floor(last_pool_r/rs)) + 1 # shape of the image once pooling is done
        res_c = int(numpy.floor(last_pool_c/cs)) + 1
    
        for bc01v in get_debug_values(bc01):
            assert not numpy.any(numpy.isinf(bc01v))
            assert bc01v.shape[2] == image_shape[0]
            assert bc01v.shape[3] == image_shape[1]
    
        # padding: if the strides do not tile the image exactly, the original image has to be enlarged
        padded = tensor.alloc(0.0, batch, channel, required_r, required_c)
        name = bc01.name
        if name is None:
            name = 'anon_bc01'
        bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01)
        bc01.name = 'zero_padded_' + name
    
        # unraveling
        window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc)
        window.name = 'unravlled_winodows_' + name
    
        for row_within_pool in range(pool_shape[0]):
            row_stop = last_pool_r + row_within_pool + 1
            for col_within_pool in range(pool_shape[1]):
                col_stop = last_pool_c + col_within_pool + 1
                win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs]
                window  =  tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell) # window holds all of the pooling blocks
    
        # find the norm
        norm = window.sum(axis = [4, 5]) # sum over each pool, used as the denominator
        norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm) # where the sum is 0, use 1 to avoid dividing by zero
        norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x') # divide by the sum to get the probability of each position
        # get prob
        prob = rng.multinomial(pvals = norm.reshape((batch * channel * res_r * res_c, pr * pc)), dtype='float32') # multinomial() draws one sample per row of pvals; entries are 0 or 1
        # select
        res = (window * prob.reshape((batch, channel, res_r, res_c,  pr, pc))).max(axis=5).max(axis=4) # window * prob is an element-wise product (numpy semantics), not a matrix product
        res.name = 'pooled_' + name
    
        return tensor.cast(res, theano.config.floatX)
    
    def weighted_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None):
        """
        This implements test time probability weighted pooling defined in:
    
        Stochastic Pooling for Regularization of Deep Convolutional Neural Networks
        Matthew D. Zeiler, Rob Fergus
    
        bc01: minibatch in format (batch size, channels, rows, cols),
            IMPORTANT: All values should be positive
        pool_shape: shape of the pool region (rows, cols)
        pool_stride: strides between pooling regions (row stride, col stride)
        image_shape: avoid doing some of the arithmetic in theano
        """
        r, c = image_shape
        pr, pc = pool_shape
        rs, cs = pool_stride
    
        batch = bc01.shape[0]
        channel = bc01.shape[1]
        if rng is None:
            rng = RandomStreams(2022)
    
        # Compute index in pooled space of last needed pool
        # (needed = each input pixel must appear in at least one pool)
        def last_pool(im_shp, p_shp, p_strd):
            rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd))
            assert p_strd * rval + p_shp >= im_shp
            assert p_strd * (rval - 1) + p_shp < im_shp
            return rval
        # Compute starting row of the last pool
        last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0]
        # Compute number of rows needed in image for all indexes to work out
        required_r = last_pool_r + pr
    
        last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1]
        required_c = last_pool_c + pc
    
        # final result shape
        res_r = int(numpy.floor(last_pool_r/rs)) + 1
        res_c = int(numpy.floor(last_pool_c/cs)) + 1
    
        for bc01v in get_debug_values(bc01):
            assert not numpy.any(numpy.isinf(bc01v))
            assert bc01v.shape[2] == image_shape[0]
            assert bc01v.shape[3] == image_shape[1]
    
        # padding
        padded = tensor.alloc(0.0, batch, channel, required_r, required_c)
        name = bc01.name
        if name is None:
            name = 'anon_bc01'
        bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01)
        bc01.name = 'zero_padded_' + name
    
        # unraveling
        window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc)
        window.name = 'unravlled_winodows_' + name
    
        for row_within_pool in range(pool_shape[0]):
            row_stop = last_pool_r + row_within_pool + 1
            for col_within_pool in range(pool_shape[1]):
                col_stop = last_pool_c + col_within_pool + 1
                win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs]
                window  =  tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell)
    
        # find the norm
        norm = window.sum(axis = [4, 5])
        norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm)
        norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x')
        # average
        res = (window * norm).sum(axis=[4,5]) # everything above is almost identical to the training code; here we only need the weighted sum
        res.name = 'pooled_' + name
    
        return res.reshape((batch, channel, res_r, res_c))
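
      The shape arithmetic at the top of both functions (last_pool, required_r, res_r) is easy to get lost in, so here is the same calculation pulled out into plain Python for one dimension. The helper name pooled_size is made up for this sketch; the formulas are the ones used in the listing.

```python
import math

def pooled_size(im, p, stride):
    """Return (padded input size, output size) along one dimension."""
    n_moves = int(math.ceil(float(im - p) / stride))  # same as last_pool()
    last_start = n_moves * stride                     # start index of the last pool
    required = last_start + p                         # input size after zero padding
    out = last_start // stride + 1                    # number of pools
    return required, out

print(pooled_size(7, 3, 2))  # (7, 3): pools start at 0, 2, 4, no padding needed
print(pooled_size(8, 3, 2))  # (9, 4): one padded row so the pool starting at 6 fits
```

      The second call shows when the zero padding kicks in: an 8-pixel side with a 3-wide pool and stride 2 needs one extra row so that every input pixel lands in some pool.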

      References:

      Stochastic Pooling for Regularization of Deep Convolutional Neural Networks. Matthew D. Zeiler, Rob Fergus.

           pylearn2/stochastic_pool.py

  • Original post: https://www.cnblogs.com/tornadomeet/p/3432093.html