• Normalization in Deep Learning


    Swin Transformer


    Author: elfin




    1. Batch Normalization

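    To use BN, we only need to pass the number of channels to torch.nn.BatchNorm2d(). It computes the mean and variance separately for each channel, over the batch and spatial dimensions, and then standardizes the data with them.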
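As a reference for the checks in this section, the training-mode computation can be written as follows for an input of shape (N, C, H, W). The symbols are notation introduced here for illustration: \gamma_c and \beta_c are the learnable per-channel scale and shift (initialized to 1 and 0), and \epsilon is the small constant added for numerical stability (1e-5 by default in BatchNorm2d).

```latex
\mu_c = \frac{1}{NHW} \sum_{n,h,w} x_{n,c,h,w}, \qquad
\sigma_c^2 = \frac{1}{NHW} \sum_{n,h,w} \left( x_{n,c,h,w} - \mu_c \right)^2, \qquad
y_{n,c,h,w} = \gamma_c \, \frac{x_{n,c,h,w} - \mu_c}{\sqrt{\sigma_c^2 + \epsilon}} + \beta_c
```

Note that \sigma_c^2 is the biased (population) variance, which is why numpy's default std() is the right quantity to compare against.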

    1.1 Preparing the data

    import torch
    BatchNorm2d = torch.nn.BatchNorm2d
    test = torch.rand((1,3,2,2))
    

    1.2 Inspecting the data

    test[0,:,:,:]
    
    tensor([[[6.7027e-01, 5.3149e-01],
             [4.6797e-01, 3.1028e-02]],
    
            [[4.1371e-01, 1.2022e-04],
             [2.3150e-01, 2.5120e-01]],
    
            [[5.2258e-01, 9.6350e-02],
             [4.6467e-01, 3.6091e-01]]])
    

    1.3 Applying BN

    BatchNorm2d(3)(test)
    Out:
        tensor([[[[ 1.0252e+00,  4.4465e-01],
                  [ 1.7894e-01, -1.6488e+00]],
    
                 [[ 1.2858e+00, -1.5194e+00],
                  [ 4.9993e-02,  1.8359e-01]],
    
                 [[ 9.8744e-01, -1.6194e+00],
                  [ 6.3328e-01, -1.3302e-03]]]], grad_fn=<NativeBatchNormBackward>)
    
    # Manual check for channel 0; numpy's std() is the population (biased) std, which is what BN uses
    (test[0,0,:,:] - test[0,0,:,:].numpy().mean()) / test[0,0,:,:].numpy().std()
    Out:
        tensor([[ 1.0253,  0.4447],
                [ 0.1790, -1.6489]])
    

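    Here we can clearly see that the two results agree! The small differences in the last digit are consistent with the eps (1e-5 by default) that BatchNorm2d adds to the variance. Next, we test the case where batch_size is not 1: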

    test = torch.rand((10,3,2,2))
    BatchNorm2d(3)(test)[0,0,:,:]
    Out:
        tensor([[ 1.6257,  0.5479],
                [-1.3761,  0.8000]], grad_fn=<SliceBackward>)
    
    res = (test[:,0,:,:] - test[:,0,:,:].numpy().mean()) / test[:,0,:,:].numpy().std()
    res[0,:,:]
    Out:
        tensor([[ 1.6258,  0.5479],
                [-1.3762,  0.8000]])
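The manual comparison above looks at channel 0 only. A minimal sketch of the same check for all channels at once, using torch operations instead of numpy, might look like the following. The names x, y_module, y_manual and max_diff are illustrative, and the fixed seed is only there to make the run reproducible.

```python
import torch

torch.manual_seed(0)  # fixed seed so the check is reproducible
x = torch.rand((10, 3, 2, 2))  # (N, C, H, W), same shape as the example above

# A freshly built layer is in training mode with weight = 1 and bias = 0,
# so it normalizes with the statistics of the current batch.
bn = torch.nn.BatchNorm2d(3)
y_module = bn(x)

# Per-channel statistics over the batch and spatial dimensions (N, H, W).
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)  # biased variance
y_manual = (x - mean) / torch.sqrt(var + bn.eps)

max_diff = (y_module - y_manual).abs().max().item()
print(max_diff)  # expected to be at the level of float32 rounding error
```

Including bn.eps in the manual formula removes the last-digit discrepancy seen in the numpy comparison.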
    


    2. Layer Normalization

    LN (Layer Normalization) also standardizes the data, but not across samples: the statistics are computed only within a single sample.
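In formula form, for one sample and one group of normalized elements, the computation can be written as below. The notation is introduced here for illustration: D is the set of positions covered by normalized_shape, \gamma_i and \beta_i are the optional learnable elementwise scale and shift, and \epsilon is the stability constant (1e-5 by default in LayerNorm).

```latex
\mu = \frac{1}{|D|} \sum_{i \in D} x_i, \qquad
\sigma^2 = \frac{1}{|D|} \sum_{i \in D} \left( x_i - \mu \right)^2, \qquad
y_i = \gamma_i \, \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta_i
```

Unlike BN, no sum runs over the batch index, so the result for one sample does not depend on the other samples in the batch.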

    torch.nn.LayerNorm() can be configured in several ways:

    >>> from torch import nn
    >>> input = torch.randn(20, 5, 10, 10)
    >>> # With Learnable Parameters
    >>> m = nn.LayerNorm(input.size()[1:])
    >>> # Without Learnable Parameters
    >>> m = nn.LayerNorm(input.size()[1:], elementwise_affine=False)
    >>> # Normalize over last two dimensions
    >>> m = nn.LayerNorm([10, 10])
    >>> # Normalize over last dimension of size 10
    >>> m = nn.LayerNorm(10)
    >>> # Activating the module
    >>> output = m(input)
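To make the effect of each configuration concrete, the sketch below runs three of them on the same input and checks over which dimensions the output has (approximately) zero mean. The dictionary configs and the names max_abs_mean, layer and dims are illustrative helpers, not part of the PyTorch API.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the check is reproducible
x = torch.randn(20, 5, 10, 10)

# normalized_shape must match the trailing dimensions of the input.
# Each entry pairs a layer with the input dims it is expected to normalize over.
configs = {
    "all but batch": (nn.LayerNorm(x.size()[1:]), (1, 2, 3)),
    "last two dims": (nn.LayerNorm([10, 10]), (2, 3)),
    "last dim only": (nn.LayerNorm(10), (3,)),
}

max_abs_mean = {}
for name, (layer, dims) in configs.items():
    y = layer(x)
    # After normalization, every group has mean ~0 over the normalized dims.
    max_abs_mean[name] = y.mean(dim=dims).abs().max().item()
    print(name, tuple(y.shape), max_abs_mean[name])
```

In every case the output keeps the input shape; only the grouping used for the statistics changes.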
    

    2.1 Inspecting the data

    squence = torch.rand((2,3,10))
    squence
    Out:
        tensor([[[0.1151, 0.9571, 0.5986, 0.4692, 0.7029, 0.5159, 0.4494, 0.9428,
                  0.9714, 0.9938],
                 [0.6456, 0.5997, 0.7542, 0.7266, 0.7021, 0.2900, 0.7044, 0.1627,
                  0.3725, 0.9454],
                 [0.9398, 0.3861, 0.5276, 0.8783, 0.8319, 0.1181, 0.6185, 0.9689,
                  0.6393, 0.7770]],
    
                [[0.2786, 0.8901, 0.7228, 0.3740, 0.4186, 0.6857, 0.8438, 0.4762,
                  0.4106, 0.4823],
                 [0.5199, 0.7644, 0.2987, 0.3745, 0.6000, 0.7266, 0.0854, 0.1954,
                  0.5413, 0.1656],
                 [0.5487, 0.2655, 0.9256, 0.7352, 0.4081, 0.8017, 0.7130, 0.5364,
                  0.5441, 0.8483]]])
    

    2.2 Specifying one dimension

    LN = torch.nn.LayerNorm
    LN(10)(squence)
    Out:
        tensor([[[-1.9932,  1.0227, -0.2616, -0.7251,  0.1120, -0.5578, -0.7961,
                  0.9712,  1.0739,  1.1540],
                 [ 0.2423,  0.0411,  0.7180,  0.5971,  0.4899, -1.3160,  0.5000,
                  -1.8739, -0.9546,  1.5561],
                 [ 1.0619, -1.1060, -0.5519,  0.8214,  0.6396, -2.1551, -0.1960,
                  1.1760, -0.1146,  0.4246]],
    
                [[-1.3968,  1.6568,  0.8218, -0.9200, -0.6974,  0.6363,  1.4258,
                  -0.4100, -0.7372, -0.3793],
                 [ 0.4093,  1.4885, -0.5673, -0.2324,  0.7629,  1.3218, -1.5084,
                  -1.0233,  0.5037, -1.1548],
                 [-0.4265, -1.8654,  1.4884,  0.5210, -1.1411,  0.8590,  0.4084,
                  -0.4893, -0.4500,  1.0955]]], grad_fn=<NativeLayerNormBackward>)
    
    (squence[0,0,:] - squence[0,0,:].numpy().mean()) / squence[0,0,:].numpy().std()
    Out:
        tensor([-1.9934,  1.0227, -0.2617, -0.7252,  0.1120, -0.5578, -0.7961,
                0.9713,  1.0740,  1.1540])
    

    The comparison shows that normalization here is applied only over the last dimension!
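The check above covers a single row. A minimal sketch that verifies every row at once, using torch operations and including eps, could look like this. The tensor seq is a freshly generated stand-in with the same shape as the one above, not the same values.

```python
import torch

torch.manual_seed(0)  # fixed seed so the check is reproducible
seq = torch.rand((2, 3, 10))  # stand-in tensor with the same shape as above

ln = torch.nn.LayerNorm(10)  # freshly built: weight = 1, bias = 0
y_module = ln(seq)

# Statistics over the last dimension only, one pair per (sample, row).
mean = seq.mean(dim=-1, keepdim=True)
var = seq.var(dim=-1, unbiased=False, keepdim=True)  # biased variance
y_manual = (seq - mean) / torch.sqrt(var + ln.eps)

max_diff = (y_module - y_manual).abs().max().item()
print(max_diff)  # expected to be at the level of float32 rounding error
```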

    2.3 Specifying two dimensions

    squence2 = torch.rand((2,2,7))
    LN([2,7])(squence2)
    Out:
        tensor([[[-0.1525, -0.3791,  1.9005,  0.9187, -1.2562, -0.9069,  0.4788],
                 [-0.9507, -0.5147, -1.1867,  1.9212,  0.4739, -0.4837,  0.1374]],
    
                [[-0.4490, -1.2532,  1.2571, -0.7904, -0.7550, -1.0003,  0.2586],
                 [ 1.2673, -0.8106, -0.2374,  1.4318,  0.0237,  1.8428, -0.7854]]],
           grad_fn=<NativeLayerNormBackward>)
    
    (squence2[0,:,:] - squence2[0,:,:].numpy().mean()) / squence2[0,:,:].numpy().std()
    Out:
        tensor([[-0.1525, -0.3791,  1.9006,  0.9188, -1.2563, -0.9070,  0.4788],
                [-0.9508, -0.5148, -1.1867,  1.9214,  0.4739, -0.4838,  0.1374]])
    

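    Here the two results also agree, which shows that when two dimensions are specified, normalization is applied over the last two dimensions!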
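The same check can be sketched for both samples at once, this time with the functional form torch.nn.functional.layer_norm, which applies no scale or shift when weight and bias are omitted. The tensor x is a freshly generated stand-in with the same shape as above.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the check is reproducible
x = torch.rand((2, 2, 7))  # stand-in tensor with the same shape as above

eps = 1e-5
y_functional = F.layer_norm(x, [2, 7], eps=eps)  # no weight or bias

# Statistics over the last two dimensions, one pair per sample.
mean = x.mean(dim=(-2, -1), keepdim=True)
var = x.var(dim=(-2, -1), unbiased=False, keepdim=True)  # biased variance
y_manual = (x - mean) / torch.sqrt(var + eps)

max_diff = (y_functional - y_manual).abs().max().item()
print(max_diff)  # expected to be at the level of float32 rounding error
```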



    To be continued!

  • Original article: https://www.cnblogs.com/dan-baishucaizi/p/14718865.html