• Caffe Learning 1: Network Parameters and a Custom Network


    Network Parameters

    # number of test iterations: total test samples / test batch size
    test_iter: 100
    # run a test pass every test_interval training iterations
    test_interval: 500
    # initial learning rate
    base_lr: 0.01
    # momentum, speeds up convergence: v(t+1) = momentum*v(t) - lr*grad; w(t+1) = w(t) + v(t+1)
    momentum: 0.9
    # weight decay, the regularization penalty on the weights
    weight_decay: 0.0005
    # learning-rate decay policy; "inv" computes base_lr * (1 + gamma * iter) ^ (-power)
    lr_policy: "inv"
    gamma: 0.0001
    power: 0.75
    # print the loss every `display` iterations
    display: 100
    # maximum number of training iterations
    max_iter: 10000
    # save a snapshot every `snapshot` iterations
    snapshot: 5000
    snapshot_prefix: "examples/mnist/lenet"
    # train on CPU or GPU
    solver_mode: GPU
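
    As a rough sketch of what these settings do (illustrative NumPy only, not Caffe's actual solver code; variable names are made up), the momentum update and the "inv" learning-rate policy look like this:

    import numpy as np

    base_lr, gamma, power = 0.01, 0.0001, 0.75
    momentum, weight_decay = 0.9, 0.0005

    def lr_at(it):
        # lr_policy "inv": base_lr * (1 + gamma * iter) ^ (-power)
        return base_lr * (1.0 + gamma * it) ** (-power)

    w = np.zeros(10)       # parameters
    v = np.zeros_like(w)   # momentum buffer
    for it in range(100):
        grad = np.random.randn(10)        # stand-in for the real gradient
        grad += weight_decay * w          # weight decay enters as an L2 penalty
        v = momentum * v - lr_at(it) * grad   # v(t+1) = momentum*v(t) - lr*grad
        w = w + v                             # w(t+1) = w(t) + v(t+1)

    print(lr_at(0), lr_at(10000))  # the rate decays as iterations accumulate
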
    name: "LeNet"
    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "examples/mnist/mnist_train_lmdb"
        batch_size: 64
        backend: LMDB
      }
    }

    name: the layer's name.

    type: the layer type; "Data" means the input comes from LevelDB or LMDB.

    top/bottom: outputs / inputs. The (data, label) pair is the input for classification.

    include: whether the layer belongs to the training phase, the test phase, or both.

    transform_param: transforms the data into a defined range. scale: 0.00390625 is 1/256, mapping pixel values from [0, 255] into [0, 1).

    source: where the data comes from.

    batch_size: the number of examples processed per iteration.

    backend: LevelDB or LMDB.
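
    To inspect what the data layer reads, one can open the LMDB directly. A minimal sketch, assuming pycaffe (for caffe_pb2) and the lmdb Python package are installed:

    import lmdb
    from caffe.proto import caffe_pb2

    env = lmdb.open("examples/mnist/mnist_train_lmdb", readonly=True)
    with env.begin() as txn:
        _, value = next(txn.cursor().iternext())   # first (key, value) record
        datum = caffe_pb2.Datum()
        datum.ParseFromString(value)
        # a 1 x 28 x 28 image; pixel bytes in [0, 255], scaled by 1/256 at input
        print(datum.channels, datum.height, datum.width, datum.label)
        print(max(bytearray(datum.data)) * 0.00390625)  # stays below 1.0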

    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      convolution_param {
        num_output: 20
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }

    lr_mult: learning-rate multiplier, applied to base_lr from solver.prototxt. The two lr_mult entries correspond to the layer's two parameter blobs, the weights and the biases.

    num_output: the number of convolution kernels (output channels).

    kernel_size: the size of each convolution kernel.

    stride: the convolution stride, default 1.

    pad: border padding, default 0.

    weight_filler: weight initialization. The default is "constant" (all zeros); "xavier" is often used instead, and "gaussian" is also available.

    bias_filler: bias initialization, usually "constant" (all zeros).
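
    The spatial output size of a convolution follows the usual formula, (in + 2*pad - kernel) / stride + 1; a quick check for conv1:

    def out_size(in_size, kernel_size, stride=1, pad=0):
        return (in_size + 2 * pad - kernel_size) // stride + 1

    # conv1 on a 28x28 MNIST image: kernel_size=5, stride=1, pad=0
    print(out_size(28, 5))  # 24, so conv1 outputs 20 x 24 x 24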

    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }

    pool: the pooling method.

    pad: border padding, default 0.

    kernel_size: the pooling window size.

    stride: the pooling stride, default 1. Setting it equal to kernel_size makes the windows non-overlapping.
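
    Non-overlapping MAX pooling is easy to sketch in NumPy; here a toy 4x4 input stands in for the 24x24 conv1 output:

    import numpy as np

    x = np.arange(16, dtype=np.float32).reshape(4, 4)
    # kernel_size = stride = 2: split into 2x2 windows, keep each window's max
    pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
    print(pooled.shape)  # (2, 2); likewise 24x24 pools down to 12x12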
     
    layer {
      name: "ip1"
      type: "InnerProduct"
      bottom: "pool2"
      top: "ip1"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 500
        weight_filler {
          type: "xavier"
        }
        bias_filler {
          type: "constant"
        }
      }
    }

    A fully connected layer can also be seen as a convolution layer whose kernel is the same size as its input.

    num_output: the number of outputs (equivalently, of kernels).
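
    A small NumPy check of that equivalence, with shapes chosen to match the custom network defined below:

    import numpy as np

    C, H, W, N = 40, 4, 4, 100            # input 40x4x4, num_output: 100
    x = np.random.randn(C, H, W)
    weights = np.random.randn(N, C, H, W)  # N kernels as large as the input
    fc_out = weights.reshape(N, -1) @ x.ravel()            # InnerProduct view
    conv_out = np.array([(k * x).sum() for k in weights])  # convolution view
    print(np.allclose(fc_out, conv_out))   # True: the two views agree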

    Custom Network

    An earlier post, on deep convolutional neural networks with Theano, described this network.

    Here it is trained with Caffe instead.

    Its main parameters are:

    mini_batch_size = 10
    net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28),
                      filter_shape=(20, 1, 5, 5),
                      poolsize=(2, 2), activation_fn=ReLU),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 12),
                      filter_shape=(40, 20, 5, 5),
                      poolsize=(2, 2), activation_fn=ReLU),
        FullyConnectedLayer(n_in=40*4*4, n_out=100, activation_fn=sigmoid),
        SoftmaxLayer(n_in=100, n_out=10)], mini_batch_size)
    net.SGD(training_data, 30, mini_batch_size, 0.1,
            validation_data, test_data)
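
    As a sanity check on those shapes (28x28 MNIST input, no padding), the conv/pool output-size formula traces through the network as follows:

    def out_size(in_size, kernel_size, stride=1, pad=0):
        return (in_size + 2 * pad - kernel_size) // stride + 1

    s = 28
    s = out_size(s, 5)      # conv1 (5x5):       20 x 24 x 24
    s = out_size(s, 2, 2)   # pool1 (2x2, s=2):  20 x 12 x 12
    s = out_size(s, 5)      # conv2 (5x5):       40 x 8 x 8
    s = out_size(s, 2, 2)   # pool2 (2x2, s=2):  40 x 4 x 4
    print(40 * s * s)       # 640 = 40*4*4, the n_in of the fully connected layer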

    The learning rate is held constant, as with Theano's 0.1. In these Caffe tests, however, training only converges at 0.01; rates >= 0.02 do not converge.

    There is no momentum and no weight_decay.

    If transform_param is removed, or its scale changed to 1.0, training does not converge.

    Weights are initialized with Gaussian fillers, using a different std per layer.

    There are also some changes to the network's inputs and outputs.

    net: "examples/mnist/lenet_train_test2.prototxt"
    test_iter: 100
    test_interval: 500
    base_lr: 0.01
    momentum: 0
    weight_decay: 0
    lr_policy: "inv"
    gamma: 0
    power: 0
    display: 100
    max_iter: 30000
    snapshot: 30000
    snapshot_prefix: "examples/mnist/lenet"
    solver_mode: GPU

    The modified net (note that with gamma = 0 and power = 0 the "inv" policy reduces to a constant learning rate of base_lr, matching the fixed rate used with Theano):

    name: "LeNet"
    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "examples/mnist/mnist_train_lmdb"
        batch_size: 10
        backend: LMDB
      }
    }
    layer {
      name: "mnist"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TEST
      }
      transform_param {
        scale: 0.00390625
      }
      data_param {
        source: "examples/mnist/mnist_test_lmdb"
        batch_size: 10
        backend: LMDB
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 1
      }
      convolution_param {
        num_output: 20
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "gaussian"
          std: 0.09
        }
        bias_filler {
          type: "gaussian"
          std: 1.0
        }
      }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      convolution_param {
        num_output: 40
        kernel_size: 5
        stride: 1
        weight_filler {
          type: "gaussian"
          std: 0.06
        }
        bias_filler {
          type: "gaussian"
          std: 1.0
        }
      }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "conv2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "ip1"
      type: "InnerProduct"
      bottom: "pool2"
      top: "ip1"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 100
        weight_filler {
          type: "gaussian"
          std: 0.1
        }
        bias_filler {
          type: "gaussian"
          std: 1.0
        }
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "ip1"
      top: "ip1"
    }
    layer {
      name: "ip2"
      type: "InnerProduct"
      bottom: "ip1"
      top: "ip2"
      param {
        lr_mult: 1
      }
      param {
        lr_mult: 2
      }
      inner_product_param {
        num_output: 10
        weight_filler {
          type: "gaussian"
          std: 0.33
        }
        bias_filler {
          type: "gaussian"
          std: 1.0
        }
      }
    }
    layer {
      name: "accuracy"
      type: "Accuracy"
      bottom: "ip2"
      bottom: "label"
      top: "accuracy"
      include {
        phase: TEST
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "ip2"
      bottom: "label"
      top: "loss"
    }

    The results are close to the Theano version's:

    I0105 17:29:22.523669  2836 solver.cpp:317] Iteration 30000, loss = 0.00268317
    I0105 17:29:22.523669  2836 solver.cpp:337] Iteration 30000, Testing net (#0)
    I0105 17:29:22.648680  2836 solver.cpp:404]     Test net output #0: accuracy = 0.985
    I0105 17:29:22.648680  2836 solver.cpp:404]     Test net output #1: loss = 0.0472795 (* 1 = 0.0472795 loss)
  • Original article: https://www.cnblogs.com/qw12/p/6250474.html