• SGD in Caffe and activation functions


    In Caffe, the form of the activation function directly determines training speed and how the SGD solve behaves.

    Different activation functions give SGD different behavior, so the activation layer's type is specified in the configuration file; the activation function used most often in Caffe at the moment is ReLU.

    The activation functions currently implemented in Caffe are the following:

    absval, bnll, power, relu, sigmoid and tanh, each implemented as its own layer. Their formulas are as follows:

    I won't explain each of them myself; the descriptions below are taken straight from the Caffe tutorial.

    ReLU / Rectified-Linear and Leaky-ReLU

    • LayerType: RELU
    • CPU implementation: ./src/caffe/layers/relu_layer.cpp
    • CUDA GPU implementation: ./src/caffe/layers/relu_layer.cu
    • Parameters (ReLUParameter relu_param)
      • Optional
        • negative_slope [default 0]: specifies whether to leak the negative part by multiplying it with the slope value rather than setting it to 0.
    • Sample (as seen in ./examples/imagenet/imagenet_train_val.prototxt)

      layers {
        name: "relu1"
        type: RELU
        bottom: "conv1"
        top: "conv1"
      }
      

    Given an input value x, the RELU layer computes the output as x if x > 0 and negative_slope * x if x <= 0. When the negative slope parameter is not set, it is equivalent to the standard ReLU function max(x, 0). It also supports in-place computation, meaning that the bottom and the top blob may be the same, which reduces memory consumption.
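
    As a rough sketch of this computation (plain C++ for illustration, not the actual relu_layer.cpp code; the function name and signature are made up):

      // Elementwise (leaky) ReLU: x if x > 0, otherwise negative_slope * x.
      // With negative_slope = 0 this reduces to the standard max(x, 0).
      #include <cstddef>

      void relu_forward(const float* bottom, float* top, std::size_t n,
                        float negative_slope = 0.0f) {
        for (std::size_t i = 0; i < n; ++i) {
          const float x = bottom[i];
          // In-place use is fine: top may point at the same buffer as bottom.
          top[i] = x > 0 ? x : negative_slope * x;
        }
      }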

    Sigmoid

    • LayerType: SIGMOID
    • CPU implementation: ./src/caffe/layers/sigmoid_layer.cpp
    • CUDA GPU implementation: ./src/caffe/layers/sigmoid_layer.cu
    • Sample (as seen in ./examples/mnist/mnist_autoencoder.prototxt)

      layers {
        name: "encode1neuron"
        bottom: "encode1"
        top: "encode1neuron"
        type: SIGMOID
      }
      

    The SIGMOID layer computes the output as sigmoid(x) = 1 / (1 + exp(-x)) for each input element x.

    TanH / Hyperbolic Tangent

    • LayerType: TANH
    • CPU implementation: ./src/caffe/layers/tanh_layer.cpp
    • CUDA GPU implementation: ./src/caffe/layers/tanh_layer.cu
    • Sample

      layers {
        name: "layer"
        bottom: "in"
        top: "out"
        type: TANH
      }
      

    The TANH layer computes the output as tanh(x) for each input element x.

    Absolute Value

    • LayerType: ABSVAL
    • CPU implementation: ./src/caffe/layers/absval_layer.cpp
    • CUDA GPU implementation: ./src/caffe/layers/absval_layer.cu
    • Sample

      layers {
        name: "layer"
        bottom: "in"
        top: "out"
        type: ABSVAL
      }
      

    The ABSVAL layer computes the output as abs(x) for each input element x.

    Power

    • LayerType: POWER
    • CPU implementation: ./src/caffe/layers/power_layer.cpp
    • CUDA GPU implementation: ./src/caffe/layers/power_layer.cu
    • Parameters (PowerParameter power_param)
      • Optional
        • power [default 1]
        • scale [default 1]
        • shift [default 0]
    • Sample

      layers {
        name: "layer"
        bottom: "in"
        top: "out"
        type: POWER
        power_param {
          power: 1
          scale: 1
          shift: 0
        }
      }
      

    The POWER layer computes the output as (shift + scale * x) ^ power for each input element x.
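
    As a concrete sketch of that formula (illustrative C++, not power_layer.cpp itself; the helper name and signature are hypothetical):

      // (shift + scale * x) ^ power for each element.
      // The defaults power = 1, scale = 1, shift = 0 make this the identity;
      // for example, power = 2 with the other defaults squares every input.
      #include <cmath>
      #include <cstddef>

      void power_forward(const float* bottom, float* top, std::size_t n,
                         float power = 1.0f, float scale = 1.0f, float shift = 0.0f) {
        for (std::size_t i = 0; i < n; ++i) {
          top[i] = std::pow(shift + scale * bottom[i], power);
        }
      }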

    BNLL

    • LayerType: BNLL
    • CPU implementation: ./src/caffe/layers/bnll_layer.cpp
    • CUDA GPU implementation: ./src/caffe/layers/bnll_layer.cu
    • Sample

      layers {
        name: "layer"
        bottom: "in"
        top: "out"
        type: BNLL
      }
      

    The BNLL (binomial normal log likelihood) layer computes the output as log(1 + exp(x)) for each input element x.
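
    The same formula written out in C++ (a hypothetical helper, not the actual bnll_layer.cpp; the branch on the sign of x is mathematically equivalent to log(1 + exp(x)) but avoids overflowing exp() for large inputs):

      // BNLL: log(1 + exp(x)) for each element, in a numerically safer form.
      #include <cmath>
      #include <cstddef>

      void bnll_forward(const float* bottom, float* top, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
          const float x = bottom[i];
          top[i] = x > 0 ? x + std::log1p(std::exp(-x))
                         : std::log1p(std::exp(x));
        }
      }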

     

    Please credit the original source when reposting. Thank you.
  • Original post: https://www.cnblogs.com/jianyingzhou/p/4104977.html