• Deep Learning: Training-Related Hyperparameters


    Parameter Descriptions

    Each entry lists: Default (common values) / Range / Synopsis and recommendation.

    • Number of Epochs
      Default: 20
      Range: depends on the scenario
      Synopsis: Number of times the whole dataset is passed forward and backward through the network.

    • Batch Size
      Default: 32 (common values: 32, 64, 128, 256)
      Range: depends on the scenario and hardware
      Synopsis: Number of input images (and corresponding labels) that are transferred to device memory at once and then processed simultaneously. The default values are chosen so that a network with up to 100 classes fits onto a device with 8 GB of memory. When training on a GPU, set the batch size as high as memory permits. See also the additional information below.

    • Learning Rate (λ)
      Default: 0.001 (common values: 0.01, 0.001, 0.0001)
      Range: 0 < λ < 1
      Synopsis: Scales the contribution of the gradient when the arguments of the loss function are updated; also called the step size. Values that are too large may cause the algorithm to diverge; very small values require unnecessarily many steps (compare the figure Progress of Top-1 Error for Different Values of Learning Rate). The learning rate can be configured to adapt (decrease) after a certain number of epochs. See also Finding a Value for the Learning Rate.

    • Momentum (μ)
      Default: 0.9 (common values: 0.5-0.9)
      Range: 0 ≤ μ < 1
      Synopsis: Fraction of the previous update step (vector) that is added to the current step. This parameter can help to attenuate fluctuations of the loss function.

    • Weight Prior (α)
      Default: 0
      Range: 0 ≤ α < 1
      Synopsis: Regularization parameter that penalizes large weights; used to prevent overfitting. Start with a low value (e.g., 0.00001) and increase it if overfitting occurs.
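To make the roles of λ, μ, and α concrete, here is a minimal sketch of one update step. It assumes the common textbook form of SGD with momentum and an L2 penalty of (α/2)·‖w‖²; the source does not state the exact formula its framework uses, so treat this as an illustration only. The names `sgd_momentum_step` and `step_decay`, and the decay settings `drop` and `every`, are hypothetical.

```python
def sgd_momentum_step(weights, velocity, grads, lr=0.001, momentum=0.9, weight_prior=0.0):
    """One SGD update with momentum and an (assumed) L2 weight prior.

    weights, velocity, grads are lists of floats of equal length.
    """
    new_weights, new_velocity = [], []
    for w, v, g in zip(weights, velocity, grads):
        # Gradient of loss + (weight_prior / 2) * w^2 with respect to w
        g_total = g + weight_prior * w
        # Current step = fraction of the previous step minus the scaled gradient
        v_next = momentum * v - lr * g_total
        new_velocity.append(v_next)
        new_weights.append(w + v_next)
    return new_weights, new_velocity


def step_decay(lr0, epoch, drop=0.1, every=10):
    """Hypothetical schedule: multiply the learning rate by `drop` every `every` epochs."""
    return lr0 * drop ** (epoch // every)


# Toy usage: minimize f(w) = 0.5 * sum(w_i^2), whose gradient is simply w.
w = [1.0, -2.0]
v = [0.0, 0.0]
for epoch in range(20):
    lr = step_decay(0.1, epoch)
    w, v = sgd_momentum_step(w, v, grads=list(w), lr=lr, momentum=0.9)
```

With `momentum=0.0` and `weight_prior=0.0` this reduces to plain gradient descent, which is a convenient baseline when checking whether momentum or regularization actually helps.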

    Methods

    • Manual Search
    • Grid Search
    • Random Search
    • Bayesian Optimization
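As a rough illustration of how grid search and random search differ, the sketch below tunes three of the hyperparameters listed above. The function `toy_validation_error` is a made-up stand-in for "train a model and return its validation error", and the candidate values and sampling ranges are assumptions taken from the common values mentioned earlier, not recommendations from the source.

```python
import itertools
import math
import random


def toy_validation_error(lr, batch_size, momentum):
    """Hypothetical stand-in for a real training run; lower is better."""
    # Made-up error surface with its minimum at lr=1e-3, batch_size=64, momentum=0.9
    return ((math.log10(lr) + 3) ** 2
            + (momentum - 0.9) ** 2
            + 0.01 * (math.log2(batch_size) - 6) ** 2)


def grid_search(evaluate, grid):
    """Evaluate every combination of the listed values and keep the best."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        err = evaluate(**params)
        if best is None or err < best[0]:
            best = (err, params)
    return best


def random_search(evaluate, n_trials, seed=0):
    """Sample configurations at random; the learning rate is drawn log-uniformly."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {
            "lr": 10 ** rng.uniform(-4, -2),
            "batch_size": rng.choice([32, 64, 128, 256]),
            "momentum": rng.uniform(0.5, 0.9),
        }
        err = evaluate(**params)
        if best is None or err < best[0]:
            best = (err, params)
    return best


grid = {
    "lr": [0.01, 0.001, 0.0001],
    "batch_size": [32, 64, 128, 256],
    "momentum": [0.5, 0.9],
}
grid_best = grid_search(toy_validation_error, grid)
random_best = random_search(toy_validation_error, n_trials=20)
```

Grid search here costs 3 × 4 × 2 = 24 evaluations and can only return listed values, while random search has a fixed budget (`n_trials`) and can land between grid points. Bayesian optimization is not sketched, since it requires a surrogate model and typically a dedicated library.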

    References

  • Original article: https://www.cnblogs.com/zdfffg/p/15886423.html