• Optimization Algorithms


     

    1. Stochastic Gradient Descent

    In stochastic gradient descent, the parameters are updated using the gradient of the loss on a single training example (or a small mini-batch) rather than the full dataset:

        w := w - \eta \nabla Q_i(w)

    where \eta is the learning rate and Q_i is the loss contributed by the i-th example.
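    A minimal sketch of this update on a toy one-dimensional least-squares problem; the sample (x, y), the learning rate, and the number of steps are illustrative assumptions, not part of the original text.

        def sgd_step(w, grad, eta=0.01):
            """One SGD step: w := w - eta * grad."""
            return w - eta * grad

        # Toy problem: fit w so that w * x is close to y for a single sample.
        x, y = 2.0, 4.0                  # one training sample (assumed for illustration)
        w = 0.0                          # initial parameter
        for _ in range(100):
            grad = 2 * (w * x - y) * x   # gradient of (w*x - y)^2 with respect to w
            w = sgd_step(w, grad)
        print(w)                         # converges towards y / x = 2.0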

    2. SGD With Momentum

    Stochastic gradient descent with momentum remembers the update \Delta w at each iteration, and determines the next update as a linear combination of the gradient and the previous update:

        \Delta w := \alpha \Delta w - \eta \nabla Q_i(w)
        w := w + \Delta w

    where \alpha is the momentum (decay) factor and \eta is the learning rate.

    Unlike in classical stochastic gradient descent, it tends to keep traveling in the same direction, preventing oscillations.
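    A minimal NumPy sketch of the momentum update above on a toy quadratic objective; the momentum factor alpha, the learning rate eta, and the objective ||w||^2 are assumptions made for illustration.

        import numpy as np

        def momentum_step(w, delta_w, grad, eta=0.01, alpha=0.9):
            """Momentum update: delta_w := alpha * delta_w - eta * grad; w := w + delta_w."""
            delta_w = alpha * delta_w - eta * grad
            return w + delta_w, delta_w

        w = np.array([5.0, -3.0])        # initial parameters
        delta_w = np.zeros_like(w)       # previous update, starts at zero
        for _ in range(200):
            grad = 2 * w                 # gradient of the toy objective ||w||^2
            w, delta_w = momentum_step(w, delta_w, grad)
        print(w)                         # approaches the minimum at [0, 0]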

    3. RMSProp

    RMSProp (for Root Mean Square Propagation) is also a method in which the learning rate is adapted for each of the parameters. The idea is to divide the learning rate for a weight by a running average of the magnitudes of recent gradients for that weight. So, first the running average is calculated in terms of the mean square,

        v(w, t) := \gamma \, v(w, t-1) + (1 - \gamma) (\nabla Q_i(w))^2

    where \gamma is the forgetting factor.

    And the parameters are updated as,

        w := w - \frac{\eta}{\sqrt{v(w, t)}} \nabla Q_i(w)

    RMSProp has shown excellent adaptation of the learning rate in different applications. RMSProp can be seen as a generalization of Rprop and is capable of working with mini-batches as opposed to only full batches.
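    A minimal NumPy sketch of the RMSProp update above on the same toy quadratic objective; the forgetting factor gamma, the learning rate eta, the small constant eps, and the objective are illustrative assumptions.

        import numpy as np

        def rmsprop_step(w, v, grad, eta=0.01, gamma=0.9, eps=1e-8):
            """RMSProp: v := gamma * v + (1 - gamma) * grad^2; w := w - eta * grad / sqrt(v)."""
            v = gamma * v + (1 - gamma) * grad ** 2
            w = w - eta * grad / (np.sqrt(v) + eps)   # eps avoids division by zero
            return w, v

        w = np.array([5.0, -3.0])        # initial parameters
        v = np.zeros_like(w)             # running average of squared gradients
        for _ in range(2000):
            grad = 2 * w                 # gradient of the toy objective ||w||^2
            w, v = rmsprop_step(w, v, grad)
        print(w)                         # close to the minimum at [0, 0]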

    4. The Adam Algorithm

    Adam (short for Adaptive Moment Estimation) is an update to the RMSProp optimizer. In this optimization algorithm, running averages of both the gradients and the second moments of the gradients are used. Given parameters w^{(t)} and a loss function L^{(t)}, where t indexes the current training iteration (indexed at 1), Adam's parameter update is given by:

        m_w^{(t)} := \beta_1 m_w^{(t-1)} + (1 - \beta_1) \nabla_w L^{(t)}
        v_w^{(t)} := \beta_2 v_w^{(t-1)} + (1 - \beta_2) (\nabla_w L^{(t)})^2

        \hat{m}_w = \frac{m_w^{(t)}}{1 - \beta_1^t}
        \hat{v}_w = \frac{v_w^{(t)}}{1 - \beta_2^t}

        w^{(t+1)} := w^{(t)} - \eta \frac{\hat{m}_w}{\sqrt{\hat{v}_w} + \epsilon}

    where \epsilon is a small number used to prevent division by 0, and \beta_1 and \beta_2 are the forgetting factors for gradients and second moments of gradients, respectively.
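    A minimal NumPy sketch of the Adam update above, again on a toy quadratic objective; beta1, beta2, and eps follow commonly used defaults, while the step size eta and the objective are assumptions chosen so the example converges quickly.

        import numpy as np

        def adam_step(w, m, v, grad, t, eta=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
            """Adam: update the moment estimates, correct their bias, then step the parameters."""
            m = beta1 * m + (1 - beta1) * grad            # running average of gradients
            v = beta2 * v + (1 - beta2) * grad ** 2       # running average of squared gradients
            m_hat = m / (1 - beta1 ** t)                  # bias-corrected first moment
            v_hat = v / (1 - beta2 ** t)                  # bias-corrected second moment
            w = w - eta * m_hat / (np.sqrt(v_hat) + eps)
            return w, m, v

        w = np.array([5.0, -3.0])            # initial parameters
        m = np.zeros_like(w)
        v = np.zeros_like(w)
        for t in range(1, 2001):             # t starts at 1, matching the text above
            grad = 2 * w                     # gradient of the toy objective ||w||^2
            w, m, v = adam_step(w, m, v, grad, t)
        print(w)                             # close to the minimum at [0, 0]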

    Reference: Wikipedia

  • Original post: https://www.cnblogs.com/niuxichuan/p/8098562.html