• Supervised Learning


    • Logistic Regression
      • We could approach the classification problem ignoring the fact that y is discrete-valued, and use our old linear regression algorithm to try to predict y given x.
      • Intuitively, however, it also doesn't make sense for hθ(x) to take values larger than 1 or smaller than 0.
      • To fix this, let's change the form of our hypotheses hθ(x). We will use the logistic function, also called the sigmoid function:

           hθ(x) = g(θᵀx) = 1 / (1 + e^(−θᵀx)),   where   g(z) = 1 / (1 + e^(−z))

      • A useful property of the sigmoid function is the form of its derivative, written g': g'(z) = g(z)(1 − g(z)).
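As a quick sanity check of this derivative property, here is a minimal sketch in plain Python (function names are my own) that compares the closed form against a numerical central-difference derivative:

```python
import math

def sigmoid(z):
    # Logistic (sigmoid) function: g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_deriv(z):
    # Closed form of the derivative: g'(z) = g(z) * (1 - g(z))
    g = sigmoid(z)
    return g * (1.0 - g)

# Compare against a numerical (central-difference) derivative
eps = 1e-6
for z in (-2.0, 0.0, 1.5):
    numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
    print(z, sigmoid_deriv(z), numeric)
```

At z = 0 this gives g(0) = 0.5 and g'(0) = 0.25, and the two derivative estimates agree everywhere up to floating-point error.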
      • To fit θ for the logistic regression model, let's endow our classification model with a set of probabilistic assumptions, and then fit the parameters via maximum likelihood.
        • Assume that

           P(y = 1 | x; θ) = hθ(x)
           P(y = 0 | x; θ) = 1 − hθ(x)

          which can be written more compactly as

           p(y | x; θ) = (hθ(x))^y (1 − hθ(x))^(1−y)

        • Assuming that the n training examples were generated independently, we can then write down the likelihood of the parameters as

           L(θ) = ∏ᵢ p(y^(i) | x^(i); θ) = ∏ᵢ (hθ(x^(i)))^(y^(i)) (1 − hθ(x^(i)))^(1 − y^(i))

          and the log likelihood as

           ℓ(θ) = log L(θ) = Σᵢ [ y^(i) log hθ(x^(i)) + (1 − y^(i)) log(1 − hθ(x^(i))) ]
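A minimal sketch of evaluating this log likelihood on a toy dataset (the data and function names are illustrative, plain Python only; the first feature is the intercept term x₀ = 1):

```python
import math

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(theta, X, y):
    # l(theta) = sum_i [ y_i * log h(x_i) + (1 - y_i) * log(1 - h(x_i)) ]
    total = 0.0
    for x_i, y_i in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, x_i)))
        total += y_i * math.log(h) + (1 - y_i) * math.log(1 - h)
    return total

# Toy data: x_0 = 1.0 is the intercept feature
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
y = [1, 0, 1]

# With theta = 0, h(x) = 0.5 for every example,
# so the log likelihood is 3 * log(1/2)
print(log_likelihood([0.0, 0.0], X, y))
```

Maximizing this quantity over θ is exactly the fitting step described next.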

        • To maximize ℓ(θ), similar to our derivation in the case of linear regression, we can use gradient ascent. Written in vectorial notation, our updates are therefore given by θ := θ + α∇θℓ(θ). Taking derivatives of ℓ(θ) for a single training example (x, y) gives ∂ℓ(θ)/∂θj = (y − hθ(x)) xj.
        • This therefore gives us the stochastic gradient ascent rule:

           θj := θj + α (y^(i) − hθ(x^(i))) xj^(i)
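The update rule above can be sketched in plain Python on a small, linearly separable toy dataset (data, learning rate, and epoch count are my own illustrative choices):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def h(theta, x):
    # Hypothesis h_theta(x) = g(theta^T x)
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def sga_step(theta, x, y, alpha):
    # One stochastic gradient ascent update on a single example (x, y):
    #   theta_j := theta_j + alpha * (y - h_theta(x)) * x_j
    err = y - h(theta, x)
    return [t + alpha * err * xj for t, xj in zip(theta, x)]

# Toy data: x_0 = 1.0 is the intercept feature
X = [[1.0, 2.0], [1.0, -1.5], [1.0, 3.0], [1.0, -0.5]]
y = [1, 0, 1, 0]

theta = [0.0, 0.0]
for _ in range(200):              # repeated passes over the data
    for x_i, y_i in zip(X, y):
        theta = sga_step(theta, x_i, y_i, alpha=0.1)

print([round(h(theta, x_i), 2) for x_i in X])
```

After training, hθ(x) is above 0.5 for the positive examples and below 0.5 for the negative ones, matching the labels.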

  • Original article: https://www.cnblogs.com/yuelien/p/12917084.html