Stanford Machine Learning Course: Exercise 4

Exercise 4: Logistic Regression and Newton’s Method

Recap: linear regression
$$h_\theta(x) = \theta^T x$$

Logistic Regression
$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}} = P(y = 1 \mid x; \theta)$$
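
As a minimal sketch of the hypothesis above, the following uses only the Python standard library. The names `sigmoid` and `hypothesis` and the parameter values are illustrative assumptions, not part of the original exercise code.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x must already contain the intercept term x_0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

# Hypothetical parameters and one input, chosen so that theta^T x = 0.
theta = [-1.0, 0.5, 0.5]
x = [1.0, 1.0, 1.0]
p = hypothesis(theta, x)  # g(0) = 0.5
```

The output is read as the estimated probability that y = 1 for this input. For very negative z, `math.exp(-z)` can overflow, so a production version would need a numerically safer formulation.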

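Choosing cost(hθ(x), y)
For y = 1: $\mathrm{cost}(h_\theta(x), y) = -\log h_\theta(x)$
The log-likelihood loss is chosen as the cost function for logistic regression because it is convex (bowl-shaped), and convex functions have a convenient property: every local minimum is also a global minimum. So once any minimum of such a function is found, it is guaranteed to be the global minimum.
When $h_\theta(x) = 1$ the cost is 0; as $h_\theta(x) \to 0$ the cost tends to $+\infty$.
Similarly, for y = 0: $\mathrm{cost}(h_\theta(x), y) = -\log(1 - h_\theta(x))$
When $h_\theta(x) = 0$ the cost is 0; as $h_\theta(x) \to 1$ the cost tends to $+\infty$.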
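
To make the limiting behaviour concrete, here is a small sketch that evaluates the per-example cost for a few predictions. The function name `cost` and the sample values of h are assumptions made for illustration.

```python
import math

def cost(h, y):
    """Per-example log loss for a prediction h in (0, 1) and a label y in {0, 1}."""
    return -math.log(h) if y == 1 else -math.log(1.0 - h)

# A confident correct prediction is cheap, a confident wrong one is expensive.
for h in (0.999, 0.5, 0.001):
    print(f"h={h}: cost if y=1 is {cost(h, 1):.4f}, cost if y=0 is {cost(h, 0):.4f}")
```

Note that h = 0 or h = 1 exactly would make `math.log` fail, which matches the cost going to infinity in the text.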

In summary

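$$\mathrm{cost}(h_\theta(x), y) = -y \log h_\theta(x) - (1 - y) \log(1 - h_\theta(x)), \quad y \in \{0, 1\}$$

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(h_\theta(x^{(i)}), y^{(i)})$$

$$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$$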
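
The cost function J(θ) can be sketched directly from the formula above. The toy data set and the names `cost_function`, `X`, and `y` are made up for this example; they are not the exercise's data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost_function(theta, X, y):
    """J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h)) over all m examples."""
    m = len(y)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return -total / m

# Made-up toy data: each row is [1, x_1], with the intercept term first.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]]
y = [0, 0, 1, 1]

J_zero = cost_function([0.0, 0.0], X, y)    # every h is 0.5, so J = log(2)
J_better = cost_function([-4.0, 2.0], X, y) # parameters that fit the labels better
```

With θ = 0 every prediction is 0.5, so the cost equals log 2 regardless of the labels; a parameter vector that separates the classes better gives a smaller value.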

Newton's method

To minimize f, Newton's method looks for a zero of the derivative f' by iterating

$$x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}$$
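
A one-dimensional sketch of this iteration follows. The test function f(x) = x - ln x, whose minimum is at x = 1, is an assumption picked only because its derivatives are simple; `newton_minimize` is a hypothetical helper name.

```python
def newton_minimize(df, d2f, x0, iterations=10):
    """Iterate x <- x - f'(x) / f''(x) to approach a stationary point of f."""
    x = x0
    for _ in range(iterations):
        x = x - df(x) / d2f(x)
    return x

# Illustrative function f(x) = x - ln(x) for x > 0:
#   f'(x)  = 1 - 1/x
#   f''(x) = 1/x^2
x_min = newton_minimize(lambda x: 1.0 - 1.0 / x,
                        lambda x: 1.0 / x ** 2,
                        x0=0.5)
```

Starting from 0.5 the iterates are 0.75, 0.9375, and so on, with the distance to the minimum squaring at each step. The logistic regression case uses the same update, with the gradient in place of f' and the Hessian in place of f''.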

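Decision boundary

The boundary is where the predicted probability is exactly 0.5:

$$h_\theta(x) = g(\theta^T x) = 0.5$$

Since $g(z) = 0.5$ only at $z = 0$, this gives

$$\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0$$

$$x_2 = -\frac{1}{\theta_2} (\theta_0 + \theta_1 x_1)$$

so the line to plot is

$$\mathrm{plot\_y} = -\frac{1}{\theta_2} (\theta_0 + \theta_1 X)$$

Probability of not being admitted

$$\mathrm{prob} = 1 - g(\theta^T x)$$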
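
The boundary line and the complementary probability can be sketched together. The parameter vector below is hypothetical (it is not the fitted result of the exercise), and the helper names are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def boundary_x2(theta, x1):
    """Solve theta0 + theta1*x1 + theta2*x2 = 0 for x2 (needs theta2 != 0)."""
    theta0, theta1, theta2 = theta
    return -(theta0 + theta1 * x1) / theta2

def prob_not_admitted(theta, x1, x2):
    """P(y = 0 | x; theta) = 1 - g(theta^T x); the intercept term is added here."""
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1.0 - sigmoid(z)

# Hypothetical parameters, chosen only so the numbers are easy to follow.
theta = [-6.0, 1.0, 2.0]

plot_x = [0.0, 6.0]  # two points are enough to draw a straight line
plot_y = [boundary_x2(theta, x1) for x1 in plot_x]

on_boundary = prob_not_admitted(theta, plot_x[0], plot_y[0])  # exactly 0.5 on the line
below = prob_not_admitted(theta, 1.0, 1.0)                    # z = -3, so above 0.5
```

Any point on the computed line has probability 0.5 for either class, which is a quick way to check that the boundary formula and the probability formula agree.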

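% Assumes x (m-by-(n+1), including the intercept column), y, m, theta and MAX_ITR are already set up.
g = @(z) 1 ./ (1 + exp(-z));                           % sigmoid
for i=1:MAX_ITR
    z = x * theta;
    h = g(z);                                          % predictions for all m examples
    deltaJ = (1/m) .* x' * (h - y);                    % gradient of J
    Hessian = (1/m) .* x' * diag(h) * diag(1-h) * x;   % Hessian of J
    J(i) = (1/m) * sum(-y.*log(h) - (1-y).*log(1-h));  % cost before this update
    theta = theta - Hessian \ deltaJ;                  % Newton step (left division)
end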
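
For readers without Octave, the same loop can be mirrored in dependency-free Python. This is a sketch under stated assumptions: the toy data is made up (and deliberately not linearly separable, so the optimum is finite), `solve` is a small hand-written Gaussian elimination standing in for Octave's left division, and 7 iterations is an arbitrary choice.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def newton_logistic(X, y, iterations=7):
    """Fit logistic regression with Newton's method, starting from theta = 0."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    history = []
    for _ in range(iterations):
        h = [sigmoid(sum(t * v for t, v in zip(theta, row))) for row in X]
        # Cost before the update, as in the Octave loop.
        history.append(-sum(yi * math.log(hi) + (1 - yi) * math.log(1 - hi)
                            for yi, hi in zip(y, h)) / m)
        # Gradient: (1/m) * X^T (h - y)
        grad = [sum((hi - yi) * row[j] for hi, yi, row in zip(h, y, X)) / m
                for j in range(n)]
        # Hessian: (1/m) * sum_i h_i (1 - h_i) x_i x_i^T
        H = [[sum(hi * (1 - hi) * row[j] * row[k] for hi, row in zip(h, X)) / m
              for k in range(n)] for j in range(n)]
        step = solve(H, grad)
        theta = [t - s for t, s in zip(theta, step)]
    return theta, history

# Made-up, non-separable toy data: rows are [1, x_1].
X = [[1.0, v] for v in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)]
y = [0, 0, 1, 0, 1, 0, 1, 1]
theta, history = newton_logistic(X, y)
```

On well-behaved data like this, the recorded cost typically stops changing after a handful of iterations, which is the main practical attraction of Newton's method over gradient descent for small problems.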

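The key step is computing the Hessian matrix. Ng's course gives

$$H = \frac{1}{m} \sum_{i=1}^{m} \left[ \underbrace{h(x^{(i)})}_{\in \mathbb{R}} \, \underbrace{(1 - h(x^{(i)}))}_{\in \mathbb{R}} \, x^{(i)} (x^{(i)})^T \right]$$

The trailing factor $x^{(i)} (x^{(i)})^T$ is a product of the shapes

$$\mathbb{R}^{(n+1) \times 1} \cdot \mathbb{R}^{1 \times (n+1)}$$

so each term of the sum is an $(n+1) \times (n+1)$ matrix. In the vectorized code, h holds $h(x^{(i)})$ for all m examples and is therefore a vector, so for the matrix product the vector is written as a diagonal matrix:

$$\mathrm{diag}(h) \cdot \mathrm{diag}(1 - h)$$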
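
For completeness, here is a short derivation of where the factor h(1 - h) comes from and why the sum equals the matrix product used in the code. It is a sketch that assumes X denotes the m-by-(n+1) design matrix whose rows are the $(x^{(i)})^T$, and h the vector of all m predictions.

```latex
% Derivative of the sigmoid
g'(z) = g(z)\,\bigl(1 - g(z)\bigr)

% Gradient of J (first derivative with respect to \theta_j)
\frac{\partial J}{\partial \theta_j}
  = \frac{1}{m} \sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)}

% Differentiating once more with respect to \theta_k, using g' = g(1 - g)
\frac{\partial^2 J}{\partial \theta_j \, \partial \theta_k}
  = \frac{1}{m} \sum_{i=1}^{m} h_\theta(x^{(i)})\,\bigl(1 - h_\theta(x^{(i)})\bigr)\, x_j^{(i)} x_k^{(i)}

% Collecting all (j, k) entries into one matrix
H = \frac{1}{m} \sum_{i=1}^{m} h_\theta(x^{(i)})\,\bigl(1 - h_\theta(x^{(i)})\bigr)\, x^{(i)} (x^{(i)})^T
  = \frac{1}{m}\, X^T \,\mathrm{diag}(h)\,\mathrm{diag}(1 - h)\, X
```

The product of the two diagonal matrices is itself diagonal with entries $h_i (1 - h_i)$, so it weights each example's outer product exactly as the sum does.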

Original article: https://www.cnblogs.com/slankka/p/9158536.html