When the number of features is large, logistic regression becomes hard to use effectively for classification; this is where neural networks come in.
The origins of neural networks
Origins: algorithms that try to mimic the brain.
Very widely used in the 80s and early 90s; popularity diminished in the late 90s.
Recent resurgence: a state-of-the-art technique for many applications (one reason for the resurgence is the large increase in available computing power).
A simple neuron model: the logistic unit
The green neuron in the middle of the figure can be thought of as performing the following operations:
1. Collect the input values into a vector x
\[x = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}\]
2. Multiply x by θ, pass the result through the sigmoid function, and output it
\[\theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \theta_3 \end{bmatrix}\]
\[h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}\]
Note: x0 is not drawn in the figure; its value is always 1, and it is usually called the "bias unit".
The sigmoid function from logistic regression is called the "activation function" here:
\[g(z) = \frac{1}{1 + e^{-z}}\]
The θ from logistic regression is called the "weights" here.
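As a sanity check, the computation of this single logistic unit can be written in a few lines of NumPy. This is only an illustrative sketch; the input and weight values below are made-up examples, not from the course.

```python
import numpy as np

def sigmoid(z):
    """Activation function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical example values; x already includes the bias unit x0 = 1.
x = np.array([1.0, 2.0, 0.5, -1.0])       # [x0, x1, x2, x3]
theta = np.array([0.1, 0.3, -0.2, 0.4])   # weights [theta0, theta1, theta2, theta3]

# The logistic unit outputs h_theta(x) = g(theta^T x).
print(sigmoid(theta @ x))
```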
A standard neural network
- 1. This is a three-layer neural network: an input layer, a hidden layer, and an output layer.
- 2. By convention, the "bias units" x0 and a0 are added.
Notation:
\[a_i^{(j)}\]
"activation" of unit i in layer j
\[\Theta^{(j)}\]
matrix of weights controlling the function mapping from layer j to layer j+1
For the neural network shown in the figure above, we have
\[\begin{array}{l}
a_1^{(2)} = g\left( \Theta_{10}^{(1)} x_0 + \Theta_{11}^{(1)} x_1 + \Theta_{12}^{(1)} x_2 + \Theta_{13}^{(1)} x_3 \right)\\
a_2^{(2)} = g\left( \Theta_{20}^{(1)} x_0 + \Theta_{21}^{(1)} x_1 + \Theta_{22}^{(1)} x_2 + \Theta_{23}^{(1)} x_3 \right)\\
a_3^{(2)} = g\left( \Theta_{30}^{(1)} x_0 + \Theta_{31}^{(1)} x_1 + \Theta_{32}^{(1)} x_2 + \Theta_{33}^{(1)} x_3 \right)\\
h_\Theta(x) = a_1^{(3)} = g\left( \Theta_{10}^{(2)} a_0^{(2)} + \Theta_{11}^{(2)} a_1^{(2)} + \Theta_{12}^{(2)} a_2^{(2)} + \Theta_{13}^{(2)} a_3^{(2)} \right)
\end{array}\]
If the network has s_j units in layer j and s_{j+1} units in layer j+1, then Θ^(j) will have dimension s_{j+1} × (s_j + 1) (the extra column accounts for the bias unit).
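To make the indexing and the s_{j+1} × (s_j + 1) dimensions concrete, here is a small NumPy sketch of the layer-2 computation above. The weight values are random placeholders, purely for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes for the network above: s1 = 3 inputs, s2 = 3 hidden units, s3 = 1 output.
# Theta1 is therefore s2 x (s1 + 1) = 3 x 4 and Theta2 is s3 x (s2 + 1) = 1 x 4.
rng = np.random.default_rng(0)
Theta1 = rng.normal(size=(3, 4))   # maps layer 1 -> layer 2
Theta2 = rng.normal(size=(1, 4))   # maps layer 2 -> layer 3

x = np.array([1.0, 2.0, 0.5, -1.0])   # [x0 = 1, x1, x2, x3]

# Unit-by-unit computation of a_i^(2), matching the equations above.
a2 = np.array([sigmoid(Theta1[i] @ x) for i in range(3)])

# Add the bias unit a0^(2) = 1 and compute the single output unit.
a2 = np.concatenate(([1.0], a2))
h = sigmoid(Theta2[0] @ a2)
print(a2.shape, h)   # (4,) and a number between 0 and 1
```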
Forward propagation: vectorized implementation
Define
\[z_1^{(2)} = \Theta_{10}^{(1)} x_0 + \Theta_{11}^{(1)} x_1 + \Theta_{12}^{(1)} x_2 + \Theta_{13}^{(1)} x_3\]
Then
\[a_1^{(2)} = g\left( z_1^{(2)} \right)\]
Similarly,
\[\begin{array}{l}
a_2^{(2)} = g\left( z_2^{(2)} \right)\\
a_3^{(2)} = g\left( z_3^{(2)} \right)\\
a_1^{(3)} = g\left( z_1^{(3)} \right)
\end{array}\]
Define
\[x = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix} = a^{(1)}, \qquad z^{(2)} = \begin{bmatrix} z_1^{(2)} \\ z_2^{(2)} \\ z_3^{(2)} \end{bmatrix}\]
Then
\[\begin{array}{l}
z^{(2)} = \Theta^{(1)} a^{(1)}\\
a^{(2)} = g\left( z^{(2)} \right)\\
\text{(add the bias unit } a_0^{(2)} = 1\text{)}\\
z^{(3)} = \Theta^{(2)} a^{(2)}\\
h_\Theta(x) = a^{(3)} = g\left( z^{(3)} \right)
\end{array}\]
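Written exactly as the vectorized equations above, the same computation might look like this in NumPy (again a sketch with placeholder weights):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
Theta1 = rng.normal(size=(3, 4))   # Theta^(1): layer 1 -> layer 2
Theta2 = rng.normal(size=(1, 4))   # Theta^(2): layer 2 -> layer 3

a1 = np.array([1.0, 2.0, 0.5, -1.0])   # a^(1) = x, with bias unit x0 = 1

z2 = Theta1 @ a1                        # z^(2) = Theta^(1) a^(1)
a2 = sigmoid(z2)                        # a^(2) = g(z^(2))
a2 = np.concatenate(([1.0], a2))        # add the bias unit a0^(2) = 1
z3 = Theta2 @ a2                        # z^(3) = Theta^(2) a^(2)
h = sigmoid(z3)                         # h_theta(x) = a^(3) = g(z^(3))
print(h)
```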
A closer look
If you cover the left half of the network, what remains is exactly logistic regression, except that the original inputs x are replaced by the activations a^(2),
and a^(2) is in turn determined by the previous layer's values a^(1) = x.
Other neural network architectures
A neural network can have many layers: the first layer is the input layer, the last layer is the output layer, and all layers in between are hidden layers.
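The vectorized forward pass generalizes to any number of layers with a simple loop over the weight matrices. A minimal sketch, assuming one weight matrix per layer transition and random placeholder values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, thetas):
    """Forward propagation through a network of arbitrary depth.

    x      : input feature vector (without the bias unit)
    thetas : list of weight matrices Theta^(1), Theta^(2), ...;
             Theta^(j) has shape s_{j+1} x (s_j + 1)
    """
    a = x
    for Theta in thetas:
        a = np.concatenate(([1.0], a))   # prepend the bias unit
        a = sigmoid(Theta @ a)           # a^(j+1) = g(Theta^(j) a^(j))
    return a                             # activations of the output layer

# Hypothetical 3-5-5-1 network with random placeholder weights.
rng = np.random.default_rng(0)
thetas = [rng.normal(size=(5, 4)),
          rng.normal(size=(5, 6)),
          rng.normal(size=(1, 6))]
print(forward(np.array([2.0, 0.5, -1.0]), thetas))
```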