• Machine Learning--week4 Basic Concepts of Neural Networks


    What we learned so far cannot solve complex non-linear problems.

    Neural Networks

    Sigmoid (logistic) activation function: "activation function" is another term for \(g(z) = \frac{1}{1+e^{-z}}\)
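
    As a quick illustration, a minimal Python sketch of this function (the name `sigmoid` and the use of numpy are my own choices, not from the notes):

    ```python
    import numpy as np

    def sigmoid(z):
        """Sigmoid (logistic) activation: g(z) = 1 / (1 + e^(-z))."""
        return 1.0 / (1.0 + np.exp(-z))

    print(sigmoid(0.0))   # 0.5
    print(sigmoid(4.6))   # ~0.99: large positive z saturates toward 1
    print(sigmoid(-4.6))  # ~0.01: large negative z saturates toward 0
    ```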

    activation: the value that is computed by, and output from, a specific neuron (unit)

    weights = parameters = \(\theta\)

    input units: \(x_1, x_2, x_3, \dots, x_n\)

    bias unit / bias neuron: \(x_0\), \(a_0^{(j)}\)

    The layers between the input units and the hypothesis consist of activations.

    input wire / output wire: an input wire is an arrow pointing into the target neuron; an output wire is an arrow pointing out of the target neuron

    \(a_i^{(j)}\): "activation" of neuron \(i\), or of unit \(i\), in layer \(j\)

    \(\Theta^{(j)}\): matrix of weights controlling the function mapping from layer \(j\) to layer \(j+1\)

    (Note that \(\Theta\) is uppercase because it now has to take the form of a matrix.)

    layer 1 == input layer

    layer n == output layer (the last layer)

    layer 2 ~ layer n-1 == hidden layers

    for example:

    \[\begin{align}
    \text{layer 2 (the hidden layer)}&\begin{cases}a_1^{(2)} &= g(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3)\\
    a_2^{(2)} &= g(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3)\\
    a_3^{(2)} &= g(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3)\end{cases}\\
    \text{output layer}&\begin{cases}h_\Theta(x) = a_1^{(3)} = g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)})\end{cases}
    \end{align}\]

    More intuitively:

    \[\begin{align}
    \text{layer 2 (the hidden layer)} &\begin{cases} a_1^{(2)} &= g(\Theta_{1}^{(1)}a^{(1)})\\
    a_2^{(2)} &= g(\Theta_{2}^{(1)}a^{(1)})\\
    a_3^{(2)} &= g(\Theta_{3}^{(1)}a^{(1)}) \end{cases}\\
    \text{output layer} &\begin{cases} h_\Theta(x) = a_1^{(3)} = g(\Theta_{1}^{(2)}a^{(2)}) \end{cases}
    \end{align}\]

    (where \(\Theta_i^{(j)}\) denotes the \(i\)-th row of \(\Theta^{(j)}\))

    Generally, \(\Theta^{(j)}\) will be of dimension \(s_{j+1} \times (s_j+1)\) if the network has \(s_j\) units in layer \(j\) and \(s_{j+1}\) units in layer \(j+1\). (The \(+1\) in \(s_j+1\) comes from the addition in \(\Theta^{(j)}\) of the "bias nodes," \(x_0\) and \(\Theta_0^{(j)}\). In other words, the output nodes will not include the bias nodes while the inputs will.)
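
    For example, in the network above, layer 1 has \(s_1 = 3\) units and layer 2 has \(s_2 = 3\) units, so \(\Theta^{(1)}\) has dimension \(3 \times (3+1) = 3 \times 4\), and \(\Theta^{(2)}\) has dimension \(1 \times 4\).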

    Define \(a^{(1)} = x\)

    \(z^{(j+1)} = \Theta^{(j)}a^{(j)}\)

    \(z_k^{(j+1)} = \Theta_{k,0}^{(j)}a_0^{(j)} + \Theta_{k,1}^{(j)}a_1^{(j)} + \dots + \Theta_{k,n^{(j)}}^{(j)}a_{n^{(j)}}^{(j)}\quad (n^{(j)} \text{ means layer } j \text{ has } n^{(j)} \text{ activations})\)

    \(a^{(j)} = g(z^{(j)}) = g(\Theta^{(j-1)}a^{(j-1)})\quad(j\ge2)\)

    Suppose there are \(n\) layers; then the last matrix \(\Theta^{(n-1)}\) will have only one row, which is multiplied by the single column \(a^{(n-1)}\), so that our result is a single number:

    \(h_\Theta(x) = a^{(n)} = g(z^{(n)})\)

    Add \(a_0^{(j)} = 1\) (the bias unit) to each layer before propagating to the next.

    Forward Propagation: the process above, computing the activations layer by layer from the input forward.
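
    A minimal numpy sketch of this forward-propagation recipe, sized for the 3-3-1 network in the example above (the random weight values are made-up placeholders; only the shapes and the rule \(z^{(j+1)} = \Theta^{(j)}a^{(j)}\), \(a^{(j+1)} = g(z^{(j+1)})\) come from the notes):

    ```python
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(x, thetas):
        """Forward propagation: a^(1) = x; at each layer, prepend the
        bias unit a_0 = 1, then compute a^(j+1) = g(Theta^(j) @ a^(j))."""
        a = x
        for theta in thetas:
            a = np.concatenate(([1.0], a))  # add bias unit a_0^(j) = 1
            a = sigmoid(theta @ a)          # g(z^(j+1)), z^(j+1) = Theta^(j) a^(j)
        return a                            # activations of the output layer

    # Shapes follow s_{j+1} x (s_j + 1): Theta^(1) is 3x4, Theta^(2) is 1x4.
    rng = np.random.default_rng(0)
    theta1 = rng.standard_normal((3, 4))  # layer 1 (3 inputs) -> layer 2 (3 units)
    theta2 = rng.standard_normal((1, 4))  # layer 2 (3 units)  -> layer 3 (1 unit)

    x = np.array([1.0, 0.0, 1.0])         # x_1, x_2, x_3 (bias x_0 added inside)
    print(forward(x, [theta1, theta2]))   # h_Theta(x): a single number in (0, 1)
    ```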

    A neural network effectively uses the layer \(a^{(n-1)}\), rather than the input layer, as the features for training a logistic regression; choosing different parameters in \(\Theta^{(1)}\) can produce complex features, and hence a better hypothesis, than using \(x_1, x_2, \dots, x_n\) directly as the training features.

    architecture: the way that the neurons in a neural network are connected

    The \(\Theta\) values that implement logical expressions:

    • \({\rm AND} = (x_1 \bigwedge x_2)\):
      • \(\Theta = \begin{bmatrix}-30 & 20 & 20\end{bmatrix}\)
    • \({\rm NOR} = (\lnot x_1 \bigwedge \lnot x_2)\):
      • \(\Theta = \begin{bmatrix}10 & -20 & -20\end{bmatrix}\)
    • \({\rm OR} = (x_1 \bigvee x_2)\):
      • \(\Theta = \begin{bmatrix}-10 & 20 & 20\end{bmatrix}\)
    • \({\rm NOT} = (\lnot x)\):
      • \(\Theta = \begin{bmatrix}10 & -20\end{bmatrix}\)
    • \({\rm XNOR} = (\lnot x_1 \bigwedge \lnot x_2) \bigvee (x_1 \bigwedge x_2)\)
      • requires a hidden layer: \(a_1^{(2)} == (\lnot x_1 \bigwedge \lnot x_2),\quad a_2^{(2)} == (x_1 \bigwedge x_2)\)
      • output layer: \(a^{(3)} == (a_1^{(2)} \bigvee a_2^{(2)})\)

    Implementing the logical expressions:

    Let \(x = \begin{bmatrix}1 \\ x_1 \\ x_2\end{bmatrix}\); then \(a_i = g(\Theta_i x)\) gives the result of applying the logical operator corresponding to \(\Theta_i\) to \(x_1, x_2\).

    For example, if \(\Theta_i = \begin{bmatrix}-10 & 20 & 20\end{bmatrix}\), then \(a_i == x_1 \bigvee x_2\).

    Complex logical expressions such as \({\rm XNOR}\) can only be computed with the help of a hidden layer.
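
    As a check, a small numpy sketch that verifies the XNOR construction, using the weight vectors listed above (the helper `gate` and the variable names are mine):

    ```python
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gate(theta, x1, x2):
        """Compute g(Theta x) with x = [1, x1, x2], rounded to 0 or 1."""
        return round(float(sigmoid(theta @ np.array([1.0, x1, x2]))))

    AND = np.array([-30.0,  20.0,  20.0])
    NOR = np.array([ 10.0, -20.0, -20.0])
    OR  = np.array([-10.0,  20.0,  20.0])

    for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        a1 = gate(NOR, x1, x2)  # a_1^(2) = (NOT x1) AND (NOT x2)
        a2 = gate(AND, x1, x2)  # a_2^(2) = x1 AND x2
        out = gate(OR, a1, a2)  # a^(3)   = a_1^(2) OR a_2^(2)
        print(f"XNOR({x1}, {x2}) = {out}")  # 1 exactly when x1 == x2
    ```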

    For multiclass classification:

    Use \(y = \begin{bmatrix}1\\0\\0\\0\end{bmatrix}, \begin{bmatrix}0\\1\\0\\0\end{bmatrix}, \begin{bmatrix}0\\0\\1\\0\end{bmatrix}, \begin{bmatrix}0\\0\\0\\1\end{bmatrix}\) to represent the different classes (one "one-hot" vector per class).
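
    For example, a minimal numpy sketch of this encoding (assuming four classes; the variable names are mine):

    ```python
    import numpy as np

    num_classes = 4
    y = np.eye(num_classes)[2]  # one-hot vector for the third class
    print(y)                    # [0. 0. 1. 0.]

    # The output layer has one unit per class; the predicted class is
    # the unit with the largest activation.
    h = np.array([0.05, 0.10, 0.85, 0.02])
    print(np.argmax(h))         # 2
    ```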
