• Loss Calculation


    Outline

    • MSE

    • Cross Entropy Loss

    • Hinge Loss

    MSE

    • $loss = \frac{1}{N}\sum (y - out)^2$

    • $L_2\text{-}norm = \|y - out\|_2 = \sqrt{\sum (y - out)^2}$

    import tensorflow as tf
    
    y = tf.constant([1, 2, 3, 0, 2])
    y = tf.one_hot(y, depth=4)  # labels take values 0..3, so depth=4 classes
    y = tf.cast(y, dtype=tf.float32)
    

    out = tf.random.normal([5, 4])
    out

    <tf.Tensor: id=117, shape=(5, 4), dtype=float32, numpy=
    array([[ 0.8138832 , -1.1521571 ,  0.05197939,  2.3684442 ],
           [ 0.28827545, -0.35568208, -0.3952962 , -1.2576817 ],
           [-0.4354525 , -1.9914867 ,  0.37045303, -0.38287213],
           [-0.7680094 , -0.98293644,  0.62572837, -0.5673917 ],
           [ 1.5299634 ,  0.38036177, -0.28049606, -0.708137  ]],
          dtype=float32)>
    
    loss1 = tf.reduce_mean(tf.square(y - out))
    loss1
    
    <tf.Tensor: id=122, shape=(), dtype=float32, numpy=1.5140966>
    
    loss2 = tf.square(tf.norm(y - out)) / (5 * 4)
    loss2
    
    <tf.Tensor: id=99, shape=(), dtype=float32, numpy=1.3962512>
    
    loss3 = tf.reduce_mean(tf.losses.MSE(y, out))
    loss3
    
    <tf.Tensor: id=105, shape=(), dtype=float32, numpy=1.3962513>
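    
    In the outputs above, loss1 does not match loss2 and loss3; presumably out was simply re-drawn between runs in the original session. On the same y and out, all three formulations give the same value, as this minimal sketch (not part of the original notes) shows:
    
    import tensorflow as tf
    
    y = tf.cast(tf.one_hot(tf.constant([1, 2, 3, 0, 2]), depth=4), tf.float32)
    out = tf.random.normal([5, 4])
    
    mse_mean = tf.reduce_mean(tf.square(y - out))       # elementwise mean of the squared error
    mse_norm = tf.square(tf.norm(y - out)) / (5 * 4)    # squared L2 norm divided by the element count
    mse_api  = tf.reduce_mean(tf.losses.MSE(y, out))    # per-sample MSE averaged over the batch
    
    print(float(mse_mean), float(mse_norm), float(mse_api))   # three (nearly) identical numbers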
    

    Entropy

    • Uncertainty

    • measure of surprise

    • lower entropy --> more certainty (a peaked, low-entropy prediction carries more information about the outcome)

    $Entropy = -\sum_i P(i)\log P(i)$

    a = tf.fill([4], 0.25)                 # uniform distribution over four outcomes
    a * tf.math.log(a) / tf.math.log(2.)   # per-outcome term; dividing by log(2) gives log base 2 (bits)
    
    <tf.Tensor: id=134, shape=(4,), dtype=float32, numpy=array([-0.5, -0.5, -0.5, -0.5], dtype=float32)>
    
    -tf.reduce_sum(a * tf.math.log(a) / tf.math.log(2.))
    
    <tf.Tensor: id=143, shape=(), dtype=float32, numpy=2.0>
    
    a = tf.constant([0.1, 0.1, 0.1, 0.7])
    -tf.reduce_sum(a * tf.math.log(a) / tf.math.log(2.))
    
    <tf.Tensor: id=157, shape=(), dtype=float32, numpy=1.3567797>
    
    a = tf.constant([0.01, 0.01, 0.01, 0.97])
    -tf.reduce_sum(a * tf.math.log(a) / tf.math.log(2.))
    
    <tf.Tensor: id=167, shape=(), dtype=float32, numpy=0.24194068>
    

    Cross Entropy

    $H(p,q) = -\sum_x p(x)\log q(x)$

    $H(p,q) = H(p) + D_{KL}(p\|q)$

    • for p = q

      • Minima: H(p,q) = H(p)
    • for p: one-hot encoding

      • $H(p = [0,1,0]) = -1\log 1 = 0$
      • $H([0,1,0],[q_0,q_1,q_2]) = 0 + D_{KL}(p\|q) = -1\cdot\log q_1$  (if the true distribution p and the prediction q are equal, the cross entropy is 0; a numerical check of the decomposition follows below)
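    
    A minimal sketch (not part of the original notes) that numerically checks the decomposition H(p,q) = H(p) + D_KL(p||q); the distributions p and q below are arbitrary examples:
    
    import tensorflow as tf
    
    p = tf.constant([0.1, 0.2, 0.7])   # example "true" distribution
    q = tf.constant([0.3, 0.3, 0.4])   # example predicted distribution
    
    cross_entropy = -tf.reduce_sum(p * tf.math.log(q))   # H(p, q)
    entropy = -tf.reduce_sum(p * tf.math.log(p))          # H(p)
    kl = tf.reduce_sum(p * tf.math.log(p / q))            # D_KL(p || q)
    
    print(float(cross_entropy), float(entropy + kl))      # the two values match
    # with p == q the KL term is 0, so H(p,q) falls back to its minimum H(p)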

    Binary Classification

    • Two cases (the single-output format only needs to produce one probability, which saves computation; the other output is redundant)

    Single output

    $H(P,Q) = -P(cat)\log Q(cat) - (1 - P(cat))\log(1 - Q(cat)),\quad P(dog) = 1 - P(cat)$

    $H(P,Q) = -\sum_{i\in\{cat,dog\}} P(i)\log Q(i) = -P(cat)\log Q(cat) - P(dog)\log Q(dog) = -(y\log(p) + (1-y)\log(1-p))$
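    
    The single-output formula can be checked by hand against the API. A minimal sketch (not from the original), using y = 1 and a predicted probability of 0.1 so that the result matches the binary_crossentropy outputs shown further below (≈ -log 0.1 ≈ 2.3026; the API clips the probability by a small epsilon, so the last digits can differ):
    
    import tensorflow as tf
    
    y, p = 1.0, 0.1   # true label and predicted probability of the positive class
    
    # by hand: -(y*log(p) + (1 - y)*log(1 - p))
    manual = -(y * tf.math.log(p) + (1 - y) * tf.math.log(1 - p))
    
    # via the API (probabilities are clipped internally, so values may differ slightly)
    api = tf.losses.binary_crossentropy([y], [p])
    
    print(float(manual), float(api))   # both ≈ 2.3026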

    Classification

    • $H([0,1,0],[q_0,q_1,q_2]) = 0 + D_{KL}(p\|q) = -1\cdot\log q_1$

    $P_1 = [1, 0, 0, 0, 0]\quad Q_1 = [0.4, 0.3, 0.05, 0.05, 0.2]$

    $H(P_1,Q_1) = -\sum_i P_1(i)\log Q_1(i) = -(1\log 0.4 + 0\log 0.3 + 0\log 0.05 + 0\log 0.05 + 0\log 0.2) = -\log 0.4 \approx 0.916$

    $P_1 = [1, 0, 0, 0, 0]\quad Q_1 = [0.98, 0.01, 0, 0, 0.01]$

    $H(P_1,Q_1) = -\sum_i P_1(i)\log Q_1(i) = -\log 0.98 \approx 0.02$
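
    A minimal sketch (not from the original) that reproduces the two worked values with the API:
    
    import tensorflow as tf
    
    P1 = [1., 0., 0., 0., 0.]
    
    # -log(0.4) ≈ 0.916
    print(float(tf.losses.categorical_crossentropy(P1, [0.4, 0.3, 0.05, 0.05, 0.2])))
    # -log(0.98) ≈ 0.020
    print(float(tf.losses.categorical_crossentropy(P1, [0.98, 0.01, 0., 0., 0.01])))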

    tf.losses.categorical_crossentropy([0, 1, 0, 0], [0.25, 0.25, 0.25, 0.25])
    
    <tf.Tensor: id=186, shape=(), dtype=float32, numpy=1.3862944>
    
    tf.losses.categorical_crossentropy([0, 1, 0, 0], [0.1, 0.1, 0.8, 0.1])
    
    <tf.Tensor: id=205, shape=(), dtype=float32, numpy=2.3978953>
    
    tf.losses.categorical_crossentropy([0, 1, 0, 0], [0.1, 0.7, 0.1, 0.1])
    
    <tf.Tensor: id=243, shape=(), dtype=float32, numpy=0.35667497>
    
    tf.losses.categorical_crossentropy([0, 1, 0, 0], [0.01, 0.97, 0.01, 0.01])
    
    <tf.Tensor: id=262, shape=(), dtype=float32, numpy=0.030459179>
    
    tf.losses.BinaryCrossentropy()([1],[0.1])
    
    <tf.Tensor: id=306, shape=(), dtype=float32, numpy=2.3025842>
    
    tf.losses.binary_crossentropy([1],[0.1])
    
    <tf.Tensor: id=333, shape=(), dtype=float32, numpy=2.3025842>
    

    Why not MSE?

    • sigmoid + MSE

      • gradient vanishing: the sigmoid saturates, so the MSE gradient becomes tiny
    • converges more slowly (see the sketch after this list)

    • However

      • e.g. meta-learning
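
    A minimal sketch (not from the original; the single sigmoid unit and the weight value are arbitrarily chosen) illustrating the saturation problem: when the unit starts out confidently wrong, the MSE gradient is tiny while the cross-entropy gradient stays large:
    
    import tensorflow as tf
    
    w = tf.Variable([10.0])      # weight chosen so the sigmoid output saturates near 1
    x = tf.constant([1.0])
    y = tf.constant([0.0])       # true label is 0, so the initial prediction is badly wrong
    
    with tf.GradientTape(persistent=True) as tape:
        pred = tf.sigmoid(w * x)
        mse_loss = tf.reduce_mean(tf.square(y - pred))
        ce_loss = tf.losses.binary_crossentropy(y, pred)
    
    print(tape.gradient(mse_loss, w).numpy())   # ~1e-4: sigmoid saturation kills the MSE signal
    print(tape.gradient(ce_loss, w).numpy())    # ~1.0: cross entropy still pushes hard toward y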

    logits --> CrossEntropy: pass the raw logits to the loss with from_logits=True and let it apply the softmax internally; computing the softmax yourself and then taking the log is numerically less stable.
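
    A minimal sketch of that pattern (not from the original; the input size and two-class setup are arbitrary):
    
    import tensorflow as tf
    
    x = tf.random.normal([1, 784])
    w = tf.random.normal([784, 2])
    b = tf.zeros([2])
    
    logits = x @ w + b   # raw network output, no softmax applied
    
    # recommended: hand the logits straight to the loss
    loss = tf.losses.categorical_crossentropy([[0., 1.]], logits, from_logits=True)
    
    # discouraged: apply softmax yourself and pass probabilities (numerically less stable)
    prob = tf.math.softmax(logits, axis=1)
    loss_prob = tf.losses.categorical_crossentropy([[0., 1.]], prob, from_logits=False)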
