• Summary of PyTorch Loss Functions


    1 nn.L1Loss

      torch.nn.L1Loss(reduction='mean')

      This is the MAE (mean absolute error), computed as

        $\ell(x, y)=L=\left\{l_{1}, \ldots, l_{N}\right\}^{\top}, \quad l_{n}=\left|x_{n}-y_{n}\right|$

        $\ell(x, y)=\left\{\begin{array}{ll}\operatorname{mean}(L), & \text { if reduction }=\text { 'mean'; } \\\operatorname{sum}(L), & \text { if reduction }=\text { 'sum' }\end{array}\right.$

      Example: element-wise computation

    import torch
    import torch.nn as nn
    
    # Use float tensors for both input and target so that the dtypes match.
    input = torch.arange(1, 7.).view(2, 3)
    target = torch.arange(6.).view(2, 3)
    print(input)
    print(target)
    """
    tensor([[1., 2., 3.],
            [4., 5., 6.]])
    tensor([[0., 1., 2.],
            [3., 4., 5.]])
    """
    # Every |x_n - y_n| equals 1, so the sum is 6 and the mean is 1.
    loss = nn.L1Loss(reduction='sum')
    output = loss(input, target)
    print(output)
    """
    tensor(6.)
    """
    loss = nn.L1Loss(reduction='mean')
    output = loss(input, target)
    print(output)
    """
    tensor(1.)
    """

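      In the example above every difference is 1, so the individual terms are hard to see. The following is a minimal sketch with made-up values (the tensors x and y are illustrative, not from the original post) that compares a hand-computed $|x_n - y_n|$ with nn.L1Loss, using reduction='none' to expose the per-element losses.

```python
import torch
import torch.nn as nn

# Illustrative values chosen so that the differences are not all equal.
x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
y = torch.tensor([[0.0, 4.0, 2.0], [3.0, 1.0, 6.0]])

# l_n = |x_n - y_n|, computed by hand
per_element = (x - y).abs()

loss_none = nn.L1Loss(reduction='none')(x, y)   # unreduced, same shape as x
loss_mean = nn.L1Loss(reduction='mean')(x, y)   # mean(L)
loss_sum = nn.L1Loss(reduction='sum')(x, y)     # sum(L)

print(per_element)   # tensor([[1., 2., 1.], [1., 4., 0.]])
print(loss_mean)     # tensor(1.5000)
print(loss_sum)      # tensor(9.)
```

      reduction='none' is handy when the per-element losses need to be masked or re-weighted before reducing them manually.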
    2 nn.MSELoss

        torch.nn.MSELoss(reduction='mean')

      As the name says, this is the mean squared error, i.e. the squared L2 distance between input and target (an L2 loss, not an L2 regularization term). It is computed as

      $\ell(x, y)=L=\left\{l_{1}, \ldots, l_{N}\right\}^{\top}, \quad l_{n}=\left(x_{n}-y_{n}\right)^{2}$

      $\ell(x, y)=\left\{\begin{array}{ll}\operatorname{mean}(L), & \text { if reduction }=\text { 'mean'; } \\\operatorname{sum}(L), & \text { if reduction }=\text { 'sum' }\end{array}\right.$

      The reduction argument selects between the mean and sum modes.

      Example: element-wise computation

    loss = nn.MSELoss(reduction="mean")
    output = loss(input, target)
    print(output)
    """
    tensor(1.)
    """
    loss = nn.MSELoss(reduction="sum")
    output = loss(input, target)
    print(output)
    """
    tensor(6.)
    """

      The experiment above shows that

        $l_{n}=\left(x_{n}-y_{n}\right)^{2}$ 

      is computed element-wise.
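      Because all differences in that experiment equal 1, MSELoss and L1Loss return the same numbers there. A small sketch with illustrative values (x and y are assumptions, not from the original post) makes the squaring visible:

```python
import torch
import torch.nn as nn

# Illustrative values with differences 1, -2, 1, 1, 4, 0.
x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
y = torch.tensor([[0.0, 4.0, 2.0], [3.0, 1.0, 6.0]])

squared = (x - y) ** 2                           # l_n = (x_n - y_n)^2
mse_none = nn.MSELoss(reduction='none')(x, y)    # per-element losses
mse_sum = nn.MSELoss(reduction='sum')(x, y)
l1_sum = nn.L1Loss(reduction='sum')(x, y)

print(mse_none)  # tensor([[ 1.,  4.,  1.], [ 1., 16.,  0.]])
print(mse_sum)   # tensor(23.)
print(l1_sum)    # tensor(9.)
```

      The single difference of 4 contributes 16 of the 23 in the MSE sum, but only 4 of the 9 in the L1 sum, which is why MSE reacts more strongly to large errors.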

    3 nn.SmoothL1Loss

        torch.nn.SmoothL1Loss(reduction='mean', beta=1.0)

      It smooths the L1 loss around zero and, compared with MSELoss, is less sensitive to outliers.

        $\ell(x, y)=L=\left\{l_{1}, \ldots, l_{N}\right\}^{T}$

        $l_{n}=\left\{\begin{array}{ll}0.5\left(x_{n}-y_{n}\right)^{2} / \text { beta }, & \text { if }\left|x_{n}-y_{n}\right|<\text { beta } \\\left|x_{n}-y_{n}\right|-0.5 * \text { beta }, & \text { otherwise }\end{array}\right.$

      It is used in Fast R-CNN to avoid exploding gradients.
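      One way to see the effect on gradients is to compare them directly. The sketch below uses made-up predictions, the last of which is an outlier, and lets autograd compute the gradient of each loss; it only illustrates the formulas above and is not taken from the Fast R-CNN code.

```python
import torch
import torch.nn as nn

target = torch.zeros(3)
# Illustrative predictions; the last one is an outlier with error 100.
pred_mse = torch.tensor([0.5, 2.0, 100.0], requires_grad=True)
pred_sl1 = pred_mse.detach().clone().requires_grad_(True)

nn.MSELoss(reduction='sum')(pred_mse, target).backward()
nn.SmoothL1Loss(reduction='sum', beta=1.0)(pred_sl1, target).backward()

# MSE gradient is 2 * (x - y) and grows with the error.
print(pred_mse.grad)  # tensor([  1.,   4., 200.])
# SmoothL1 gradient is (x - y) / beta inside the quadratic zone, else +-1.
print(pred_sl1.grad)  # tensor([0.5000, 1.0000, 1.0000])
```

      The SmoothL1 gradient magnitude never exceeds 1 per element, whereas the MSE gradient scales linearly with the error.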

      Example: element-wise computation

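    # MSELoss on the same input/target, shown for comparison.
    loss = nn.MSELoss(reduction="sum")
    output = loss(input, target)
    print(output)
    """
    tensor(6.)
    """
    # beta=1.0 (default): |x_n - y_n| = 1 is not < beta, so l_n = 1 - 0.5 = 0.5
    loss = nn.SmoothL1Loss(reduction="mean")
    output = loss(input, target)
    print(output)
    """
    tensor(0.5000)
    """
    # beta=3.0: |x_n - y_n| = 1 < beta, so l_n = 0.5 * 1 / 3 = 0.1667
    loss = nn.SmoothL1Loss(reduction="mean", beta=3.0)
    output = loss(input, target)
    print(output)
    """
    tensor(0.1667)
    """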

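      The piecewise formula can also be written out by hand and checked against the built-in loss. This is a minimal sketch; smooth_l1_manual is a hypothetical helper name and the test values are illustrative.

```python
import torch
import torch.nn as nn

def smooth_l1_manual(x, y, beta=1.0):
    """Per-element SmoothL1: quadratic below beta, linear otherwise."""
    diff = (x - y).abs()
    return torch.where(diff < beta,
                       0.5 * diff ** 2 / beta,
                       diff - 0.5 * beta)

# Illustrative values covering both branches of the formula.
x = torch.tensor([0.0, 0.5, 2.0, -4.0])
y = torch.zeros(4)

for beta in (1.0, 3.0):
    manual = smooth_l1_manual(x, y, beta)
    builtin = nn.SmoothL1Loss(reduction='none', beta=beta)(x, y)
    assert torch.allclose(manual, builtin)
    print(beta, manual)
```

      With beta=1.0 the per-element losses are 0, 0.125, 1.5 and 3.5; a larger beta widens the quadratic zone, so more of the errors are treated like a scaled MSE.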
    4 nn.BCELoss and nn.BCEWithLogitsLoss

        torch.nn.BCELoss(weight=None, reduction='mean')

      Binary Cross Entropy, defined as:

        $\ell(x, y)=\left\{\begin{array}{ll}\operatorname{mean}(L), & \text { if reduction }=\text { 'mean'; } \\\operatorname{sum}(L), & \text { if reduction }=\text { 'sum' }\end{array}\right.$

        $\ell(x, y)=L=\left\{l_{1}, \ldots, l_{N}\right\}^{\top}, \quad l_{n}=-w_{n}\left[y_{n} \cdot \log x_{n}+\left(1-y_{n}\right) \cdot \log \left(1-x_{n}\right)\right]$

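      This is a two-sided cross entropy, i.e. the binary special case of the cross-entropy formula. It can be used for multi-label classification, where the classes are not mutually exclusive.

      BCELoss requires applying a sigmoid to the input manually first. Then, for each position with sigmoid output $x$, it adds $-\log(x)$ if the label is 1 and $-\log(1-x)$ otherwise, and finally takes the mean.

      BCEWithLogitsLoss applies the sigmoid internally, so it takes the raw scores; everything else is the same.

      Example: element-wise computation.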

    import numpy as np
    import torch
    
    target = torch.tensor([[1, 0, 1], [0, 1, 1]], dtype=torch.float32)
    raw_output = torch.randn(2, 3, dtype=torch.float32)
    output = torch.sigmoid(raw_output)  # BCELoss expects probabilities
    print(output)
    
    # Hand-written BCE: -log(p) where the label is 1, -log(1 - p) where it is 0.
    result = np.zeros((2, 3))
    for ix in range(2):
        for iy in range(3):
            p = output[ix, iy].numpy()
            if target[ix, iy] == 1:
                result[ix, iy] += -np.log(p)
            elif target[ix, iy] == 0:
                result[ix, iy] += -np.log(1 - p)
    
    print(result)
    print(np.mean(result))
    
    loss_fn = torch.nn.BCELoss(reduction='none')
    print(loss_fn(output, target))
    loss_fn = torch.nn.BCELoss(reduction='mean')
    print(loss_fn(output, target))
    # BCEWithLogitsLoss takes the raw scores and applies the sigmoid itself.
    loss_fn = torch.nn.BCEWithLogitsLoss(reduction='sum')
    print(loss_fn(raw_output, target))
    """
    tensor([[0.5316, 0.6816, 0.4768],
            [0.6485, 0.3037, 0.5490]])
    [[0.63186073 1.14431179 0.74067789]
     [1.04543173 1.19187558 0.59973639]]
    0.892315685749054
    tensor([[0.6319, 1.1443, 0.7407],
            [1.0454, 1.1919, 0.5997]])
    tensor(0.8923)
    tensor(5.3539)
    """
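      The relationship between the two losses can be checked directly. The sketch below (seed and tensor names are illustrative) verifies that BCEWithLogitsLoss on raw scores matches BCELoss on their sigmoid, and that both equal the algebraically equivalent form $\operatorname{softplus}(x) - x \cdot y$, which avoids taking the log of a probability close to 0.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # illustrative seed, only for reproducibility
logits = torch.randn(2, 3)
target = torch.tensor([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])

# Sigmoid applied by hand, then BCELoss
bce = nn.BCELoss(reduction='none')(torch.sigmoid(logits), target)
# Raw scores passed straight to BCEWithLogitsLoss
bce_logits = nn.BCEWithLogitsLoss(reduction='none')(logits, target)
assert torch.allclose(bce, bce_logits, atol=1e-6)

# Same loss rewritten as log(1 + exp(x)) - x * y
rewritten = F.softplus(logits) - logits * target
assert torch.allclose(rewritten, bce_logits, atol=1e-6)
print(bce_logits)
```

      For training, feeding raw scores to BCEWithLogitsLoss is usually the more convenient choice, since the sigmoid does not need to be part of the model's forward pass.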

    5 nn.CrossEntropyLoss

         torch.nn.CrossEntropyLoss(weight=None, ignore_index=-100, reduction='mean', label_smoothing=0.0)

      The classic loss, computed as:

        $\text { weight }[\text { class }]\left(-\log \left(\frac{\exp (x[\text { class }])}{\sum\limits_{j} \exp (x[j])}\right)\right)=\text { weight }[\text { class }]\left(-x[\text { class }]+\log \left(\sum\limits_{j} \exp (x[j])\right)\right)$

      This amounts to first passing the outputs through a softmax, which maps them to values in $[0,1]$ that sum to $1$.

      We want the loss for the correct class to be as small as possible, so we apply $-\log(\cdot)$ to $\left(\frac {\exp (x[\text {class}])}{\sum\limits _{j} \exp (x[j])}\right)$, which maps $(0,1]$ to $[0,+\infty)$: the larger the probability assigned to the correct class, the smaller the overall loss.

      In torch, CrossEntropyLoss(x) is equivalent to NLLLoss(LogSoftmax(x)).
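      This equivalence is easy to confirm numerically. A minimal sketch, with an illustrative seed and shapes:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)                 # illustrative seed
logits = torch.randn(3, 5)           # (N, C) raw scores
target = torch.tensor([1, 0, 3])     # (N,) class indices

# One call on the raw scores
ce = nn.CrossEntropyLoss()(logits, target)
# Two steps: log-softmax over the class dimension, then NLLLoss
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, target)

assert torch.allclose(ce, nll)
print(ce.item(), nll.item())
```

      A practical consequence is that a model ending in log_softmax should be trained with NLLLoss, while a model returning raw scores should be trained with CrossEntropyLoss; applying the softmax twice is a common mistake.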

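    import torch.nn as nn
    import torch.nn.functional as F
    from torch.nn import Linear  # assumed to be torch.nn.Linear
    
    # Mlp and get_feature_dis are helpers that are not defined in this post.
    class GMLP(nn.Module):
        def __init__(self, nfeat, nhid, nclass, dropout):
            super(GMLP, self).__init__()
            self.nhid = nhid
            self.mlp = Mlp(nfeat, self.nhid, dropout)
            self.classifier = Linear(self.nhid, nclass)
    
        def forward(self, x):
            Z = self.mlp(x)
    
            if self.training:
                x_dis = get_feature_dis(Z)
    
            class_feature = self.classifier(Z)
            # The model already applies log_softmax, so it returns log-probabilities.
            class_logits = F.log_softmax(class_feature, dim=1)
    
            if self.training:
                return class_logits, x_dis
            else:
                return class_logits
    
    
    # output holds the log-probabilities returned by the model, so nll_loss
    # (not cross_entropy) is the matching training loss.
    loss_train_class = F.nll_loss(output[idx_train], labels[idx_train])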

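      The input is expected to contain unnormalized scores. The shapes are the same as for NLLLoss: $(N,C)$ for the input and $(N)$ for the target.

      Example: computed per sample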

    target = torch.tensor([1,0,3])
    output = torch.randn(3,5)
    print(output)
    """
    tensor([[-2.5728, -0.4581, -0.2017,  1.8813,  0.4544],
            [-0.7278,  0.6300,  0.6510, -1.7570,  1.1788],
            [-0.4660,  0.0410,  0.6876,  0.8966,  0.1446]])
    """
    loss_fn = torch.nn.CrossEntropyLoss(reduction='mean')
    loss = loss_fn(output, target)
    print(loss)
    """
    tensor(2.1940)
    """
    loss_fn = torch.nn.CrossEntropyLoss(reduction='sum')
    loss = loss_fn(output, target)
    print(loss)
    """
    tensor(6.5821)
    """

       Example: hand-written version

    import numpy as np
    import torch
    
    target = torch.tensor([1, 0, 3])
    output = torch.randn(3, 5)
    print(output)
    """
    tensor([[-0.1168,  1.5417,  1.1748, -1.1856, -0.1233],
            [ 0.2074, -0.7376, -0.8934,  0.0899,  0.5337],
            [-0.5323, -0.2945, -0.1710,  1.5925,  1.3654]])
    """
    # Per sample: -x[class] + log(sum_j exp(x[j]))
    result = np.array([0.0, 0.0, 0.0])
    for ix in range(3):
        exp_sum = 0.0
        for iy in range(5):
            if iy == target[ix]:
                result[ix] += -output[ix, iy]
            exp_sum += np.exp(output[ix, iy])
        result[ix] += np.log(exp_sum)
    print(result)
    print(np.mean(result))
    
    # The built-in loss agrees up to floating-point precision.
    loss_fn = torch.nn.CrossEntropyLoss(reduction='mean')
    loss = loss_fn(output, target)
    print(loss.item())
    """
    [0.75984335 1.3853296  0.80614853]
    0.9837738275527954
    0.9837737679481506
    """

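      The double loop above can be replaced by a vectorized expression. The sketch below uses torch.logsumexp for the $\log \sum_j \exp(x[j])$ term and also shows, with deliberately extreme illustrative scores, why computing exp and log separately is fragile.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)                 # illustrative seed
logits = torch.randn(3, 5)
target = torch.tensor([1, 0, 3])
rows = torch.arange(logits.size(0))

# -x[class] + log(sum_j exp(x[j])) for every sample at once
per_sample = torch.logsumexp(logits, dim=1) - logits[rows, target]
builtin = nn.CrossEntropyLoss(reduction='none')(logits, target)
assert torch.allclose(per_sample, builtin)

# Extreme scores: exp(1000) overflows in float32.
big = torch.tensor([[1000.0, 0.0, -1000.0]])
naive = torch.log(torch.exp(big).sum(dim=1))   # inf
stable = torch.logsumexp(big, dim=1)           # 1000.0
print(naive, stable)
```

      The naive version returns inf for the large scores, while logsumexp returns 1000, so the hand-written loop is fine for illustration but should not be used as-is on unbounded scores.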
    6 nn.NLLLoss

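         torch.nn.NLLLoss(weight=None, ignore_index=-100, reduction='mean')

      The negative log likelihood loss, used to train an n-class classifier. For imbalanced datasets, a weight can be assigned to each class. It is computed as

        $l_{n}=-w_{y_{n}} x_{n, y_{n}}$

        $w_{c}=\text { weight }[c] \cdot \mathbb{1}\{c \neq \text { ignore\_index }\}$

      The expected input shapes are $(N,C)$ and $(N)$, where $N$ is the batch size and $C$ is the number of classes.

      For each sample it takes the negative of the input value at the target class, then computes the mean or sum. It is usually combined with a LogSoftmax layer so that the input values are log-probabilities.

      Example: computed per sample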

    target = torch.tensor([1,0,3])
    output = torch.randn(3,5)
    print(output)
    
    loss_fn = torch.nn.NLLLoss(reduction='mean')
    loss = loss_fn(output, target)
    print(loss)
    
    loss_fn = torch.nn.NLLLoss(reduction='sum')
    loss = loss_fn(output, target)
    print(loss)
    """
    tensor([[ 1.5083,  0.1846, -1.8400, -0.0068, -0.1943],
            [ 0.5303, -0.0350, -0.3924,  0.3026,  0.6159],
            [ 2.0047, -1.0653,  0.0718, -0.8632, -1.0695]])
    tensor(0.0494)
    tensor(0.1482)
    """

      Clearly this is not an element-wise computation: only the entry at the target class of each row contributes.
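      The weight and ignore_index arguments from the formula above can be illustrated with a short sketch. The scores and class weights here are made up; note that with reduction='mean' and class weights, the sum is divided by the total weight of the non-ignored targets rather than by the batch size.

```python
import torch
import torch.nn as nn

# Illustrative scores for 3 samples and 3 classes, turned into log-probabilities.
scores = torch.tensor([[2.0, 1.0, 0.1],
                       [0.5, 2.5, 0.3],
                       [1.0, 1.0, 3.0]])
log_probs = torch.log_softmax(scores, dim=1)
target = torch.tensor([0, 1, -100])   # -100 is the default ignore_index

# Only the first two samples count; pick their target-class entries.
picked = -log_probs[torch.arange(2), target[:2]]

loss = nn.NLLLoss()(log_probs, target)
assert torch.allclose(loss, picked.mean())

# Hypothetical class weights, e.g. to up-weight a rare class 1.
weight = torch.tensor([1.0, 3.0, 1.0])
w = weight[target[:2]]
weighted = nn.NLLLoss(weight=weight)(log_probs, target)
assert torch.allclose(weighted, (w * picked).sum() / w.sum())
print(loss.item(), weighted.item())
```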

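      Example:

    import torch
    input = torch.randn(3, 3)
    # Softmax over the class dimension (dim=1), so that each row sums to 1
    soft_input = torch.nn.Softmax(dim=1)
    # Take the log of the softmax result to get log-probabilities
    log_probs = torch.log(soft_input(input))

      Suppose the labels are [0, 1, 2]: take element 0 of the first row, element 1 of the second row and element 2 of the third row of the log-probabilities, drop the minus signs, and average them to obtain the loss.

    target = torch.tensor([0, 1, 2])
    manual = -log_probs[torch.arange(3), target].mean()
    loss = torch.nn.NLLLoss()
    print(manual, loss(log_probs, target))  # the two values agree
    # Passing the raw scores is a mistake: NLLLoss applies neither log nor
    # softmax, so loss(input, target) is just -mean(input[i, target[i]]) and
    # can even be negative, e.g. tensor(-0.1395).

       So nn.NLLLoss itself only picks out the target-class entries, negates them and averages. The log(softmax) step has to be applied to the input beforehand; CrossEntropyLoss does both in one call.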

     

    Reference: https://segmentfault.com/a/1190000038584083

  • Original post: https://www.cnblogs.com/BlairGrowing/p/15982528.html