• sequence_loss_by_example in TensorFlow


    When writing RNN programs, a function you will run into often is sequence_loss_by_example:

    loss = tf.contrib.legacy_seq2seq.sequence_loss_by_example(logits_list, targets_list, weights_list, average_across_timesteps)

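    This function lives in the legacy_seq2seq module under contrib, which indicates that it is not part of TensorFlow's officially supported core API. Its implementation is short and is reproduced below, followed by a small test.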

    import numpy as np
    import tensorflow as tf
    
    
    def sequence_loss_by_example(logits,
                                 targets,
                                 weights,
                                 average_across_timesteps=True,
                                 softmax_loss_function=None,
                                 name=None):
        """Weighted cross-entropy loss for a sequence of logits (per example).
    
        Args:
          logits: List of 2D Tensors of shape [batch_size x num_decoder_symbols].
          targets: List of 1D batch-sized int32 Tensors of the same length as logits.
          weights: List of 1D batch-sized float-Tensors of the same length as logits.
          average_across_timesteps: If set, divide the returned cost by the total
            label weight.
          softmax_loss_function: Function (labels, logits) -> loss-batch
            to be used instead of the standard softmax (the default if this is None).
            **Note that to avoid confusion, it is required for the function to accept
            named arguments.**
          name: Optional name for this operation, default: "sequence_loss_by_example".
    
        Returns:
          1D batch-sized float Tensor: The log-perplexity for each sequence.
    
        Raises:
          ValueError: If len(logits) is different from len(targets) or len(weights).
        """
        # All three arguments are lists and must have the same length
        if len(targets) != len(logits) or len(weights) != len(logits):
            raise ValueError("Lengths of logits, weights, and targets must be the same "
                             "%d, %d, %d." % (len(logits), len(weights), len(targets)))
        with tf.name_scope(name, "sequence_loss_by_example",
                           logits + targets + weights):
            log_perp_list = []
            # Compute the weighted loss at each time step
            for logit, target, weight in zip(logits, targets, weights):
                if softmax_loss_function is None:
                    # Default: sparse softmax cross-entropy (integer class labels)
                    target = tf.reshape(target, [-1])
                    crossent = tf.nn.sparse_softmax_cross_entropy_with_logits(
                        labels=target, logits=logit)
                else:
                    crossent = softmax_loss_function(labels=target, logits=logit)
                log_perp_list.append(crossent * weight)
            # Sum the weighted losses over all time steps
            log_perps = tf.add_n(log_perp_list)
            # Divide by the total weight over time steps (a weighted average)
            if average_across_timesteps:
                total_size = tf.add_n(weights)
                total_size += 1e-12  # Just to avoid division by 0 for all-0 weights.
                log_perps /= total_size
        return log_perps
    
    
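    """
    Consider a many-to-many RNN: every input produces one output.
    The losses of these outputs are averaged, and through weights we can specify:
    * a weight for each sample
    * a weight for each time step
    """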
    sample_count = 4
    target_count = 3
    frame_count = 2
    # Predicted logits at each time step
    logits = [tf.random_uniform((sample_count, target_count)) for i in range(frame_count)]
    # Ground-truth labels at each time step
    targets = [tf.constant(np.random.randint(0, target_count, (sample_count,))) for i in range(frame_count)]
    # Weight of every sample at every time step; weights lets us set both time-step weights and sample weights
    weights = [tf.ones((sample_count,), dtype=tf.float32) * (i + 1) for i in range(frame_count)]
    loss1 = sequence_loss_by_example(logits, targets, weights, average_across_timesteps=True)
    loss = tf.contrib.legacy_seq2seq.sequence_loss_by_example(logits, targets, weights, True)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        x, y = sess.run([loss, loss1])
        print(x)
        print(y)
        print(x.shape, y.shape)
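
To make the arithmetic of the test above easier to check by hand, here is a minimal NumPy sketch of the same computation. It is not TensorFlow code: `sparse_softmax_cross_entropy` and `sequence_loss_by_example_np` are hypothetical helper names introduced only for this illustration, and they assume the default softmax loss is used. With time-step weights 1 and 2, the per-example result should be (1 * ce_1 + 2 * ce_2) / 3.

```python
import numpy as np


def sparse_softmax_cross_entropy(logits, targets):
    """Per-example cross-entropy for integer labels (NumPy stand-in, hypothetical helper)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets]


def sequence_loss_by_example_np(logits, targets, weights,
                                average_across_timesteps=True):
    """NumPy mirror of the function above (hypothetical name, default loss only)."""
    # Weighted loss per time step, summed over time steps
    log_perps = sum(sparse_softmax_cross_entropy(l, t) * w
                    for l, t, w in zip(logits, targets, weights))
    if average_across_timesteps:
        # Same epsilon as the original code, guarding against all-zero weights
        log_perps = log_perps / (sum(weights) + 1e-12)
    return log_perps


rng = np.random.RandomState(0)
sample_count, target_count, frame_count = 4, 3, 2
logits = [rng.uniform(size=(sample_count, target_count)) for _ in range(frame_count)]
targets = [rng.randint(0, target_count, size=sample_count) for _ in range(frame_count)]
# Time step i gets weight i + 1 for every sample, as in the test above
weights = [np.ones(sample_count) * (i + 1) for i in range(frame_count)]

loss = sequence_loss_by_example_np(logits, targets, weights)

# Hand-computed weighted average of the two per-step cross-entropies
ce = [sparse_softmax_cross_entropy(l, t) for l, t in zip(logits, targets)]
expected = (1.0 * ce[0] + 2.0 * ce[1]) / 3.0

print(loss.shape)
print(np.allclose(loss, expected))
```

The result has one entry per sample, which matches the shapes printed by the TensorFlow test.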
    

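    This function is very useful: sparse_softmax_cross_entropy_with_logits in tensorflow.nn cannot take per-sample weights, but this function can.
    To use it that way, pass lists containing a single time step. Judging from the code above, average_across_timesteps should then be set to False, because with a single time step the division by the total weight cancels the sample weight. If every sample weight is 1, the result is the same as that of sparse_softmax_cross_entropy_with_logits.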
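
The single-time-step usage can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than TensorFlow code: `cross_entropy` and `single_step_loss` are hypothetical helpers that mimic what the function computes when each input list holds exactly one time step, and the numbers are made up for the example.

```python
import numpy as np


def cross_entropy(logits, targets):
    """Per-example softmax cross-entropy for integer labels (hypothetical helper)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets]


def single_step_loss(logits, targets, sample_weights, average_across_timesteps):
    """What sequence_loss_by_example computes for one time step (NumPy sketch)."""
    loss = cross_entropy(logits, targets) * sample_weights
    if average_across_timesteps:
        # With one time step the total weight is the sample weight itself
        loss = loss / (sample_weights + 1e-12)
    return loss


# Illustrative data: 3 samples, 3 classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3],
                   [1.2, 0.2, 3.1]])
targets = np.array([0, 1, 2])
sample_weights = np.array([1.0, 0.5, 2.0])

plain = cross_entropy(logits, targets)

# Sample weights take effect only when averaging is turned off
weighted = single_step_loss(logits, targets, sample_weights, False)

# With averaging on, nonzero sample weights cancel out
averaged = single_step_loss(logits, targets, sample_weights, True)

# All weights equal to 1 reproduce the plain cross-entropy
ones = single_step_loss(logits, targets, np.ones(3), True)

print(np.allclose(weighted, plain * sample_weights))
print(np.allclose(averaged, plain))
print(np.allclose(ones, plain))
```

In this sketch, the weighted variant scales each sample's loss by its weight, while the averaged variant returns the unweighted cross-entropy for any nonzero weight.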

  • Original article: https://www.cnblogs.com/weiyinfu/p/9840494.html