• Reading Comprehension and Cloze Tests with TensorFlow


    catalogue

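    0. Preface
    1. The dataset
    2. Data preprocessing
    3. Training
    4. Testing the trained model: filling in real cloze questions

    0. Preface

    I started writing this post at midnight because a few new ideas came to mind and I wanted to write them down right away. When working with deep learning (for example with TensorFlow), it is worth deliberately training your modeling and abstraction skills, that is, the ability to turn a complex business problem into a mathematical modeling problem. In essence, cloze-style reading comprehension is the same kind of task as a conversational AI. The difference is that the input here is a long passage that carries context, while the output may be just a short phrase, so the neural network has to learn a good "long passage and question to short answer" model.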

    Relevant Link:

    http://blog.topspeedsnail.com/archives/11062
    http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_pros.html

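    1. The dataset

    For deep learning, the coverage of the training set matters a great deal. Stochastic gradient descent (SGD) keeps adjusting the parameters until it finds a set that fits the training input well. To check that the loss is really being minimized and not just memorizing the training data, we split the data into two parts: one part is used by SGD to train the parameters, and the other is used to validate the current parameters at any point during training.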
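
    As a rough illustration of the idea above, here is a minimal sketch of mini-batch SGD with a held-out validation split. It fits a toy linear model on synthetic data; the data, the learning rate and the split sizes are all assumptions made for this example and have nothing to do with the CBT corpus.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y = 3.0 * x[:, 0] + 0.5 + rng.normal(scale=0.1, size=200)   # synthetic data: y = 3x + 0.5 + noise

x_train, y_train = x[:150], y[:150]      # used to update the parameters
x_valid, y_valid = x[150:], y[150:]      # held out, only used to monitor the loss

w, b, lr = 0.0, 0.0, 0.1

def mse(xs, ys):
    """Mean squared error of the current (w, b) on the given data."""
    return float(np.mean((w * xs[:, 0] + b - ys) ** 2))

initial_valid = mse(x_valid, y_valid)
for step in range(200):
    batch = rng.integers(0, len(x_train), size=16)           # random mini-batch
    err = w * x_train[batch, 0] + b - y_train[batch]
    w -= lr * 2 * np.mean(err * x_train[batch, 0])           # gradient of the MSE w.r.t. w
    b -= lr * 2 * np.mean(err)                               # gradient of the MSE w.r.t. b

final_valid = mse(x_valid, y_valid)
print(initial_valid, final_valid, w, b)
```

    The validation loss is never used to compute gradients; it only tells us whether the parameters found on the training part also work on unseen samples.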

    0x1: Children’s Book Test
    Data is in the included "data" folder. Questions are separated according to whether the missing word is a named entity (NE), common noun (CN), verb (V) or preposition (P)

    cbtest_NE_train.txt : 67128 questions
    cbtest_NE_valid_2000ex.txt : 2000
    cbtest_NE_test_2500ex.txt : 2500
    
    cbtest_CN_train.txt : 121176 questions
    cbtest_CN_valid_2000ex.txt : 2000
    cbtest_CN_test_2500ex.txt : 2500
    
    cbtest_V_train.txt : 109111 questions
    cbtest_V_valid_2000ex.txt : 2000
    cbtest_V_test_2500ex.txt : 2500
    
    cbtest_P_train.txt : 67128 questions
    cbtest_P_valid_2000ex.txt : 2000
    cbtest_P_test_2500ex.txt : 2500

    0x2: CBT questions

    Questions are built from sets of 21 consecutive sentences from the books. A sentence is defined by the Stanford Core NLP sentence splitter.
    A Named Entity (NE) is any entity identified by the Stanford Core NLP NER system. A Common Noun (CN) is any word tagged as a noun by the Stanford Core NLP POS tagger that is not already a NE. Verbs and Prepositions are identified similarly.

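    Each training sample consists of 20 context sentences followed by one question (the 21st sentence with one word removed) and a set of candidate answers. Any word in that set may turn out to be the correct one, so the model has to choose by reading the context.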
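
    To make the layout concrete, here is a minimal sketch that parses one sample in this format. The sample text is invented for illustration and shortened to 3 context lines instead of 20; the tab-separated question line with `|`-separated candidates is the layout the preprocessing code in section 2 assumes.

```python
# Hypothetical, shortened sample (real samples have 20 context lines).
# Question line layout: "<n> <question>\t<answer>\t\t<cand1>|<cand2>|..."
raw_sample = (
    "1 Tom went to the river .\n"
    "2 He met Anna there .\n"
    "3 They fished all day .\n"
    "4 In the evening XXXXX walked home with Tom .\tAnna\t\tAnna|Tom|river|day"
)

def parse_sample(text):
    """Split one sample into (context sentences, question, answer, candidates)."""
    context, question, answer, candidates = [], None, None, []
    for line in text.splitlines():
        _, body = line.split(" ", 1)          # drop the leading line number
        if "\t" in body:                      # only the question line contains tabs
            question, answer, _, cand_str = body.split("\t")
            candidates = cand_str.split("|")
        else:
            context.append(body)
    return context, question, answer, candidates

context, question, answer, candidates = parse_sample(raw_sample)
print(len(context), "context sentences")
print("question:", question)
print("answer:", answer, "| candidates:", candidates)
```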

    Relevant Link:

    https://research.fb.com/projects/babi/
    http://cs.nyu.edu/~kcho/DMQA/

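    2. Data preprocessing

    0x1: Tokenize the sentences and turn each training sample into 20 context sentences + 1 question + the answer (the candidate list is parsed but not kept)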

    def preprocess_data(data_file, out_file):
        # each entry: stories[x][0] = context sentences, stories[x][1] = question tokens, stories[x][2] = answer
        stories = []
        with open(data_file) as f:
            story = []
            for line in f:
                line = line.strip()
                if not line:
                    story = []
                else:
                    _, line = line.split(' ', 1)
                    if line:
                        if '\t' in line:
                            q, a, _, answers = line.split('\t')
                            # tokenize: split on runs of non-word characters and keep them as tokens
                            q = [s.strip() for s in re.split(r'(\W+)', q) if s.strip()]
                            stories.append((story, q, a))
                        else:
                            line = [s.strip() for s in re.split(r'(\W+)', line) if s.strip()]
                            story.append(line)
                        # print stories
    
        #print stories
        samples = []
        for story in stories:
            story_tmp = []
            content = []
            for c in story[0]:
                content += c
            story_tmp.append(content)
            story_tmp.append(story[1])
            story_tmp.append(story[2])
    
            samples.append(story_tmp)
    
        #print samples
    
        # shuffle the order of the reading/cloze samples
        random.shuffle(samples)
        print(len(samples))
    
        with open(out_file, "w") as f:
            for sample in samples:
                f.write(str(sample))
                f.write('\n')
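
    A quick check of the tokenizer, as a sketch: splitting on a captured run of non-word characters keeps punctuation as separate tokens, and the whitespace-only pieces are filtered out. The helper name `tokenize` and the sentences are made up for this example.

```python
import re

def tokenize(sentence):
    # The capturing group keeps the separators in the result;
    # strip() plus the filter then drops pieces that are pure whitespace.
    return [s.strip() for s in re.split(r'(\W+)', sentence) if s.strip()]

print(tokenize("Hello, world!"))
# Punctuation marks separated only by spaces end up in a single token.
print(tokenize("Whew ! ''"))
```

    The second call shows a quirk worth knowing: because the whole run of non-word characters is captured at once, `! ''` stays together as one vocabulary entry.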

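    0x2: Build the vocabulary from the training corpus

    The idea here still follows the word2vec vector space model. We assume that the items of each [20 context sentences, 1 question, answer] sample are related to one another, similar to the predictive association in a Markov chain: once word A has appeared, the transition A->B is the most probable among all combinations. For words and phrases, a word vector plays a role comparable to the region weights of an image in image recognition. Note that this step itself only assigns an integer index to each word; the vectors are learned later during training.

    # load the tokenized samples (one Python list literal per line)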
    def read_data(data_file):
        stories = []
        with open(data_file) as f:
            for line in f:
                line = ast.literal_eval(line.strip())
                stories.append(line)
        return stories
    
    # generate word vocabulary table
    stories = read_data(train_data_token_file) + read_data(valid_data_token_file)
    
    content_length = max([len(s) for s, _, _ in stories])
    question_length = max([len(q) for _, q, _ in stories])
    print(content_length, question_length)
    
    vocab = sorted(set(itertools.chain(*(story + q + [answer] for story, q, answer in stories))))
    vocab_size = len(vocab) + 1
    print(vocab_size)
    word2idx = dict((w, i + 1) for i, w in enumerate(vocab))
    pickle.dump((word2idx, content_length, question_length, vocab_size), open(train_vocab_data_file, "wb"))

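    By encoding each word as an index, a word sequence becomes a sequence of integers, which prepares the data for the later vector computations (such as distances between word vectors).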
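
    The index encoding can be sketched on two toy samples in the same [context tokens, question tokens, answer] layout. The samples are invented; the vocabulary construction mirrors the code above.

```python
import itertools

# Made-up samples: [context tokens, question tokens, answer]
samples = [
    [["Tom", "met", "Anna", "."], ["XXXXX", "met", "Tom", "."], "Anna"],
    [["Anna", "fished", "."], ["Anna", "XXXXX", "."], "fished"],
]

vocab = sorted(set(itertools.chain(*(c + q + [a] for c, q, a in samples))))
word2idx = {w: i + 1 for i, w in enumerate(vocab)}   # index 0 stays free for padding
vocab_size = len(vocab) + 1

encoded_question = [word2idx[w] for w in samples[0][1]]
print(word2idx)
print(encoded_question)
```

    Starting the indices at 1 is what allows 0 to be used as the padding value in the next step.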

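    0x3: Vector representation of the data

    The list of [context, question, answer] samples is split column-wise and, using the vocabulary indices, converted into matrices: X (context passages), Q (questions) and A (answers). Every element is an index that identifies a word in the vocabulary, and shorter rows are padded with 0.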

    # From keras, padding
    def pad_sequences(sequences, maxlen=None, dtype='int32',
                      padding='post', truncating='post', value=0.):
        lengths = [len(s) for s in sequences]
    
        nb_samples = len(sequences)
        if maxlen is None:
            maxlen = np.max(lengths)
    
        # take the sample shape from the first non empty sequence
        # checking for consistency in the main loop below.
        sample_shape = tuple()
        for s in sequences:
            if len(s) > 0:
                sample_shape = np.asarray(s).shape[1:]
                break
    
        x = (np.ones((nb_samples, maxlen) + sample_shape) * value).astype(dtype)
        for idx, s in enumerate(sequences):
            if len(s) == 0:
                continue  # empty list was found
            if truncating == 'pre':
                trunc = s[-maxlen:]
            elif truncating == 'post':
                trunc = s[:maxlen]
            else:
                raise ValueError('Truncating type "%s" not understood' % truncating)
    
            # check `trunc` has expected shape
            trunc = np.asarray(trunc, dtype=dtype)
            if trunc.shape[1:] != sample_shape:
                raise ValueError('Shape of sample %s of sequence at position %s is different from expected shape %s' %
                                 (trunc.shape[1:], idx, sample_shape))
    
            if padding == 'post':
                x[idx, :len(trunc)] = trunc
            elif padding == 'pre':
                x[idx, -len(trunc):] = trunc
            else:
                raise ValueError('Padding type "%s" not understood' % padding)
        return x
    
    
    # convert to vector
    def to_vector(data_file, output_file):
        word2idx, content_length, question_length, _ = pickle.load(open(train_vocab_data_file, "rb"))
    
        X = []
        Q = []
        A = []
        with open(data_file) as f_i:
            for line in f_i:
                line = ast.literal_eval(line.strip())
                x = [word2idx[w] for w in line[0]]
                q = [word2idx[w] for w in line[1]]
                a = [word2idx[line[2]]]
    
                X.append(x)
                Q.append(q)
                A.append(a)
    
        X = pad_sequences(X, content_length)
        Q = pad_sequences(Q, question_length)
    
        with open(output_file, "w") as f_o:
            for i in range(len(X)):
                f_o.write(str([X[i].tolist(), Q[i].tolist(), A[i]]))
                f_o.write('\n')
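
    A minimal NumPy sketch of the padding step: shorter index sequences are right-padded with 0 so that every row has the same length. The helper `pad_post` is a simplified stand-in written for this example; it covers only the `padding='post'`, `truncating='post'` case used above, and the index sequences are made up.

```python
import numpy as np

def pad_post(sequences, maxlen, value=0):
    """Right-pad (and right-truncate) integer sequences to a fixed length."""
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for row, seq in enumerate(sequences):
        trunc = seq[:maxlen]
        out[row, :len(trunc)] = trunc
    return out

X = pad_post([[3, 6, 2, 1], [2, 5, 1]], maxlen=5)
print(X)

# Non-zero entries mark real tokens, so the true lengths can be recovered later.
lengths = np.sign(np.abs(X)).sum(axis=1)
print(lengths)
```

    The training code recovers sequence lengths the same way, with `tf.sign(tf.abs(X))`, which only works because no real word has index 0.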

    0x4: code

    # -*- coding:utf-8 -*-
    
    import re
    import random
    import ast
    import itertools
    import pickle
    import numpy as np
    
    train_data_file = './CBTest/data/cbtest_NE_train.txt'
    train_data_token_file = 'train.data'
    
    valid_data_file = './CBTest/data/cbtest_NE_valid_2000ex.txt'
    valid_data_token_file = 'valid.data'
    
    train_vocab_data_file = 'vocab.data'
    
    train_vec_data_file = 'train.vec'
    valid_vec_data_file = 'valid.vec'
    
    
    def preprocess_data(data_file, out_file):
        # each entry: stories[x][0] = context sentences, stories[x][1] = question tokens, stories[x][2] = answer
        stories = []
        with open(data_file) as f:
            story = []
            for line in f:
                line = line.strip()
                if not line:
                    story = []
                else:
                    _, line = line.split(' ', 1)
                    if line:
                        if '\t' in line:
                            q, a, _, answers = line.split('\t')
                            # tokenize: split on runs of non-word characters and keep them as tokens
                            q = [s.strip() for s in re.split(r'(\W+)', q) if s.strip()]
                            stories.append((story, q, a))
                        else:
                            line = [s.strip() for s in re.split(r'(\W+)', line) if s.strip()]
                            story.append(line)
                        # print stories
    
        #print stories
        samples = []
        for story in stories:
            story_tmp = []
            content = []
            for c in story[0]:
                content += c
            story_tmp.append(content)
            story_tmp.append(story[1])
            story_tmp.append(story[2])
    
            samples.append(story_tmp)
    
        #print samples
    
        # shuffle the order of the reading/cloze samples
        random.shuffle(samples)
        print(len(samples))
    
        with open(out_file, "w") as f:
            for sample in samples:
                f.write(str(sample))
                f.write('\n')
    
    
    # load the tokenized samples (one Python list literal per line)
    def read_data(data_file):
        stories = []
        with open(data_file) as f:
            for line in f:
                line = ast.literal_eval(line.strip())
                stories.append(line)
        return stories
    
    
    # From keras, padding
    def pad_sequences(sequences, maxlen=None, dtype='int32',
                      padding='post', truncating='post', value=0.):
        lengths = [len(s) for s in sequences]
    
        nb_samples = len(sequences)
        if maxlen is None:
            maxlen = np.max(lengths)
    
        # take the sample shape from the first non empty sequence
        # checking for consistency in the main loop below.
        sample_shape = tuple()
        for s in sequences:
            if len(s) > 0:
                sample_shape = np.asarray(s).shape[1:]
                break
    
        x = (np.ones((nb_samples, maxlen) + sample_shape) * value).astype(dtype)
        for idx, s in enumerate(sequences):
            if len(s) == 0:
                continue  # empty list was found
            if truncating == 'pre':
                trunc = s[-maxlen:]
            elif truncating == 'post':
                trunc = s[:maxlen]
            else:
                raise ValueError('Truncating type "%s" not understood' % truncating)
    
            # check `trunc` has expected shape
            trunc = np.asarray(trunc, dtype=dtype)
            if trunc.shape[1:] != sample_shape:
                raise ValueError('Shape of sample %s of sequence at position %s is different from expected shape %s' %
                                 (trunc.shape[1:], idx, sample_shape))
    
            if padding == 'post':
                x[idx, :len(trunc)] = trunc
            elif padding == 'pre':
                x[idx, -len(trunc):] = trunc
            else:
                raise ValueError('Padding type "%s" not understood' % padding)
        return x
    
    
    # convert to vector
    def to_vector(data_file, output_file):
        word2idx, content_length, question_length, _ = pickle.load(open(train_vocab_data_file, "rb"))
    
        X = []
        Q = []
        A = []
        with open(data_file) as f_i:
            for line in f_i:
                line = ast.literal_eval(line.strip())
                x = [word2idx[w] for w in line[0]]
                q = [word2idx[w] for w in line[1]]
                a = [word2idx[line[2]]]
    
                X.append(x)
                Q.append(q)
                A.append(a)
    
        X = pad_sequences(X, content_length)
        Q = pad_sequences(Q, question_length)
    
        with open(output_file, "w") as f_o:
            for i in range(len(X)):
                f_o.write(str([X[i].tolist(), Q[i].tolist(), A[i]]))
                f_o.write('\n')
    
    
    if __name__ == "__main__":
        preprocess_data(train_data_file, train_data_token_file)
        preprocess_data(valid_data_file, valid_data_token_file)
    
        # generate word vocabulary table
        stories = read_data(train_data_token_file) + read_data(valid_data_token_file)
    
        content_length = max([len(s) for s, _, _ in stories])
        question_length = max([len(q) for _, q, _ in stories])
        print(content_length, question_length)
    
        vocab = sorted(set(itertools.chain(*(story + q + [answer] for story, q, answer in stories))))
        vocab_size = len(vocab) + 1
        print(vocab_size)
        word2idx = dict((w, i + 1) for i, w in enumerate(vocab))
        pickle.dump((word2idx, content_length, question_length, vocab_size), open(train_vocab_data_file, "wb"))
    
        to_vector(train_data_token_file, train_vec_data_file)
        to_vector(valid_data_token_file, valid_vec_data_file)

    3. Training

    0x1: Word Embeddings

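    Vector space models (VSMs) represent (embed) words in a continuous vector space in which semantically similar words are mapped to nearby points. VSMs have a long, rich history in natural language processing, and nearly all methods built on them rely on the distributional hypothesis, which states that words appearing in the same contexts share similar meaning. Approaches that use this hypothesis fall into two broad categories: count-based methods (e.g. latent semantic analysis) and predictive methods (e.g. neural probabilistic language models).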
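
    In code, an embedding is just a matrix with one row per vocabulary index, and a lookup is row selection. The sketch below uses NumPy with toy sizes and random values (both assumptions for illustration), so the similarity it prints carries no meaning until the matrix has been trained.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embedding_dim = 7, 4              # toy sizes, chosen for illustration
embeddings = rng.normal(scale=0.22, size=(vocab_size, embedding_dim))

word_ids = np.array([4, 6, 3, 1, 0])          # one padded index sequence
embedded = embeddings[word_ids]               # lookup = row selection, shape (5, 4)

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(embedded.shape)
print(cosine(embeddings[2], embeddings[3]))   # arbitrary until the embeddings are trained
```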

    0x2: Define the loss function

    loss = -tf.reduce_mean(
            tf.log(tf.reduce_sum(tf.to_float(tf.equal(tf.expand_dims(A, -1), X)) * X_attentions, 1) + tf.constant(0.00001)))

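    During training, the answer the model derives from the context X should agree with the labeled answer in the dataset (just as an image classification sample has only one correct label). The loss is the negative log of the attention mass that the model places on the passage positions holding the correct answer.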
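
    The same computation can be followed step by step in NumPy. The token ids and attention values below are invented for illustration and are not model output.

```python
import numpy as np

X = np.array([[3, 2, 6, 2, 0],      # passage token ids, 0 = padding
              [5, 1, 4, 0, 0]])
A = np.array([2, 4])                # correct answer id per sample
attn = np.array([[0.1, 0.4, 0.2, 0.3, 0.0],
                 [0.5, 0.2, 0.3, 0.0, 0.0]])

mask = (X == A[:, None]).astype(float)        # 1 where the passage token is the answer
p_answer = (mask * attn).sum(axis=1)          # attention mass on the answer word
loss = -np.mean(np.log(p_answer + 1e-5))      # small epsilon avoids log(0)
print(p_answer, loss)
```

    In the first sample the answer word occurs twice, so both of its attention weights count toward the probability of the answer.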

    0x3: Optimizer

    Here the Adam optimizer is used to update the model parameters step by step.

    optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
    grads_and_vars = optimizer.compute_gradients(loss)
    capped_grads_and_vars = [(tf.clip_by_norm(g, 5), v) for g, v in grads_and_vars]
    train_op = optimizer.apply_gradients(capped_grads_and_vars)

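    TensorFlow's optimizers are very easy to use because they are wrapped at a high level: we only instantiate the class we want and pass in the parameters. In this snippet the gradients are additionally clipped to an L2 norm of at most 5 before they are applied.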
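
    The `tf.clip_by_norm(g, 5)` call in the snippet rescales each gradient tensor whose L2 norm exceeds 5. A NumPy sketch of that behaviour (an illustration of the idea, not TensorFlow's implementation):

```python
import numpy as np

def clip_by_norm(g, clip_norm):
    """Rescale g so its L2 norm is at most clip_norm; the direction is unchanged."""
    norm = np.sqrt(np.sum(g * g))
    if norm <= clip_norm:
        return g
    return g * (clip_norm / norm)

big = np.array([30.0, 40.0])        # L2 norm 50
small = np.array([0.3, 0.4])        # L2 norm 0.5
print(clip_by_norm(big, 5.0))       # rescaled to norm 5
print(clip_by_norm(small, 5.0))     # left as is
```

    Clipping per tensor like this limits the size of a single update, which helps with the exploding gradients that recurrent networks are prone to.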

    0x4: Dropout

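    To reduce overfitting, we add dropout before the output layer. A placeholder holds the probability that a neuron's output is kept during dropout, which lets us enable dropout during training and turn it off during testing. Besides masking neuron outputs, TensorFlow's tf.nn.dropout op also rescales the kept outputs automatically, so no extra scaling is needed when using dropout. (The two lines below follow the MNIST tutorial; in the full code, dropout is applied to the embeddings, the attention inputs and the gates.)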

    keep_prob = tf.placeholder("float")
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
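
    The automatic rescaling can be seen in a small NumPy sketch of dropout with a keep probability. The function below is written for this example and is not TensorFlow's code: each element is kept with probability `keep_prob` and kept values are divided by `keep_prob`, so the expected activation stays the same.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Keep each element with probability keep_prob and scale kept values by 1/keep_prob."""
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)
x = np.ones(100000)
y = dropout(x, keep_prob=0.7, rng=rng)
print(y.mean())          # close to 1.0: the expected activation is preserved
print(np.unique(y))      # only 0 and 1/0.7 occur
```

    With `keep_prob` set to 1.0 nothing is dropped and nothing is rescaled, which is why the same graph can be used unchanged at test time.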

    0x5: code

    # -*- coding: utf-8 -*-
    
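    # The API calls below match TensorFlow 0.12. In TensorFlow 1.x, tf.batch_matmul becomes tf.matmul,
    # tf.concat(2, values) becomes tf.concat(values, 2), and tf.concat_v2 becomes tf.concat.
    import tensorflow as tf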
    import pickle
    import numpy as np
    import ast
    from collections import defaultdict
    
    train_vec_data_file = 'train.vec'
    valid_vec_data_file = 'valid.vec'
    
    
    train_vocab_data_file = 'vocab.data'
    
    
    def get_next_batch():
        X = []
        Q = []
        A = []
        for i in range(batch_size):
            for line in train_file:
                line = ast.literal_eval(line.strip())
                X.append(line[0])
                Q.append(line[1])
                A.append(line[2][0])
                break
    
        if len(X) == batch_size:
            return X, Q, A
        else:
            train_file.seek(0)
            return get_next_batch()
    
    
    def get_test_batch():
        with open(valid_vec_data_file) as f:
            X = []
            Q = []
            A = []
            for line in f:
                line = ast.literal_eval(line.strip())
                X.append(line[0])
                Q.append(line[1])
                A.append(line[2][0])
            return X, Q, A
    
    
    def glimpse(weights, bias, encodings, inputs):
        weights = tf.nn.dropout(weights, keep_prob)
        inputs = tf.nn.dropout(inputs, keep_prob)
        attention = tf.transpose(tf.matmul(weights, tf.transpose(inputs)) + bias)
        attention = tf.batch_matmul(encodings, tf.expand_dims(attention, -1))
        attention = tf.nn.softmax(tf.squeeze(attention, -1))
        return attention, tf.reduce_sum(tf.expand_dims(attention, -1) * encodings, 1)
    
    
    def neural_attention(embedding_dim=384, encoding_dim=128):
        embeddings = tf.Variable(tf.random_normal([vocab_size, embedding_dim], stddev=0.22), dtype=tf.float32)
        tf.contrib.layers.apply_regularization(tf.contrib.layers.l2_regularizer(1e-4), [embeddings])
    
        with tf.variable_scope('encode'):
            with tf.variable_scope('X'):
                X_lens = tf.reduce_sum(tf.sign(tf.abs(X)), 1)
                embedded_X = tf.nn.embedding_lookup(embeddings, X)
                encoded_X = tf.nn.dropout(embedded_X, keep_prob)
                gru_cell = tf.nn.rnn_cell.GRUCell(encoding_dim)
                outputs, output_states = tf.nn.bidirectional_dynamic_rnn(gru_cell, gru_cell, encoded_X,
                                                                         sequence_length=X_lens, dtype=tf.float32,
                                                                         swap_memory=True)
                encoded_X = tf.concat(2, outputs)
            with tf.variable_scope('Q'):
                Q_lens = tf.reduce_sum(tf.sign(tf.abs(Q)), 1)
                embedded_Q = tf.nn.embedding_lookup(embeddings, Q)
                encoded_Q = tf.nn.dropout(embedded_Q, keep_prob)
                gru_cell = tf.nn.rnn_cell.GRUCell(encoding_dim)
                outputs, output_states = tf.nn.bidirectional_dynamic_rnn(gru_cell, gru_cell, encoded_Q,
                                                                         sequence_length=Q_lens, dtype=tf.float32,
                                                                         swap_memory=True)
                encoded_Q = tf.concat(2, outputs)
    
        W_q = tf.Variable(tf.random_normal([2 * encoding_dim, 4 * encoding_dim], stddev=0.22), dtype=tf.float32)
        b_q = tf.Variable(tf.random_normal([2 * encoding_dim, 1], stddev=0.22), dtype=tf.float32)
        W_d = tf.Variable(tf.random_normal([2 * encoding_dim, 6 * encoding_dim], stddev=0.22), dtype=tf.float32)
        b_d = tf.Variable(tf.random_normal([2 * encoding_dim, 1], stddev=0.22), dtype=tf.float32)
        g_q = tf.Variable(tf.random_normal([10 * encoding_dim, 2 * encoding_dim], stddev=0.22), dtype=tf.float32)
        g_d = tf.Variable(tf.random_normal([10 * encoding_dim, 2 * encoding_dim], stddev=0.22), dtype=tf.float32)
    
        with tf.variable_scope('attend') as scope:
            infer_gru = tf.nn.rnn_cell.GRUCell(4 * encoding_dim)
            infer_state = infer_gru.zero_state(batch_size, tf.float32)
            for iter_step in range(8):
                if iter_step > 0:
                    scope.reuse_variables()
    
                _, q_glimpse = glimpse(W_q, b_q, encoded_Q, infer_state)
                d_attention, d_glimpse = glimpse(W_d, b_d, encoded_X, tf.concat_v2([infer_state, q_glimpse], 1))
    
                gate_concat = tf.concat_v2([infer_state, q_glimpse, d_glimpse, q_glimpse * d_glimpse], 1)
    
                r_d = tf.sigmoid(tf.matmul(gate_concat, g_d))
                r_d = tf.nn.dropout(r_d, keep_prob)
                r_q = tf.sigmoid(tf.matmul(gate_concat, g_q))
                r_q = tf.nn.dropout(r_q, keep_prob)
    
                combined_gated_glimpse = tf.concat_v2([r_q * q_glimpse, r_d * d_glimpse], 1)
                _, infer_state = infer_gru(combined_gated_glimpse, infer_state)
    
        return tf.to_float(tf.sign(tf.abs(X))) * d_attention
    
    
    def train_neural_attention():
        X_attentions = neural_attention()
        loss = -tf.reduce_mean(
            tf.log(tf.reduce_sum(tf.to_float(tf.equal(tf.expand_dims(A, -1), X)) * X_attentions, 1) + tf.constant(0.00001)))
    
        optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
        grads_and_vars = optimizer.compute_gradients(loss)
        capped_grads_and_vars = [(tf.clip_by_norm(g, 5), v) for g, v in grads_and_vars]
        train_op = optimizer.apply_gradients(capped_grads_and_vars)
    
        saver = tf.train.Saver()
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
    
            # writer = tf.summary.FileWriter()
            # restore the previous training run, if a checkpoint exists
            ckpt = tf.train.get_checkpoint_state('.')
            if ckpt is not None:
                print(ckpt.model_checkpoint_path)
                saver.restore(sess, ckpt.model_checkpoint_path)
            else:
                print("checkpoint not found!!")
    
            for step in range(20000):
                train_x, train_q, train_a = get_next_batch()
                loss_, _ = sess.run([loss, train_op], feed_dict={X: train_x, Q: train_q, A: train_a, keep_prob: 0.7})
                print(loss_)
    
                # save the model and compute the accuracy
                if step % 1000 == 0:
                    path = saver.save(sess, 'machine_reading.model', global_step=step)
                    print(path)
    
                    test_x, test_q, test_a = get_test_batch()
                    test_x, test_q, test_a = np.array(test_x[:batch_size]), np.array(test_q[:batch_size]), np.array(
                        test_a[:batch_size])
                    attentions = sess.run(X_attentions, feed_dict={X: test_x, Q: test_q, keep_prob: 1.})
                    correct_count = 0
                    for x in range(test_x.shape[0]):
                        probs = defaultdict(int)
                        for idx, word in enumerate(test_x[x, :]):
                            probs[word] += attentions[x, idx]
                        guess = max(probs, key=probs.get)
                        if guess == test_a[x]:
                            correct_count += 1
                    # float() keeps this a true division under Python 2 as well
                    print(float(correct_count) / test_x.shape[0])
    
    
    
    # load the vocabulary
    word2idx, content_length, question_length, vocab_size = pickle.load(open(train_vocab_data_file, "rb"))
    print(content_length, question_length, vocab_size)
    #print word2idx
    
    batch_size = 64
    
    train_file = open(train_vec_data_file)
    
    X = tf.placeholder(tf.int32, [batch_size, content_length])  # passage (context)
    Q = tf.placeholder(tf.int32, [batch_size, question_length])  # question
    A = tf.placeholder(tf.int32, [batch_size])  # answer
    
    # drop out
    keep_prob = tf.placeholder(tf.float32)
    
    train_neural_attention()

    Relevant Link:

    http://docs.pythontab.com/tensorflow/tutorials/word2vec/
    http://blog.csdn.net/lenbow/article/details/52218551
    http://wiki.jikexueyuan.com/project/tensorflow-zh/get_started/basic_usage.html
    http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_tf.html
    http://wiki.jikexueyuan.com/project/tensorflow-zh/how_tos/variable_scope.html
    http://www.jianshu.com/p/45dbfe5809d4
    https://www.zhihu.com/question/51325408
    http://www.jianshu.com/p/c9f66bc8f96c

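    4. Testing the trained model: filling in real cloze questions

    Validating the model means feeding the network input in the same format as the training data and taking the output with the highest predicted probability.

    0x1: Test passage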

    We did manage to get the taffy made but before we could sample the result satisfactorily , and just as the girls were finishing with the washing of the dishes , Felicity glanced out of the window and exclaimed in tones of dismay , `` Oh , dear me , here ' s Great - aunt Eliza coming up the lane ! Now , is n ' t that too mean ? '' We all looked out to see a tall , gray - haired lady approaching the house , looking about her with the slightly puzzled air of a stranger . We had been expecting Great - aunt Eliza ' s advent for some weeks , for she was visiting relatives in Markdale . We knew she was liable to pounce down on us any time , being one of those delightful folk who like to `` surprise '' people , but we had never thought of her coming that particular day . It must be confessed that we did not look forward to her visit with any pleasure . None of us had ever seen her , but we knew she was very deaf , and had very decided opinions as to the way in which children should behave . `` Whew ! '' whistled Dan . `` We ' re in for a jolly afternoon . She ' s deaf as a post and we ' ll have to split our throats to make her hear at all . I ' ve a notion to skin out . '' `` Oh , do n ' t talk like that , Dan , '' said Cecily reproachfully . `` She ' s old and lonely and has had a great deal of trouble . She has buried three husbands . We must be kind to her and do the best we can to make her visit pleasant . '' `` She ' s coming to the back door , '' said Felicity , with an agitated glance around the kitchen . `` I told you , Dan , that you should have shovelled the snow away from the front door this morning . Cecily , set those pots in the pantry quick -- hide those boots , Felix -- shut the cupboard door , Peter -- Sara , straighten up the lounge . She ' s awfully particular and ma says her house is always as neat as wax . ''

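    The test data is fed through the model in the same way as during training. The index with the highest probability is the model's predicted answer for the blank.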
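
    Picking the answer from the attention weights can be sketched as follows. This mirrors the accuracy loop in the training code: the weights of every occurrence of a word are summed and the word with the largest total wins. The ids, the `idx2word` table and the attention values are hypothetical, not real model output.

```python
from collections import defaultdict
import numpy as np

# Hypothetical vocabulary and attention weights for one passage.
idx2word = {1: ".", 2: "Anna", 3: "Tom", 4: "XXXXX", 5: "fished", 6: "met"}
passage = np.array([3, 6, 2, 1, 2, 5, 1, 0])          # 0 = padding
attention = np.array([0.20, 0.05, 0.25, 0.02, 0.30, 0.15, 0.03, 0.0])

scores = defaultdict(float)
for word_id, weight in zip(passage.tolist(), attention.tolist()):
    if word_id != 0:                    # ignore padding positions
        scores[word_id] += weight       # a word's score is the sum over its occurrences

guess = max(scores, key=scores.get)
print(idx2word[guess], scores[guess])
```

    Because the score is a sum over positions, a word that appears several times in the passage can win even if no single occurrence has the highest weight.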

    Copyright (c) 2017 LittleHann All rights reserved

  • Original article: https://www.cnblogs.com/LittleHann/p/6429561.html