• [Kaggle] dogs-vs-cats: Building the Model


    Build the neural network model. The model to be built is shown below:

    (Source of the figure above: open TensorBoard while training the network to view the network structure; this is covered in the next section on model training.)

    The concrete steps are as follows:

    Step 0: Import the required libraries

    import tensorflow as tf  # TensorFlow 1.x API

    Step 1: Define the network structure

    def inference(images, batch_size, n_classes):
        '''Build the model
        Args:
            images: image batch, 4D tensor, tf.float32, [batch_size, width, height, channels]
            batch_size: number of images per batch
            n_classes: number of output classes
        Returns:
            output tensor with the computed logits, float, [batch_size, n_classes]
        '''
        # conv1: convolutional layer 1
        # weights shape = [kernel size, kernel size, input channels, number of kernels]
        with tf.variable_scope('conv1') as scope:
            # initialize the layer's variables
            weights = tf.get_variable('weights', 
                                      shape = [3,3,3, 16],
                                      dtype = tf.float32, 
                                      initializer=tf.truncated_normal_initializer(stddev=0.1,dtype=tf.float32))
            biases = tf.get_variable('biases', 
                                     shape=[16],
                                     dtype=tf.float32,
                                     initializer=tf.constant_initializer(0.1))
            conv = tf.nn.conv2d(images, weights, strides=[1,1,1,1], padding='SAME')
            pre_activation = tf.nn.bias_add(conv, biases)  # add the bias
            conv1 = tf.nn.relu(pre_activation, name=scope.name)  # ReLU activation
        
        #pool1 and norm1   
        with tf.variable_scope('pooling1_lrn') as scope:
            pool1 = tf.nn.max_pool(conv1, ksize=[1,3,3,1],strides=[1,2,2,1],
                                   padding='SAME', name='pooling1')
            norm1 = tf.nn.lrn(pool1, depth_radius=4, bias=1.0, alpha=0.001/9.0,
                              beta=0.75,name='norm1')
        
        #conv2
        with tf.variable_scope('conv2') as scope:
            weights = tf.get_variable('weights',
                                      shape=[3,3,16,16],
                                      dtype=tf.float32,
                                      initializer=tf.truncated_normal_initializer(stddev=0.1,dtype=tf.float32))
            biases = tf.get_variable('biases',
                                     shape=[16], 
                                     dtype=tf.float32,
                                     initializer=tf.constant_initializer(0.1))
            conv = tf.nn.conv2d(norm1, weights, strides=[1,1,1,1],padding='SAME')
            pre_activation = tf.nn.bias_add(conv, biases)
            conv2 = tf.nn.relu(pre_activation, name='conv2')
        
        
        #pool2 and norm2
        with tf.variable_scope('pooling2_lrn') as scope:
            norm2 = tf.nn.lrn(conv2, depth_radius=4, bias=1.0, alpha=0.001/9.0,
                              beta=0.75,name='norm2')
            pool2 = tf.nn.max_pool(norm2, ksize=[1,3,3,1], strides=[1,1,1,1],
                                   padding='SAME',name='pooling2')
        
        
        #local3: fully connected layer 1
        with tf.variable_scope('local3') as scope:
            reshape = tf.reshape(pool2, shape=[batch_size, -1])  # flatten each example to one dimension
            dim = reshape.get_shape()[1].value  # length of the second dimension
            weights = tf.get_variable('weights',
                                      shape=[dim,128],
                                      dtype=tf.float32,
                                      initializer=tf.truncated_normal_initializer(stddev=0.005,dtype=tf.float32))
            biases = tf.get_variable('biases',
                                     shape=[128],
                                     dtype=tf.float32, 
                                     initializer=tf.constant_initializer(0.1))
            local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)    
        
        #local4
        with tf.variable_scope('local4') as scope:
            weights = tf.get_variable('weights',
                                      shape=[128,128],
                                      dtype=tf.float32, 
                                      initializer=tf.truncated_normal_initializer(stddev=0.005,dtype=tf.float32))
            biases = tf.get_variable('biases',
                                     shape=[128],
                                     dtype=tf.float32,
                                     initializer=tf.constant_initializer(0.1))
            local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name='local4')
         
            
        # softmax
        with tf.variable_scope('softmax_linear') as scope:
            weights = tf.get_variable('softmax_linear',
                                      shape=[128, n_classes],
                                      dtype=tf.float32,
                                      initializer=tf.truncated_normal_initializer(stddev=0.005,dtype=tf.float32))
            biases = tf.get_variable('biases', 
                                     shape=[n_classes],
                                     dtype=tf.float32, 
                                     initializer=tf.constant_initializer(0.1))
            softmax_linear = tf.add(tf.matmul(local4, weights), biases, name='softmax_linear')
        
        return softmax_linear

    Function notes:

    1) tf.variable_scope

    Specifies a namespace for variables created with tf.get_variable().

    2) tf.get_variable

    Creates a new variable, or returns an existing one, with the given name.

    For details on these two functions, see Sharing Variables: http://wiki.jikexueyuan.com/project/tensorflow-zh/how_tos/variable_scope.html
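
    As a minimal sketch of variable sharing with these two functions (the scope name 'my_scope' and variable name 'w' are illustrative, not part of the model above):

    import tensorflow as tf

    with tf.variable_scope('my_scope'):
        w = tf.get_variable('w', shape=[2], initializer=tf.zeros_initializer())

    with tf.variable_scope('my_scope', reuse=True):
        w2 = tf.get_variable('w')  # returns the existing variable instead of creating a new one

    print(w.name, w2.name)  # both print my_scope/w:0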

    3)tf.nn.conv2d

    conv2d(
        input,
        filter,
        strides,
        padding,
        use_cudnn_on_gpu=True,
        data_format='NHWC',
        name=None
    )

    Purpose: performs a 2-D convolution of a given 4-D input with a filter (kernel).

    The input tensor has shape [batch, in_height, in_width, in_channels]; the filter/kernel has shape [filter_height, filter_width, in_channels, out_channels].

    strides is usually [1, stride, stride, 1]; padding takes the value "SAME" or "VALID".
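
    A quick sketch of the shape arithmetic (the 208x208 input size here is illustrative, not mandated by the model above):

    import tensorflow as tf

    images = tf.placeholder(tf.float32, [None, 208, 208, 3])  # NHWC input batch
    kernel = tf.get_variable('kernel', shape=[3, 3, 3, 16])   # [h, w, in_channels, out_channels]
    conv = tf.nn.conv2d(images, kernel, strides=[1, 1, 1, 1], padding='SAME')
    print(conv.get_shape())  # (?, 208, 208, 16): 'SAME' padding with stride 1 preserves height and width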

    4)tf.nn.bias_add

    bias_add(
        value,
        bias,
        data_format=None,
        name=None
    )

    Purpose: adds bias to value.

    tf.nn.bias_add is a special case of tf.add: bias must be 1-D, with its size matching the last dimension of value. tf.add supports everything bias_add does and more, including general broadcasting between the two operands.
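
    A small sketch contrasting the two ops (the values are made up for illustration):

    import tensorflow as tf

    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # shape [2, 2]
    b = tf.constant([10.0, 20.0])              # shape [2], matches the last dimension of x

    with tf.Session() as sess:
        print(sess.run(tf.nn.bias_add(x, b)))  # [[11. 22.] [13. 24.]]
        print(sess.run(tf.add(x, b)))          # same result here, via broadcasting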

    5)tf.nn.relu

    relu(
        features,
        name=None
    )

    Purpose: computes the ReLU activation, i.e. max(features, 0).
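
    A one-line check (the values are made up):

    import tensorflow as tf

    x = tf.constant([-2.0, 0.0, 3.0])
    with tf.Session() as sess:
        print(sess.run(tf.nn.relu(x)))  # [0. 0. 3.]: negative entries are clamped to zero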

    6)tf.nn.max_pool

    max_pool(
        value,
        ksize,
        strides,
        padding,
        data_format='NHWC',
        name=None
    )

    Purpose: computes the maximum of elements within each pooling window.

    Parameters:

    value: a 4-D Tensor with shape [batch, height, width, channels] and type float32, float64, qint8, quint8, or qint32.

    ksize: a list of ints with length >= 4; the size of the pooling window for each dimension of the input tensor.

    strides: a list of ints with length >= 4; the stride of the sliding window for each dimension of the input tensor.

    padding: a string, either 'SAME' or 'VALID'.

    name: (optional) a name for the operation.

    7)tf.nn.lrn

    Purpose: local response normalization.
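
    A minimal sketch of the pool1/norm1 stage above (the 208x208 input size is illustrative):

    import tensorflow as tf

    conv1 = tf.placeholder(tf.float32, [None, 208, 208, 16])
    pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='SAME')
    norm1 = tf.nn.lrn(pool1, depth_radius=4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)
    print(pool1.get_shape())  # (?, 104, 104, 16): stride 2 halves height and width
    print(norm1.get_shape())  # (?, 104, 104, 16): LRN preserves the shape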

    Step 2: Define the loss function

    def losses(logits, labels):
        '''Compute loss from logits and labels
        Args:
            logits: logits tensor, float, [batch_size, n_classes]
            labels: label tensor, tf.int32, [batch_size]
            
        Returns:
            loss tensor of float type
        '''
        with tf.variable_scope('loss') as scope:  # 'sparse' means labels need no one-hot encoding
            cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
                logits=logits, labels=labels, name='xentropy_per_example')
            loss = tf.reduce_mean(cross_entropy, name='loss')
            tf.summary.scalar(scope.name+'/loss', loss)
        return loss

    Function notes:

    1) tf.nn.sparse_softmax_cross_entropy_with_logits

    sparse_softmax_cross_entropy_with_logits(
        _sentinel=None,
        labels=None,
        logits=None,
        name=None
    )

    Purpose: computes the softmax cross entropy between logits and labels.

    The first argument, logits, is the output of the last layer of the network; with a batch, its shape is [batch_size, num_classes], and for a single sample it is [num_classes].

    The second argument, labels: in softmax_cross_entropy_with_logits, labels must also have shape [batch_size, num_classes], otherwise the cross entropy cannot be computed; this sparse variant instead takes the more restrictive shape [batch_size], where each value must be an int32 or int64 class index counted from 0, in the range [0, num_classes).
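
    A small sketch of the sparse variant (the logits and labels are made up; note the labels are plain class indices, not one-hot vectors):

    import tensorflow as tf

    logits = tf.constant([[2.0, 0.5], [0.1, 3.0]])  # [batch_size=2, num_classes=2]
    labels = tf.constant([0, 1])                    # int32 class indices in [0, 2)
    xent = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels)
    loss = tf.reduce_mean(xent)  # scalar mean loss, as in losses() above

    with tf.Session() as sess:
        print(sess.run(xent))  # per-example cross entropy, shape [2]
        print(sess.run(loss))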

    2) tf.reduce_mean

    Purpose: computes the mean of the input tensor's elements.

    3)tf.summary.scalar

    scalar(
        name,
        tensor,
        collections=None,
        family=None
    )

    Purpose: outputs a Summary protocol buffer containing a single scalar value.
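
    A hedged sketch of how such summaries are typically written out for TensorBoard (the './logs' directory is illustrative):

    import tensorflow as tf

    loss = tf.constant(0.5)
    tf.summary.scalar('loss', loss)
    summary_op = tf.summary.merge_all()  # merges all summaries in the graph

    with tf.Session() as sess:
        writer = tf.summary.FileWriter('./logs', sess.graph)
        writer.add_summary(sess.run(summary_op), global_step=0)
        writer.close()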

    Step 3: Define the training op

    def trainning(loss, learning_rate):
        '''Training op: the op returned by this function is what must be passed to
            the 'sess.run()' call to cause the model to train.

        Args:
            loss: loss tensor, from losses()
            learning_rate: learning rate used by the optimizer

        Returns:
            train_op: the op for training
        '''
        with tf.name_scope('optimizer'):
            optimizer = tf.train.AdamOptimizer(learning_rate= learning_rate)
            global_step = tf.Variable(0, name='global_step', trainable=False)
            train_op = optimizer.minimize(loss, global_step= global_step)
        return train_op

    Function notes:

    1)tf.train.AdamOptimizer

    __init__(
        learning_rate=0.001,
        beta1=0.9,
        beta2=0.999,
        epsilon=1e-08,
        use_locking=False,
        name='Adam'
    )

    Purpose: an optimizer that uses the Adam algorithm to update variables.

    2)optimizer.minimize

    minimize(
        loss,
        global_step=None,
        var_list=None,
        gate_gradients=GATE_OP,
        aggregation_method=None,
        colocate_gradients_with_ops=False,
        name=None,
        grad_loss=None
    )

    Purpose: minimizes loss by computing gradients and applying variable updates.

    global_step: Optional Variable to increment by one after the variables have been updated.
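
    A minimal sketch showing that minimize() increments global_step once per training step (the toy loss is illustrative):

    import tensorflow as tf

    w = tf.Variable(3.0)
    loss = tf.square(w)
    global_step = tf.Variable(0, name='global_step', trainable=False)
    train_op = tf.train.AdamOptimizer(0.001).minimize(loss, global_step=global_step)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        sess.run(train_op)
        print(sess.run(global_step))  # 1: incremented by the training step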

    Step 4: Define the evaluation method

    def evaluation(logits, labels):
      """Evaluate the quality of the logits at predicting the label.
      Args:
        logits: Logits tensor, float - [batch_size, NUM_CLASSES].
        labels: Labels tensor, int32 - [batch_size], with values in the
          range [0, NUM_CLASSES).
      Returns:
        A scalar float tensor with the fraction of examples (out of batch_size)
        that were predicted correctly.
      """
      with tf.variable_scope('accuracy') as scope:
          correct = tf.nn.in_top_k(logits, labels, 1)
          correct = tf.cast(correct, tf.float16)
          accuracy = tf.reduce_mean(correct)
          tf.summary.scalar(scope.name+'/accuracy', accuracy)
      return accuracy

    Function notes:

    1)tf.nn.in_top_k

    in_top_k(
        predictions,
        targets,
        k,
        name=None
    )

    Purpose: returns whether each target is among the top k predictions, as True or False.

    predictions: a float32 Tensor of shape batch_size x classes.

    targets: a Tensor of type int32 or int64; a vector of class ids of length batch_size.
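
    A short sketch with made-up values:

    import tensorflow as tf

    logits = tf.constant([[0.1, 0.9], [0.8, 0.2]])  # [batch_size=2, classes=2]
    labels = tf.constant([1, 1])                    # true class ids
    with tf.Session() as sess:
        print(sess.run(tf.nn.in_top_k(logits, labels, 1)))  # [ True False]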

    Notes:

    The code comes from https://github.com/kevin28520/My-TensorFlow-tutorials, with minor modifications.

    Function descriptions are based mainly on the official TensorFlow documentation: https://www.tensorflow.org/versions/master/api_docs/

  • Original post: https://www.cnblogs.com/hejunlin1992/p/7613325.html