• Training an object detector with tensorflow + ssd_mobilenet


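    This post uses the tensorflow object detection API on Ubuntu to train a detector on a custom dataset. The model used is ssd_mobilenet, but other models work too. Training on Windows is also possible: the code is almost identical, and the main difference is the environment setup, which is tedious on both Windows and Linux.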

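    I. Set up the environment

    1. Download the models repository from GitHub: https://github.com/tensorflow/models

    2. Install pillow and lxml. Jupyter Notebook and matplotlib are not covered here, since you almost certainly have them installed already.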

    pip install pillow
    pip install lxml

    3. Build protobuf. The object detection API uses protobuf to configure the model and the training parameters, so protobuf has to be built first. Download: https://github.com/google/protobuf/releases

    For Windows setup, see: https://blog.csdn.net/dy_guox/article/details/79081499

    On Ubuntu:

    Download protobuf-all-3.7.0.tar.gz from the releases page above.

    Then run:

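    tar zxvf protobuf-all-3.7.0.tar.gz
    cd protobuf-3.7.0/
    ./autogen.sh
    ./configure --prefix=/usr/local/protobuf
    make
    make check          # reports errors, but the build is still usable
    sudo make install   # installs into the --prefix directory set above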

    Edit the configuration files

    (1) vim /etc/profile and add

         export PATH=$PATH:/usr/local/protobuf/bin/
         export PKG_CONFIG_PATH=/usr/local/protobuf/lib/pkgconfig/

        Save, then run source /etc/profile
        vim ~/.profile and add the same two lines; otherwise the logged-in user cannot find the protoc command:
          export PATH=$PATH:/usr/local/protobuf/bin/
          export PKG_CONFIG_PATH=/usr/local/protobuf/lib/pkgconfig/
    (2) Configure the dynamic linker
        vim /etc/ld.so.conf and add on a new line
        /usr/local/protobuf/lib
        then run sudo ldconfig

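    Installation is complete.

    Run protoc --version to print the libprotoc version. The output recorded below is 2.6.1 rather than the 3.7.0 just built, which suggests that an older system protoc earlier on PATH was picked up (the exports above append the new bin directory to the end of PATH):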
    libprotoc 2.6.1
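
    To confirm which protoc the shell actually resolves, the check can also be done from Python. This is a minimal sketch, not part of the original post: the helper names parse_protoc_version and installed_protoc_version are made up for illustration, and it only assumes that protoc --version prints a line of the form "libprotoc X.Y.Z", as in the output above.

```python
import re
import shutil
import subprocess


def parse_protoc_version(text):
    """Turn a line such as 'libprotoc 3.7.0' into a tuple of ints, e.g. (3, 7, 0)."""
    match = re.search(r"libprotoc\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("unexpected protoc output: {!r}".format(text))
    return tuple(int(part) for part in match.group(1).split("."))


def installed_protoc_version():
    """Return (path, version) of the first protoc found on PATH, or None if there is none."""
    path = shutil.which("protoc")
    if path is None:
        return None
    output = subprocess.run([path, "--version"], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True).stdout
    return path, parse_protoc_version(output)
```

    Calling installed_protoc_version() returns both the resolved path and the parsed version, so a stale /usr/bin/protoc shadowing /usr/local/protobuf/bin/protoc is easy to spot.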

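    II. Build your own VOC dataset on Ubuntu first. The process is described in:

    Building your own VOC dataset on Ubuntu

    III. Convert the samples to TFRecord format

    1. Create train_test_split.py, which splits the xml annotation files into train, test and validation sets and stores them in sub-folders of Annotations. The train+validation set takes 80% of the files and the test set takes 20%; the training set is 80% of the train+validation set. The code: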

    import os
    import random
    import shutil
    import time
    
    xmlfilepath = r'./Annotations'
    saveBasePath = r'./Annotations'
    
    trainval_percent = 0.8   # share of all files used for train + validation
    train_percent = 0.8      # share of train + validation used for training
    
    # Keep only .xml files so the train/validation/test sub-folders created
    # below are not picked up as annotations when the script is run again.
    total_xml = [f for f in os.listdir(xmlfilepath) if f.endswith('.xml')]
    num = len(total_xml)
    indices = range(num)
    tv = int(num * trainval_percent)
    tr = int(tv * train_percent)
    trainval = random.sample(indices, tv)
    train = set(random.sample(trainval, tr))
    trainval = set(trainval)
    print("train and val size", tv)
    print("train size", tr)
    
    start = time.time()
    counts = {'train': 0, 'validation': 0, 'test': 0}
    
    for i in indices:
        name = total_xml[i]
        if i in train:
            directory = 'train'
        elif i in trainval:
            directory = 'validation'
        else:
            directory = 'test'
        counts[directory] += 1
        # Create Annotations/<directory> if needed, then copy the file into it.
        os.makedirs(os.path.join(saveBasePath, directory), exist_ok=True)
        shutil.copyfile(os.path.join(xmlfilepath, name),
                        os.path.join(saveBasePath, directory, name))
    
    seconds = time.time() - start
    print("train total : " + str(counts['train']))
    print("validation total : " + str(counts['validation']))
    print("test total : " + str(counts['test']))
    print("total number : " + str(sum(counts.values())))
    print("Time taken : {0} seconds".format(seconds))
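
    The two 80% ratios are applied one after the other, so the final shares are not 80/20. A small sketch of the arithmetic, using the same int() truncation as the script (split_sizes is a hypothetical helper name, not part of the original code):

```python
def split_sizes(num_files, trainval_percent=0.8, train_percent=0.8):
    """Return (train, validation, test) counts using the same truncation as the script."""
    tv = int(num_files * trainval_percent)   # train + validation
    tr = int(tv * train_percent)             # train only
    return tr, tv - tr, num_files - tv


print(split_sizes(100))  # (64, 16, 20)
```

    For 100 annotation files this gives 64 training, 16 validation and 20 test files.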

     

    2. Convert the xml files to csv with xml_to_csv.py, which writes the generated csv files to object_detection/data/

    import os  
    import glob  
    import pandas as pd  
    import xml.etree.ElementTree as ET 
    
    def xml_to_csv(path):  
        xml_list = []  
        for xml_file in glob.glob(path + '/*.xml'):  
            tree = ET.parse(xml_file)  
            root = tree.getroot()
            
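            print(root.find('filename').text)
            size = root.find('size')
            for member in root.findall('object'):
                # Look fields up by tag name rather than by position, so the
                # order of child elements in the xml does not matter.
                box = member.find('bndbox')
                value = (root.find('filename').text,
                    int(size.find('width').text),
                    int(size.find('height').text),
                    member.find('name').text,
                    int(float(box.find('xmin').text)),
                    int(float(box.find('ymin').text)),
                    int(float(box.find('xmax').text)),
                    int(float(box.find('ymax').text))
                    )
                xml_list.append(value)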
        column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
        xml_df = pd.DataFrame(xml_list, columns=column_name)  
        return xml_df      
    
    def main():  
        for directory in ['train','test','validation']:  
            xml_path = os.path.join(os.getcwd(), 'Annotations/{}'.format(directory))  
    
            xml_df = xml_to_csv(xml_path)  
            # xml_df.to_csv('whsyxt.csv', index=None)  
            xml_df.to_csv('/home/z/work/models-master/research/object_detection/data/trafficlight_{}_labels.csv'.format(directory), index=None)  
            print('Successfully converted xml to csv.')
    
    main()  
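
    To see what one csv row looks like, the same extraction can be tried on a single annotation held in a string. This is a sketch under assumptions: SAMPLE_XML is a made-up annotation in the Pascal VOC layout that labelImg writes, and rows_from_annotation is a hypothetical helper that reads fields by tag name.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout that labelImg writes.
SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>red pedestrian</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>
"""


def rows_from_annotation(xml_text):
    """Yield one (filename, width, height, class, xmin, ymin, xmax, ymax) tuple per object."""
    root = ET.fromstring(xml_text.strip())
    filename = root.find("filename").text
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = [int(float(box.find(tag).text)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        yield (filename, width, height, obj.find("name").text, *coords)


print(list(rows_from_annotation(SAMPLE_XML)))
```

    Each tuple has the same column order as the csv header: filename, width, height, class, xmin, ymin, xmax, ymax.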

    3. Generate the tfrecord files with generate_tfrecord.py

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    """
    Created on Tue Mar  5 15:28:55 2019
    
    @author: z
    """
    
    """
    Usage:
      # From tensorflow/models/
      # Create train data:
      python generate_tfrecord.py --csv_input=data/tv_vehicle_labels.csv  --output_path=train.record
      # Create test data:
      python generate_tfrecord.py --csv_input=data/test_labels.csv  --output_path=test.record
    """
    
    
    
    import os
    import io
    import pandas as pd
    import tensorflow as tf
    
    from PIL import Image
    from object_detection.utils import dataset_util
    from collections import namedtuple, OrderedDict
    
    os.chdir('/home/z/work/models-master/research/')
    
    flags = tf.app.flags
    flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
    flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
    FLAGS = flags.FLAGS
    
    
    # TO-DO replace this with label map
    def class_text_to_int(row_label):
        # List all of your classes here, one branch per class;
        # the ids must match the ids in the label map.
        if row_label == 'red pedestrian':
            return 1
        else:
            return None
    
    def split(df, group):
        data = namedtuple('data', ['filename', 'object'])
        gb = df.groupby(group)
        return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
    
    
    def create_tf_example(group, path):
        with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
            encoded_jpg = fid.read()
        encoded_jpg_io = io.BytesIO(encoded_jpg)
        image = Image.open(encoded_jpg_io)
        width, height = image.size
    
        filename = group.filename.encode('utf8')
        image_format = b'jpg'
        xmins = []
        xmaxs = []
        ymins = []
        ymaxs = []
        classes_text = []
        classes = []
    
        for index, row in group.object.iterrows():
            xmins.append(row['xmin'] / width)
            xmaxs.append(row['xmax'] / width)
            ymins.append(row['ymin'] / height)
            ymaxs.append(row['ymax'] / height)
            classes_text.append(row['class'].encode('utf8'))
            classes.append(class_text_to_int(row['class']))
    
        tf_example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
        }))
        return tf_example
    
    
    def main(_):
        writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
        path = os.path.join(os.getcwd(), 'object_detection/images/voc/VOCdevkit/VOC2018/JPEGImages/')
        examples = pd.read_csv(FLAGS.csv_input)
        grouped = split(examples, 'filename')
        num = 0
        for group in grouped:
            num += 1
            tf_example = create_tf_example(group, path)
            writer.write(tf_example.SerializeToString())
            if (num % 100 == 0):    # print progress after every 100 conversions
                print(num)
    
        writer.close()
        output_path = os.path.join(os.getcwd(), FLAGS.output_path)
        print('Successfully created the TFRecords: {}'.format(output_path))
    
    
    if __name__ == '__main__':
        tf.app.run()

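    The main thing to edit is class_text_to_int: add every class you annotated, and make sure each row_label string is identical to the name used in labelImg. Likewise, set path to your image directory.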
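
    With more than a couple of classes, the if/else chain gets long. One possible alternative, shown here only as a sketch and not as the author's code, is a dictionary lookup. CLASS_NAMES and CLASS_TO_ID are hypothetical names; the two labels are the ones from the label map used later in this post.

```python
# Class names in label-map order; these two come from the label map used later in this post.
CLASS_NAMES = ['red pedestrian', 'green pedestrian']

# ids start at 1 because id 0 is reserved for the background class.
CLASS_TO_ID = {name: index for index, name in enumerate(CLASS_NAMES, start=1)}


def class_text_to_int(row_label):
    """Return the integer id for a label, failing loudly on names that were never registered."""
    try:
        return CLASS_TO_ID[row_label]
    except KeyError:
        raise ValueError('unknown class label: {!r}'.format(row_label))
```

    Raising an error instead of returning None surfaces a misspelled label at conversion time rather than later during training.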

    cd /home/z/work/models-master/research

    python generate_tfrecord.py --csv_input=object_detection/data/trafficlight_train_labels.csv --output_path=object_detection/data/trafficlight_train.tfrecord

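    generate_tfrecord.py must sit in the research directory, the parent of object_detection, because the script imports object_detection.utils; running it from inside object_detection fails with No module named object_detection.

    The command is easy to understand: the script looks for images under the image path set inside the script, and the csv file decides which images are used. The csv mainly records each image's name, class, and bounding-box coordinates (the original post showed a screenshot of the csv here).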
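
    To make the role of the csv concrete, the sketch below groups rows by image and scales the boxes to the 0 to 1 range, which is what the tfrecord stores. It rests on assumptions: SAMPLE_CSV is made-up data with the same columns as xml_to_csv.py writes, boxes_by_image is a hypothetical helper, and it divides by the width and height columns of the csv, whereas generate_tfrecord.py reads the size from the image file itself.

```python
import csv
import io
from collections import defaultdict

# Hypothetical csv content with the same columns that xml_to_csv.py writes.
SAMPLE_CSV = """filename,width,height,class,xmin,ymin,xmax,ymax
a.jpg,200,100,red pedestrian,20,10,100,50
a.jpg,200,100,green pedestrian,120,30,180,90
b.jpg,400,200,red pedestrian,0,0,400,200
"""


def boxes_by_image(csv_text):
    """Group rows by filename and scale each box to [0, 1] as (class, xmin, ymin, xmax, ymax)."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        width, height = float(row['width']), float(row['height'])
        grouped[row['filename']].append((
            row['class'],
            int(row['xmin']) / width,
            int(row['ymin']) / height,
            int(row['xmax']) / width,
            int(row['ymax']) / height,
        ))
    return dict(grouped)


for filename, boxes in boxes_by_image(SAMPLE_CSV).items():
    print(filename, boxes)
```

    Every image becomes one record holding all of its boxes, which mirrors what split and create_tf_example do in the script.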

    In the same way, the following commands convert the validation and test sets to tfrecord format.

    python generate_tfrecord.py --csv_input=object_detection/data/trafficlight_validation_labels.csv --output_path=object_detection/data/trafficlight_validation.tfrecord
    python generate_tfrecord.py --csv_input=object_detection/data/trafficlight_test_labels.csv --output_path=object_detection/data/trafficlight_test.tfrecord
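
    Since the three commands differ only in the split name, they can be generated in a loop. A minimal sketch, assuming the file naming used in this post; DATA_DIR, PREFIX and tfrecord_command are hypothetical names introduced for illustration.

```python
import sys

DATA_DIR = 'object_detection/data'   # directory used throughout this post
PREFIX = 'trafficlight'              # file-name prefix used throughout this post


def tfrecord_command(split_name):
    """Build the argument list for one generate_tfrecord.py run (train, validation or test)."""
    return [
        sys.executable, 'generate_tfrecord.py',
        '--csv_input={}/{}_{}_labels.csv'.format(DATA_DIR, PREFIX, split_name),
        '--output_path={}/{}_{}.tfrecord'.format(DATA_DIR, PREFIX, split_name),
    ]


for split_name in ('train', 'validation', 'test'):
    print(' '.join(tfrecord_command(split_name)))
```

    Each list can be passed to subprocess.run(..., check=True) from the research directory to run the conversion instead of just printing it.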

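    IV. Training

    1. In the object_detection/data folder, create the label map file (the config below refers to it as trafficlight_label_map.pbtxt). Create one item, with its own id, for each class you want to detect: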

    item {
      id: 1    # ids are numbered from 1
      name: 'red pedestrian'
    }
    
    item {
      id: 2
      name: 'green pedestrian'
    }
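
    For a long class list, the label map can be generated rather than typed by hand. The following is only a sketch: label_map_text is a hypothetical helper, and it reproduces the item layout shown above for whatever names it is given.

```python
def label_map_text(class_names):
    """Render class names as label-map items, numbering ids from 1."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {{\n  id: {}\n  name: '{}'\n}}\n".format(index, name))
    return "\n".join(items)


print(label_map_text(['red pedestrian', 'green pedestrian']))
```

    Using the same list of names here and in class_text_to_int keeps the ids in the tfrecord files and the label map in agreement.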

    2. Set up the pipeline config file: find object_detection/samples/configs/ssd_mobilenet_v1_coco.config and copy it into the data folder. The modified file is:

    # SSD with Mobilenet v1 configuration for MSCOCO Dataset.
    # Users should configure the fine_tune_checkpoint field in the train config as
    # well as the label_map_path and input_path fields in the train_input_reader and
    # eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
    # should be configured.
    
    model {
      ssd {
        num_classes: 17
        box_coder {
          faster_rcnn_box_coder {
            y_scale: 10.0
            x_scale: 10.0
            height_scale: 5.0
            width_scale: 5.0
          }
        }
        matcher {
          argmax_matcher {
            matched_threshold: 0.5
            unmatched_threshold: 0.5
            ignore_thresholds: false
            negatives_lower_than_unmatched: true
            force_match_for_each_row: true
          }
        }
        similarity_calculator {
          iou_similarity {
          }
        }
        anchor_generator {
          ssd_anchor_generator {
            num_layers: 6
            min_scale: 0.2
            max_scale: 0.95
            aspect_ratios: 1.0
            aspect_ratios: 2.0
            aspect_ratios: 0.5
            aspect_ratios: 3.0
            aspect_ratios: 0.3333
          }
        }
        image_resizer {
          fixed_shape_resizer {
            height: 300
            width: 300
          }
        }
        box_predictor {
          convolutional_box_predictor {
            min_depth: 0
            max_depth: 0
            num_layers_before_predictor: 0
            use_dropout: false
            dropout_keep_probability: 0.8
            kernel_size: 1
            box_code_size: 4
            apply_sigmoid_to_scores: false
            conv_hyperparams {
              activation: RELU_6,
              regularizer {
                l2_regularizer {
                  weight: 0.00004
                }
              }
              initializer {
                truncated_normal_initializer {
                  stddev: 0.03
                  mean: 0.0
                }
              }
              batch_norm {
                train: true,
                scale: true,
                center: true,
                decay: 0.9997,
                epsilon: 0.001,
              }
            }
          }
        }
        feature_extractor {
          type: 'ssd_mobilenet_v1'
          min_depth: 16
          depth_multiplier: 1.0
          conv_hyperparams {
            activation: RELU_6,
            regularizer {
              l2_regularizer {
                weight: 0.00004
              }
            }
            initializer {
              truncated_normal_initializer {
                stddev: 0.03
                mean: 0.0
              }
            }
            batch_norm {
              train: true,
              scale: true,
              center: true,
              decay: 0.9997,
              epsilon: 0.001,
            }
          }
        }
        loss {
          classification_loss {
            weighted_sigmoid {
            }
          }
          localization_loss {
            weighted_smooth_l1 {
            }
          }
          hard_example_miner {
            num_hard_examples: 3000
            iou_threshold: 0.99
            loss_type: CLASSIFICATION
            max_negatives_per_positive: 3
            min_negatives_per_image: 0
          }
          classification_weight: 1.0
          localization_weight: 1.0
        }
        normalize_loss_by_num_matches: true
        post_processing {
          batch_non_max_suppression {
            score_threshold: 1e-8
            iou_threshold: 0.6
            max_detections_per_class: 100
            max_total_detections: 100
          }
          score_converter: SIGMOID
        }
      }
    }
    
    train_config: {
      batch_size: 2
      optimizer {
        rms_prop_optimizer: {
          learning_rate: {
            exponential_decay_learning_rate {
              initial_learning_rate: 0.00001
              decay_steps: 800720
              decay_factor: 0.95
            }
          }
          momentum_optimizer_value: 0.9
          decay: 0.9
          epsilon: 1.0
        }
      }
    
      # Note: The below line limits the training process to 200K steps, which we
      # empirically found to be sufficient enough to train the pets dataset. This
      # effectively bypasses the learning rate schedule (the learning rate will
      # never decay). Remove the below line to train indefinitely.
      num_steps: 6000
      data_augmentation_options {
        random_horizontal_flip {
        }
      }
      data_augmentation_options {
        ssd_random_crop {
        }
      }
    }
    
    train_input_reader: {
      tf_record_input_reader {
        input_path: "data/trafficlight_train.tfrecord"
      }
      label_map_path: "data/trafficlight_label_map.pbtxt"
    }
    
    eval_config: {
      num_examples: 8000
      # Note: The below line limits the evaluation process to 10 evaluations.
      # Remove the below line to evaluate indefinitely.
      max_evals: 10
    }
    
    eval_input_reader: {
      tf_record_input_reader {
        input_path: "data/trafficlight_validation.tfrecord"
      }
      label_map_path: "data/trafficlight_label_map.pbtxt"
      shuffle: false
      num_readers: 1
    }

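    Several fields can be changed (they were highlighted in red in the original post), and it is worth reading through the whole config file. The main ones are: the number of annotated classes, batch_size (a small value is recommended), the learning rate and decay factor, the total number of training steps, the tfrecord paths for the training and validation sets, and the label_map path.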
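
    How the learning rate and decay settings interact with num_steps is easy to check numerically. The sketch below applies the standard exponential decay formula to the values in the config above. The function name is hypothetical, and staircase=True is an assumption about the schedule's default behaviour (decay in discrete jumps every decay_steps), not something stated in this post.

```python
def exponential_decay(step, initial_learning_rate=0.00001, decay_steps=800720,
                      decay_factor=0.95, staircase=True):
    """Learning rate after `step` training steps under exponential decay."""
    exponent = step / decay_steps
    if staircase:
        exponent = step // decay_steps   # decay in discrete jumps every decay_steps
    return initial_learning_rate * decay_factor ** exponent


print(exponential_decay(0))        # 1e-05
print(exponential_decay(6000))     # 1e-05, no decay yet with staircase=True
```

    With decay_steps at 800720 and num_steps at 6000, training ends long before the first decay step, so the learning rate stays at its initial value for the whole run unless decay_steps is reduced.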

    3. Start training

    python train.py --logtostderr --train_dir=training/ --pipeline_config_path=data/ssd_mobilenet_v1_coco.config

     Reference:

    https://www.cnblogs.com/White-xzx/p/9503203.html

  • Original post: https://www.cnblogs.com/gezhuangzhuang/p/10613468.html