• SSD Object Detection in Practice (TensorFlow Project): VOC2007


    TensorFlow project walkthrough (SSD object detection) - VOC2007

    Theory explained in detail: https://blog.csdn.net/u013989576/article/details/73439202

    The trained model and code will be published online (including the VOC dataset, the vgg_16 model, and the trained model):

     To be continued

    Steps:

    1. Code repository: https://github.com/balancap/SSD-Tensorflow

    2. Unzip ssd_300_vgg.ckpt.zip into the checkpoints folder (and also place the vgg_16 model under the same path).

    3. Run a quick test first. Create demo_test.py in the notebooks folder; it is essentially a copy of the code in ssd_notebook.ipynb, and this .py file runs detection on a single image.

    import os
    import math
    import random

    import numpy as np
    import tensorflow as tf
    import cv2

    slim = tf.contrib.slim
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    import sys

    sys.path.append('../')
    from nets import ssd_vgg_300, ssd_common, np_methods
    from preprocessing import ssd_vgg_preprocessing
    from notebooks import visualization

    # TensorFlow session: grow memory when needed. TF, DO NOT USE ALL MY GPU MEMORY!!!
    gpu_options = tf.GPUOptions(allow_growth=True)
    config = tf.ConfigProto(log_device_placement=False, gpu_options=gpu_options)
    isess = tf.InteractiveSession(config=config)
    # Input placeholder.
    net_shape = (300, 300)
    data_format = 'NHWC'
    img_input = tf.placeholder(tf.uint8, shape=(None, None, 3))
    # Evaluation pre-processing: resize to SSD net shape.
    image_pre, labels_pre, bboxes_pre, bbox_img = ssd_vgg_preprocessing.preprocess_for_eval(
        img_input, None, None, net_shape, data_format, resize=ssd_vgg_preprocessing.Resize.WARP_RESIZE)
    image_4d = tf.expand_dims(image_pre, 0)

    # Define the SSD model.
    reuse = True if 'ssd_net' in locals() else None
    ssd_net = ssd_vgg_300.SSDNet()
    with slim.arg_scope(ssd_net.arg_scope(data_format=data_format)):
        predictions, localisations, _, _ = ssd_net.net(image_4d, is_training=False, reuse=reuse)

    # Restore SSD model.
    ckpt_filename = '../checkpoints/ssd_300_vgg.ckpt'
    # ckpt_filename = '../checkpoints/VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt'
    isess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    saver.restore(isess, ckpt_filename)

    # SSD default anchor boxes.
    ssd_anchors = ssd_net.anchors(net_shape)


    # Main image processing routine.
    def process_image(img, select_threshold=0.5, nms_threshold=.45, net_shape=(300, 300)):
        # Run SSD network.
        rimg, rpredictions, rlocalisations, rbbox_img = isess.run([image_4d, predictions, localisations, bbox_img],
                                                                  feed_dict={img_input: img})

        # Get classes and bboxes from the net outputs.
        rclasses, rscores, rbboxes = np_methods.ssd_bboxes_select(
            rpredictions, rlocalisations, ssd_anchors,
            select_threshold=select_threshold, img_shape=net_shape, num_classes=21, decode=True)

        rbboxes = np_methods.bboxes_clip(rbbox_img, rbboxes)
        rclasses, rscores, rbboxes = np_methods.bboxes_sort(rclasses, rscores, rbboxes, top_k=400)
        rclasses, rscores, rbboxes = np_methods.bboxes_nms(rclasses, rscores, rbboxes, nms_threshold=nms_threshold)
        # Resize bboxes to original image shape. Note: useless for Resize.WARP!
        rbboxes = np_methods.bboxes_resize(rbbox_img, rbboxes)
        return rclasses, rscores, rbboxes


    # Test on some demo image and visualize output.
    # Folder containing the test images.
    path = '../demo/'
    image_names = sorted(os.listdir(path))
    # Index of the image in the folder; -1 means the last one.
    img = mpimg.imread(path + image_names[-1])
    rclasses, rscores, rbboxes = process_image(img)

    # visualization.bboxes_draw_on_img(img, rclasses, rscores, rbboxes, visualization.colors_plasma)
    visualization.plt_bboxes(img, rclasses, rscores, rbboxes)

    4. Place your own dataset or VOC2007 directly under the project directory.
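
    For reference, the conversion script in step 6 expects the standard VOC directory layout, roughly as sketched below (file counts depend on your data):

    VOC2007/
        Annotations/        # one .xml annotation file per image
        ImageSets/
            Main/           # train.txt / test.txt split lists
        JPEGImages/         # the .jpg images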

    5. Modify pascalvoc_common.py in the datasets folder and change the training classes to your own (if you are training on your own categories). This article uses two classes as an example; a sketch of a two-class dictionary follows the code block below.

    VOC_LABELS = {
        'none': (0, 'Background'),
        'aeroplane': (1, 'Vehicle'),
        'bicycle': (2, 'Vehicle'),
        'bird': (3, 'Animal'),
        'boat': (4, 'Vehicle'),
        'bottle': (5, 'Indoor'),
        'bus': (6, 'Vehicle'),
        'car': (7, 'Vehicle'),
        'cat': (8, 'Animal'),
        'chair': (9, 'Indoor'),
        'cow': (10, 'Animal'),
        'diningtable': (11, 'Indoor'),
        'dog': (12, 'Animal'),
        'horse': (13, 'Animal'),
        'motorbike': (14, 'Vehicle'),
        'person': (15, 'Person'),
        'pottedplant': (16, 'Indoor'),
        'sheep': (17, 'Animal'),
        'sofa': (18, 'Indoor'),
        'train': (19, 'Vehicle'),
        'tvmonitor': (20, 'Indoor'),
    }
    # Your own data:
    # VOC_LABELS = {
    #     'none': (0, 'Background'),
    #     'aeroplane': (1, 'Vehicle'),
    # }

    6. Convert the image data to TFRecords format. Modify pascalvoc_to_tfrecords.py in the datasets folder and change the file read mode on line 83 to 'rb'. If your images are not in .jpg format, you can also change the image type there.

    (An additional change is shown as a screenshot in this step, not reproduced here.)
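
    For reference, the line-83 read-mode change looks roughly like this in the copy of the repository I used (your line numbers and exact code may differ):

    # pascalvoc_to_tfrecords.py, inside _process_image(); around line 83.
    filename = directory + DIRECTORY_IMAGES + name + '.jpg'
    # Was: tf.gfile.FastGFile(filename, 'r').read()
    image_data = tf.gfile.FastGFile(filename, 'rb').read()   # 'rb' = read the file as binary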

    7. Run tf_convert_data.py (this is the script that generates the TFRecords files), but it needs a few arguments passed to it.

    The script takes its arguments on the command line, like a Linux command. How do we handle that in PyCharm?

    Suppose we need to execute: python ./tf_convert_data.py --dataset_name=pascalvoc --dataset_dir=./VOC2007/ --output_name=voc_2007_train --output_dir=./tfrecords_ . How do we do that?

    In PyCharm, open Run > Edit Configurations..., select the run configuration for tf_convert_data.py, and paste the arguments into the Parameters field.

    Parameters: --dataset_name=pascalvoc --dataset_dir=./VOC2007/ --output_name=voc_2007_train --output_dir=./tfrecords_

    Then run the .py file and it works.

    If an error occurs (a folder-related error), just create the missing folder (e.g. the output directory) under the project.

    8. Train the model: make the following change in train_ssd_network.py.

     

    The maximum number of training steps is controlled by the max_number_of_steps flag; None means training runs indefinitely until you stop it.
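
    As a rough sketch of what this looks like in train_ssd_network.py (flag name as I remember it from the repository; the step count is only an example):

    # None = keep training until you stop it manually;
    # set an integer (e.g. 50000) to stop after that many steps.
    tf.app.flags.DEFINE_integer(
        'max_number_of_steps', None,
        'The maximum number of training steps.')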

    Other files that also need to be modified (a rough sketch of these edits follows this list):

    ① nets/ssd_vgg_300.py (this is the network structure being used): change the number of classes on lines 87 and 88

    ② train_ssd_network.py: change the number of classes on line 120, plus the GPU memory usage, learning rate, batch_size, etc.

    ③ eval_ssd_network.py: change the number of classes on line 66

    ④ datasets/pascalvoc_2007.py: adapt the whole file to your own training data
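
    A hedged sketch of these edits, assuming the two-class example above (2 foreground classes + background = 3); every count below is a placeholder, so fill in your own dataset's statistics:

    # ① nets/ssd_vgg_300.py, default_params (around lines 87-88):
    #        num_classes=3,            # was 21
    #        no_annotation_label=3,    # was 21
    #
    # ② train_ssd_network.py (~line 120) and ③ eval_ssd_network.py (~line 66):
    #        'num_classes', 3          # was 21
    #
    # ④ datasets/pascalvoc_2007.py -- placeholder numbers, use your own counts:
    TRAIN_STATISTICS = {
        'none': (0, 0),
        'car': (300, 450),      # (number of images, number of objects)
        'person': (200, 280),
        'total': (500, 730),
    }
    SPLITS_TO_SIZES = {'train': 500, 'test': 100}
    NUM_CLASSES = 2             # foreground classes only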

    9. Start training

     Pass the arguments the same way as in step 7 (Run > Edit Configurations).

    The main training file is train_ssd_network.py.

    The parameters are:

    --train_dir=./train_model/ --dataset_dir=./tfrecords_/ --dataset_name=pascalvoc_2007 --dataset_split_name=train --model_name=ssd_300_vgg --checkpoint_path=./checkpoints/vgg_16.ckpt --checkpoint_model_scope=vgg_16 --checkpoint_exclude_scopes=ssd_300_vgg/conv6,ssd_300_vgg/conv7,ssd_300_vgg/block8,ssd_300_vgg/block9,ssd_300_vgg/block10,ssd_300_vgg/block11,ssd_300_vgg/block4_box,ssd_300_vgg/block7_box,ssd_300_vgg/block8_box,ssd_300_vgg/block9_box,ssd_300_vgg/block10_box,ssd_300_vgg/block11_box --trainable_scopes=ssd_300_vgg/conv6,ssd_300_vgg/conv7,ssd_300_vgg/block8,ssd_300_vgg/block9,ssd_300_vgg/block10,ssd_300_vgg/block11,ssd_300_vgg/block4_box,ssd_300_vgg/block7_box,ssd_300_vgg/block8_box,ssd_300_vgg/block9_box,ssd_300_vgg/block10_box,ssd_300_vgg/block11_box --save_summaries_secs=60 --save_interval_secs=600 --weight_decay=0.0005 --optimizer=adam --learning_rate=0.001 --learning_rate_decay_factor=0.94 --batch_size=24 --gpu_memory_fraction=0.9
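
    Roughly speaking: --checkpoint_path and --checkpoint_model_scope load the pre-trained vgg_16 weights as the backbone, --checkpoint_exclude_scopes skips the SSD-specific layers when restoring (they do not exist in the vgg_16 checkpoint), and --trainable_scopes limits training to those SSD layers so the VGG backbone stays frozen.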

    Training process:

    10. Testing:

    A look at the results first:

    I also modified the demo_test file so that it grabs frames from the computer's camera (or a video) and runs the detector on them.

    To check the result on a single image instead, call the demo() function.

    The code is as follows:

    __author__ = "WSX"
    import os
    import math
    import random

    import numpy as np
    import tensorflow as tf
    import cv2

    slim = tf.contrib.slim
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    import sys

    sys.path.append('../')
    from nets import ssd_vgg_300, ssd_common, np_methods
    from preprocessing import ssd_vgg_preprocessing
    from notebooks import visualization

    # TensorFlow session: grow memory when needed. TF, DO NOT USE ALL MY GPU MEMORY!!!
    gpu_options = tf.GPUOptions(allow_growth=True)
    config = tf.ConfigProto(log_device_placement=False, gpu_options=gpu_options)
    isess = tf.InteractiveSession(config=config)
    # Input placeholder.
    net_shape = (300, 300)
    data_format = 'NHWC'
    img_input = tf.placeholder(tf.uint8, shape=(None, None, 3))
    # Evaluation pre-processing: resize to SSD net shape.
    image_pre, labels_pre, bboxes_pre, bbox_img = ssd_vgg_preprocessing.preprocess_for_eval(
        img_input, None, None, net_shape, data_format, resize=ssd_vgg_preprocessing.Resize.WARP_RESIZE)
    image_4d = tf.expand_dims(image_pre, 0)

    # Define the SSD model.
    reuse = True if 'ssd_net' in locals() else None
    ssd_net = ssd_vgg_300.SSDNet()
    with slim.arg_scope(ssd_net.arg_scope(data_format=data_format)):
        predictions, localisations, _, _ = ssd_net.net(image_4d, is_training=False, reuse=reuse)

    # Restore SSD model.
    ckpt_filename = '../checkpoints/ssd_300_vgg.ckpt'
    # ckpt_filename = '../checkpoints/VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt'
    isess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    saver.restore(isess, ckpt_filename)

    # SSD default anchor boxes.
    ssd_anchors = ssd_net.anchors(net_shape)


    # Main image processing routine.
    def process_image(img, select_threshold=0.5, nms_threshold=.45, net_shape=(300, 300)):
        # Run SSD network.
        rimg, rpredictions, rlocalisations, rbbox_img = isess.run([image_4d, predictions, localisations, bbox_img],
                                                                  feed_dict={img_input: img})

        # Get classes and bboxes from the net outputs.
        rclasses, rscores, rbboxes = np_methods.ssd_bboxes_select(
            rpredictions, rlocalisations, ssd_anchors,
            select_threshold=select_threshold, img_shape=net_shape, num_classes=21, decode=True)

        rbboxes = np_methods.bboxes_clip(rbbox_img, rbboxes)
        rclasses, rscores, rbboxes = np_methods.bboxes_sort(rclasses, rscores, rbboxes, top_k=400)
        rclasses, rscores, rbboxes = np_methods.bboxes_nms(rclasses, rscores, rbboxes, nms_threshold=nms_threshold)
        # Resize bboxes to original image shape. Note: useless for Resize.WARP!
        rbboxes = np_methods.bboxes_resize(rbbox_img, rbboxes)
        return rclasses, rscores, rbboxes


    # =========================================== Testing ===========================================
    # ---------------------------- Single-image test ----------------------------
    # Test on some demo image and visualize output.
    def demo():
        # Folder containing the test images.
        path = '../demo/'
        image_names = sorted(os.listdir(path))
        # Index of the image in the folder; -1 means the last one.
        img = mpimg.imread(path + image_names[-1])
        print(img.shape)
        rclasses, rscores, rbboxes = process_image(img)

        # visualization.bboxes_draw_on_img(img, rclasses, rscores, rbboxes, visualization.colors_plasma)
        visualization.plt_bboxes(img, rclasses, rscores, rbboxes)


    # ====================================== Real-time display ======================================
    L = ["None", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow",
         "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]
    colors_tableau = [(255, 255, 255), (31, 119, 180), (174, 199, 232), (255, 127, 14), (255, 187, 120),
                      (44, 160, 44), (152, 223, 138), (214, 39, 40), (255, 152, 150),
                      (148, 103, 189), (197, 176, 213), (140, 86, 75), (196, 156, 148),
                      (227, 119, 194), (247, 182, 210), (127, 127, 127), (199, 199, 199),
                      (188, 189, 34), (219, 219, 141), (23, 190, 207), (158, 218, 229)]


    def Load_video_show():  # Read video frames and show the detections in real time.
        video = cv2.VideoCapture("1.mp4")  # 0 opens the camera; a file path loads a video instead.
        cv2.namedWindow("video", cv2.WINDOW_NORMAL)
        cv2.resizeWindow("video", 640, 360)  # Set the window size.
        while True:
            ret, frame = video.read()  # frame is a single video frame, treated as one image here.
            if not ret:  # Stop when the video ends or the camera fails.
                break
            frame = cv2.flip(frame, 1)  # Mirror the frame horizontally.
            frame = cv2.resize(frame, (640, 360))  # Resize the frame.
            # cv2.imshow("video", frame)
            rclasses, rscores, rbboxes = process_image(frame)
            height = frame.shape[0]
            width = frame.shape[1]
            for i in range(rclasses.shape[0]):
                cls_id = int(rclasses[i])
                if cls_id >= 0:
                    score = rscores[i]
                    color = colors_tableau[cls_id % len(colors_tableau)]  # One fixed color per class.
                    ymin = int(rbboxes[i, 0] * height)
                    xmin = int(rbboxes[i, 1] * width)
                    ymax = int(rbboxes[i, 2] * height)
                    xmax = int(rbboxes[i, 3] * width)
                    cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), color, 2)  # Draw the bounding box.
                    class_name = L[cls_id]
                    cv2.putText(frame, '{:s} | {:.3f}'.format(class_name, score), (xmin, ymin - 2),
                                cv2.FONT_HERSHEY_COMPLEX, 0.5, (0, 255, 0), 1)  # Draw the label text.
            cv2.imshow("video", frame)
            c = cv2.waitKey(50)
            if c == 27:  # Esc quits.
                break
        video.release()
        cv2.destroyAllWindows()


    # Load_video_show()
    demo()
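
    As written, demo() at the bottom runs the single-image test; to run on the camera or a video instead, comment out demo() and uncomment Load_video_show() (VideoCapture(0) opens the default camera, while a path such as "1.mp4" loads a video file).
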
  • Original article: https://www.cnblogs.com/WSX1994/p/11216953.html