• FCN-Based Face Detection


    Since my deep learning environment is installed on Windows, the implementation below was done on Windows. This is just a record for my own learning.

    To train a model with Caffe, the first step is to prepare the data.

    Positive samples: for a face detection project, positive samples are face images. To produce them, crop the faces out of the source images (the data source already annotates the coordinates of each face). After cropping, check that the data came out right.

    Negative samples: crop patches at random and use IoU against the annotated face boxes to decide whether a patch counts as positive or negative, e.g. IoU < 0.3 means negative. Ideally, also take images that contain no faces at all. A minimal IoU sketch is shown below.
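    The IoU computation itself is short. The following is a minimal sketch, assuming axis-aligned boxes given as (x1, y1, x2, y2) tuples; the crop and annotation values are hypothetical:

    def iou(box_a, box_b):
        """Intersection-over-union of two axis-aligned (x1, y1, x2, y2) boxes."""
        ix1 = max(box_a[0], box_b[0])
        iy1 = max(box_a[1], box_b[1])
        ix2 = min(box_a[2], box_b[2])
        iy2 = min(box_a[3], box_b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return float(inter) / union if union > 0 else 0.0

    # a random crop is a negative sample only if it barely overlaps every face box
    crop = (0, 0, 100, 100)            # hypothetical random crop
    face_boxes = [(80, 80, 300, 300)]  # hypothetical face annotations
    is_negative = all(iou(crop, box) < 0.3 for box in face_boxes)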

    1. Preparing the Caffe data source:

    Caffe supports LMDB data; before training, the training and validation sets need to be converted to LMDB.

    First prepare two txt files, train.txt and test.txt, in the following format:

    /path/to/folder/image_x.jpg 0 (the path of the image sample followed by its label; for binary classification the labels are 0 and 1. In this example, 0 means face and 1 means non-face.)

    The txt files can be generated with a script. A simple script (producing train.txt and val.txt) follows:

    (Note: the txt files should only need relative paths, i.e. each line of train.txt looks like "xxxx.jpg label". Updated July 28, 2019.)

    import os

    # Dataset root; train/train holds one subfolder per class (0 = face, 1 = non-face),
    # and train/val holds the validation images.
    root = r"C:\Users\Administrator\Desktop\FaceDetection"

    full_train_path = os.path.join(root, "train.txt")
    full_val_path = os.path.join(root, "val.txt")

    train_txt = open(full_train_path, 'w')
    val_txt = open(full_val_path, 'w')

    # get train.txt: one "relative/path.jpg label" line per image,
    # where the label is the name of the class subfolder (0 or 1)
    for label_dir in os.listdir(os.path.join(root, "train", "train")):
        for figure in os.listdir(os.path.join(root, "train", "train", label_dir)):
            train_txt.write(label_dir + "/" + figure + " " + label_dir + "\n")

    train_txt.close()

    # get val.txt: validation images named "faceimage*" are faces (label 0)
    for val_file in os.listdir(os.path.join(root, "train", "val")):
        if val_file.find("faceimage") != -1:
            val_txt.write(val_file + " " + "0" + "\n")
        else:
            val_txt.write(val_file + " " + "1" + "\n")

    val_txt.close()

    2. Creating the LMDB data source:

    Classification problems use LMDB data; regression problems use HDF5 data.

    Use the convert_imageset tool that ships with Caffe to create the LMDB data source.

    convert_imageset usage:
    convert_imageset --options (e.g. resize, shuffle) <image root dir> <image list txt> <output lmdb path>

    cd C:\Program Files\caffe-windows\scripts\build\tools\Release
    convert_imageset.exe --resize_height=227 --resize_width=227 --shuffle C:\Users\Administrator\Desktop\FaceDetection\train\train\ C:\Users\Administrator\Desktop\FaceDetection\train\train\train.txt C:\Users\Administrator\Desktop\FaceDetection\train_lmdb
    convert_imageset.exe --resize_height=227 --resize_width=227 --shuffle C:\Users\Administrator\Desktop\FaceDetection\train\val\ C:\Users\Administrator\Desktop\FaceDetection\train\val\val.txt C:\Users\Administrator\Desktop\FaceDetection\val_lmdb
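    To make sure the conversion worked, you can read a record back from the LMDB. A minimal sketch, assuming the lmdb Python package and pycaffe are installed (the path is the train_lmdb directory created above):

    import lmdb
    import caffe

    # Open the freshly created LMDB read-only and decode the first record,
    # to confirm that images and labels were written as expected.
    env = lmdb.open(r"C:\Users\Administrator\Desktop\FaceDetection\train_lmdb", readonly=True)
    with env.begin() as txn:
        key, value = next(txn.cursor().iternext())
        datum = caffe.proto.caffe_pb2.Datum()
        datum.ParseFromString(value)
        img = caffe.io.datum_to_array(datum)  # shape (channels, height, width)
        print(key, img.shape, "label:", datum.label)
    env.close()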

    3. Training the AlexNet network:

    3.1 Configuring the Caffe files:

    1) train.prototxt

    Defines the AlexNet network structure in Caffe's format.

    2) solver.prototxt

    ① net: specifies the path of the network definition file.

    ② test_iter: how many batches one test pass runs. Ideally test_iter * batch_size = total number of test samples.

    ③ base_lr: the base learning rate. The effective learning rate of a layer is base_lr * lr_mult (lr_mult is set per layer in train.prototxt). The learning rate must not be too large; if it is, the loss may oscillate or even diverge.

    Note: in the Windows version, use "/" in configuration-file paths, e.g. source: "C:/Users/Administrator/Desktop/FaceDetection/train_lmdb"
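    For reference, here is a minimal solver.prototxt sketch. The net and snapshot paths match this project; the numeric values are illustrative assumptions, not the exact settings used here:

    net: "C:/Users/Administrator/Desktop/FaceDetection/train.prototxt"
    test_iter: 100              # test_iter * batch_size should cover the validation set
    test_interval: 1000
    base_lr: 0.001              # effective per-layer lr = base_lr * lr_mult
    lr_policy: "step"
    gamma: 0.1
    stepsize: 10000
    momentum: 0.9
    weight_decay: 0.0005
    display: 100
    max_iter: 36000             # yields the _iter_36000.caffemodel snapshot used below
    snapshot: 4000
    snapshot_prefix: "C:/Users/Administrator/Desktop/FaceDetection/model2/"
    solver_mode: GPU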

    3.2 Run the training command to train and obtain the model snapshot (_iter_36000.caffemodel):

    cd C:\Program Files\caffe-windows\scripts\build\tools\Release
    caffe.exe train --solver=C:\Users\Administrator\Desktop\FaceDetection\solver.prototxt

    The training process looks like this:

    4. Face detection algorithm framework:

    4.1 Sliding window:

    Slide 227*227 windows over the input image (so far only a fixed input size is supported: in a CNN the final fully connected layers have a fixed number of parameters. The fully convolutional network described below accepts images of any size).

    To detect faces of different sizes, the image must be processed at multiple scales (an image pyramid).

    The FCN (fully convolutional network) outputs a heatmap. Each heatmap point corresponds to a region of the original image, and its value is the probability that the region is a face: cell (x, y) maps back to a 227*227 window whose top-left corner sits at (stride * x, stride * y) in the scaled image (stride = 32 for this AlexNet). The heatmap is obtained with a forward pass, forward_all().

    Set a threshold α and keep a box whenever its probability exceeds α, e.g. 0.9. This usually leaves several overlapping boxes, so NMS (non-maximum suppression) is applied to reduce them to a final box, as sketched below.
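    For reference, a minimal sketch of classic greedy NMS, assuming boxes is an (N, 5) array of [x1, y1, x2, y2, score] rows. The detection code below uses cv2.groupRectangles plus rectangle averaging instead of this textbook version:

    import numpy as np

    def nms(boxes, iou_thresh=0.3):
        """Greedy NMS: repeatedly keep the highest-scoring box and drop overlaps."""
        if len(boxes) == 0:
            return []
        boxes = np.asarray(boxes, dtype=np.float64)
        x1, y1, x2, y2, scores = boxes.T
        areas = (x2 - x1) * (y2 - y1)
        order = scores.argsort()[::-1]  # indices sorted by descending score
        keep = []
        while order.size > 0:
            i = order[0]
            keep.append(i)
            # IoU of the kept box with all remaining boxes
            xx1 = np.maximum(x1[i], x1[order[1:]])
            yy1 = np.maximum(y1[i], y1[order[1:]])
            xx2 = np.minimum(x2[i], x2[order[1:]])
            yy2 = np.minimum(y2[i], y2[order[1:]])
            inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
            iou = inter / (areas[i] + areas[order[1:]] - inter)
            order = order[1:][iou <= iou_thresh]  # drop boxes that overlap too much
        return boxes[keep]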

    4.2 Converting the fully connected AlexNet used for training into a fully convolutional (FCN) model:

    This follows the net surgery example from the Caffe site: https://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/net_surgery.ipynb

    First, in the deploy.prototxt of the original network, replace each fully connected layer (InnerProduct) with a convolutional layer (Convolution), computing and setting the appropriate kernel size.
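    As a sketch of that edit (following the net surgery example, and assuming the standard AlexNet/CaffeNet layout where pool5 produces a 6*6 feature map for a 227*227 input, so fc6 becomes a 6*6 convolution while fc7 and fc8 become 1*1 convolutions):

    # In deploy_full_conv.prototxt, fc6 (InnerProduct, 4096 outputs) becomes:
    layer {
      name: "fc6-conv"
      type: "Convolution"
      bottom: "pool5"
      top: "fc6-conv"
      convolution_param {
        num_output: 4096
        kernel_size: 6    # pool5 output is 6x6 for a 227x227 input
      }
    }
    # fc7 -> fc7-conv and fc8_flickr -> fc8-conv are rewritten the same way with
    # kernel_size: 1 (fc8-conv keeps num_output: 2 for face / non-face).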

    Then use the following code to transplant the weights and save the fully convolutional model (full_conv.caffemodel):

    import caffe

    net = caffe.Net(r"C:\Users\Administrator\Desktop\FaceDetection\deploy.prototxt",
                    r"C:\Users\Administrator\Desktop\FaceDetection\model2\_iter_36000.caffemodel",
                    caffe.TEST)
    params = ['fc6', 'fc7', 'fc8_flickr']

    # weights and biases of the original fully connected layers
    fc_params = {pr: (net.params[pr][0].data, net.params[pr][1].data) for pr in params}

    for fc in params:
        print("{} weights are {} dimensional and biases are {} dimensional".format(fc, fc_params[fc][0].shape, fc_params[fc][1].shape))

    net_fully_conv = caffe.Net(r"C:\Users\Administrator\Desktop\FaceDetection\deploy_full_conv.prototxt",
                               r"C:\Users\Administrator\Desktop\FaceDetection\model2\_iter_36000.caffemodel",
                               caffe.TEST)
    params_fully_conv = ['fc6-conv', 'fc7-conv', 'fc8-conv']

    conv_params = {pr: (net_fully_conv.params[pr][0].data, net_fully_conv.params[pr][1].data) for pr in params_fully_conv}
    for conv in params_fully_conv:
        print("{} weights are {} dimensional and biases are {} dimensional".format(conv, conv_params[conv][0].shape, conv_params[conv][1].shape))

    # copy the FC weights into the conv layers (same values, reshaped)
    for pr, pr_conv in zip(params, params_fully_conv):
        conv_params[pr_conv][0].flat = fc_params[pr][0].flat
        conv_params[pr_conv][1][...] = fc_params[pr][1]
    net_fully_conv.save(r"C:\Users\Administrator\Desktop\FaceDetection\full_conv.caffemodel")

    4.3 Using the trained model to implement face detection:

    import os
    import sys
    import numpy as np
    import math
    import cv2
    import random
    
    caffe_root = "C:/Program Files/caffe-windows/"
    sys.path.insert(0, caffe_root + 'python')
    os.environ['GLOG_minloglevel'] = '2'
    import caffe
    
    class Point(object):
        def __init__(self, x, y):
            self.x = x
            self.y = y
    
    class Rect(object):
        def __init__(self, p1, p2):
            """Store the top, bottom, left, right values for points
            p1, p2 are the left-top and right-bottom points of the rectangle"""
            self.left = min(p1.x, p2.x)
            self.right = max(p1.x, p2.x)
            self.bottom = min(p1.y, p2.y)
            self.top = max(p1.y, p2.y)
    
        def __str__(self):
            return "Rect[%d, %d, %d, %d]" %(self.left, self.top, self.right, self.bottom)
    
    def calcDistance(x1, y1, x2, y2):
        dist = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
        return dist
    
    def range_overlap(a_min, a_max, b_min, b_max):
        """Judge whether there is intersection on one dimension"""
        return (a_min <= b_max) and (a_max >= b_min)
    
    def rect_overlaps(r1, r2):
        """Judge whether the two rectangles have intersection"""
        return range_overlap(r1.left, r1.right, r2.left, r2.right) and range_overlap(r1.bottom, r1.top, r2.bottom, r2.top)
    
    def rect_merge(r1, r2, mergeThresh):
        """Calculate the merge area of two rectangles"""
        if rect_overlaps(r1, r2):
            SI = abs(min(r1.right, r2.right) - max(r1.left, r2.left)) * abs(min(r1.top, r2.top) - max(r1.bottom, r2.bottom))
            SA = abs(r1.right - r1.left) * abs(r1.top - r1.bottom)
            SB = abs(r2.right - r2.left) * abs(r2.top - r2.bottom)
            S = SA + SB - SI
    
            ratio = float(SI) / float(S)
    
            if ratio > mergeThresh:
                return 1
        return 0
    
    def generateBoundingBox(featureMap, scale):
        boundingBox = []
        """We can calculate the stride from the architecture of the alexnet"""
        stride = 32
        """We need to get the boundingbox whose size is 227 * 227. When we trained the alexnet,
        we also resize the size of the input image to 227 * 227 in caffe"""
        cellSize = 227
    
        for (x, y), prob in np.ndenumerate(featureMap):
            if(prob >= 0.50):
                """Get the bounding box: we record the left-bottom and right-top coordinates"""
                boundingBox.append([float(stride * y) / scale, float(stride * x) / scale, float(stride * y + cellSize - 1) / scale,
                                   float(stride * x + cellSize - 1) / scale, prob])
        return boundingBox
    
    def nms_average(boxes, groupThresh = 2, overlapThresh=0.2):
        rects = []
    
        for i in range(len(boxes)):
            if boxes[i][4] > 0.2:
                """The box in here, we record the left-bottom coordinates(y, x) and the height and width"""
                rects.append([boxes[i, 0], boxes[i, 1], boxes[i, 2] - boxes[i, 0], boxes[i, 3] - boxes[i, 1]])
    
        rects, weights = cv2.groupRectangles(rects, groupThresh, overlapThresh)
    
        rectangles = []
        for i in range(len(rects)):
            testRect = Rect(Point(rects[i, 0], rects[i, 1]), Point(rects[i, 0] + rects[i, 2], rects[i, 1] + rects[i, 3]))
            rectangles.append(testRect)
        clusters = []
        for rect in rectangles:
            matched = 0
            for cluster in clusters:
                if (rect_merge(rect, cluster, 0.2)):
                    matched = 1
                    cluster.left = (cluster.left + rect.left) / 2
                    cluster.right = (cluster.right + rect.right) / 2
                    cluster.bottom = (cluster.bottom + rect.bottom) / 2
                    cluster.top = (cluster.top + rect.top) / 2
            if (not matched):
                clusters.append(rect)
    
        result_boxes = []
        for i in range(len(clusters)):
            result_boxes.append([clusters[i].left, clusters[i].bottom, clusters[i].right, clusters[i].top, 1])
    
        return result_boxes
    
    def face_detection(imgFile):
        net_fully_conv = caffe.Net(r"C:\Users\Administrator\Desktop\FaceDetection\deploy_full_conv.prototxt",
                                   r"C:\Users\Administrator\Desktop\FaceDetection\full_conv.caffemodel",
                                   caffe.TEST)
    
        scales = []
        factor = 0.793700526
    
        img = cv2.imread(imgFile)
        print(img.shape)
    
        largest = min(2, 4000 / max(img.shape[0:2]))
        scale = largest
        minD = largest * min(img.shape[0:2])
        while minD >= 227:
            scales.append(scale)
            scale *= factor
            minD *= factor
        total_boxes = []
    
        for scale in scales:
            # cv2.resize expects (width, height); img.shape is (height, width, channels)
            scale_img = cv2.resize(img, (int(img.shape[1] * scale), int(img.shape[0] * scale)))
            cv2.imwrite(r"C:\Users\Administrator\Desktop\FaceDetection\scale_img.jpg", scale_img)
            im = caffe.io.load_image(r"C:\Users\Administrator\Desktop\FaceDetection\scale_img.jpg")

            """Change the test input blob to the scaled image size"""
            net_fully_conv.blobs['data'].reshape(1, 3, scale_img.shape[0], scale_img.shape[1])
            transformer = caffe.io.Transformer({'data': net_fully_conv.blobs['data'].data.shape})
            transformer.set_transpose('data', (2, 0, 1))
            transformer.set_channel_swap('data', (2, 1, 0))
            transformer.set_raw_scale('data', 255.0)
    
            out = net_fully_conv.forward_all(data=np.asarray([transformer.preprocess('data', im)]))
            print(out['prob'][0, 1].shape)
    
            boxes = generateBoundingBox(out['prob'][0, 1], scale)
    
            if (boxes):
                total_boxes.extend(boxes)
        print(total_boxes)
        boxes_nms = np.array(total_boxes)
        true_boxes = nms_average(boxes_nms, 1, 0.2)

        if true_boxes:
            (x1, y1, x2, y2) = true_boxes[0][:-1]
            cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0))
            win = cv2.namedWindow('face detection', flags=0)
            cv2.imshow('face detection', img)
            cv2.waitKey(0)
    
    if __name__ == "__main__":
        img = r"C:UsersAdministratorDesktopFaceDetection	mp9055.jpg"
        face_detection(img)

    Because my computer is really underpowered, training took a very long time; it ran for several days and still did not get through many iterations, so the model is not trained very well. In this example, after some tuning, I found that a prob threshold of >= 0.5 when generating bounding boxes gave better results. The result is as follows:

    I also wrote training code with TensorFlow, but because of the weak machine, training was too slow and the accuracy poor. Something to keep studying later.

    2. TensorFlow implementation

    2.1 Convert the data into a TFRecord file for later training; the code is as follows:

    import tensorflow as tf
    import numpy as np
    import os
    import cv2
    from PIL import Image
    
    class0_path = "/home/sxj/DL/face_detect/train/train/0/"
    class1_path = "/home/sxj/DL/face_detect/train/train/1/"
    tf_output_dir = "/home/sxj/DL/face_detect/data/"
    tf_filename = "/home/sxj/DL/face_detect/data/train.tfrecord"
    SAMPLER_PER_FILES = 5000
    
    # Convert the dataset to TFRecord format
    def int64_feature(value):
        if not isinstance(value, list):
            value = [value]
        return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

    def bytes_feature(value):
        if not isinstance(value, list):
            value = [value]
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))


    def get_output_filename(tf_output_dir, dataset_name, fdx):
        return "%s/%s_%03d.tfrecord" % (tf_output_dir, dataset_name, fdx)
    
    # Convert the images with label 0 (faces)
    i = 0
    total_files = len(os.listdir(class0_path))
    train_writer = tf.python_io.TFRecordWriter(tf_filename)

    for img in os.listdir(class0_path):
        print("converting image %d/%d" % (i + 1, total_files))

        # load the image, resize to the network input size, and take the raw bytes
        file_name = class0_path + img
        img_raw = Image.open(file_name)
        img_raw = img_raw.resize((227, 227))
        img_data = img_raw.tobytes()
        # img_data = tf.gfile.FastGFile(file_name, 'rb').read()

        # wrap the image in an Example; the feature key 'image' must match the reader
        example = tf.train.Example(features=tf.train.Features(feature={
            'label': int64_feature(value=0),
            'image': bytes_feature(value=img_data)
        }))

        # serialize and write
        train_writer.write(example.SerializeToString())

        i += 1

    print("dataset converted successfully")
    
    # Convert the images with label 1 (non-faces)
    i = 0
    total_files = len(os.listdir(class1_path))

    for img in os.listdir(class1_path):
        print("converting image %d/%d" % (i + 1, total_files))

        # load the image, resize to the network input size, and take the raw bytes
        file_name = class1_path + img
        # img_data = tf.gfile.FastGFile(file_name, 'rb').read()
        img_raw = Image.open(file_name)
        img_raw = img_raw.resize((227, 227))
        img_data = img_raw.tobytes()

        # wrap the image in an Example; the feature key 'image' must match the reader
        example = tf.train.Example(features=tf.train.Features(feature={
            'label': int64_feature(value=1),
            'image': bytes_feature(value=img_data)
        }))

        # serialize and write
        train_writer.write(example.SerializeToString())

        i += 1
    
    train_writer.close()

    2.2 Reading the TFRecord file (the script below also contains the network definition, the training loop, and a first version of the detection code):

    import os
    import sys
    import numpy as np
    import math
    import cv2
    import random
    import tensorflow as tf
    import matplotlib.pyplot as plt
    
    slim = tf.contrib.slim
    
    
    batch_size = 32
    img_size = 227
    num_batches = 100
    train_step = 1000001
    model_path = "/home/sxj/DL/insightface/alex_model"
    model_name = 'Alex'
    addr = '/home/sxj/DL/face_detect/data/train.tfrecord'  # the file written by the conversion script above
    
    
    lr_steps = [40000, 60000, 80000]
    lr_values = [0.004, 0.002, 0.0012, 0.0004]
    
    
    class Point(object):
        def __init__(self, x, y):
            self.x = x
            self.y = y
    
    
    class Rect(object):
        def __init__(self, p1, p2):
            """Store the top, bottom, left, right values for points
            p1, p2 are the left-top and right-bottom points of the rectangle"""
            self.left = min(p1.x, p2.x)
            self.right = max(p1.x, p2.x)
            self.bottom = min(p1.y, p2.y)
            self.top = max(p1.y, p2.y)
    
        def __str__(self):
            return "Rect[%d, %d, %d, %d]" %(self.left, self.top, self.right, self.bottom)
    
    
    def calcDistance(x1, y1, x2, y2):
        dist = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
        return dist
    
    
    def range_overlap(a_min, a_max, b_min, b_max):
        """Judge whether there is intersection on one dimension"""
        return (a_min <= b_max) and (a_max >= b_min)
    
    
    def rect_overlaps(r1, r2):
        """Judge whether the two rectangles have intersection"""
        return range_overlap(r1.left, r1.right, r2.left, r2.right) and range_overlap(r1.bottom, r1.top, r2.bottom, r2.top)
    
    
    def rect_merge(r1, r2, mergeThresh):
        """Calculate the merge area of two rectangles"""
        if rect_overlaps(r1, r2):
            SI = abs(min(r1.right, r2.right) - max(r1.left, r2.left)) * abs(min(r1.top, r2.top) - max(r1.bottom, r2.bottom))
            SA = abs(r1.right - r1.left) * abs(r1.top - r1.bottom)
            SB = abs(r2.right - r2.left) * abs(r2.top - r2.bottom)
            S = SA + SB - SI
    
            ratio = float(SI) / float(S)
    
            if ratio > mergeThresh:
                return 1
        return 0
    
    
    def softmax(a, b):
        a0 = math.exp(a)
        a1 = math.exp(b)
        return a0/(a0 + a1)
    
    
    def generateBoundingBox(featureMap, scale):
        boundingBox = []
        """We can calculate the stride from the architecture of the alexnet"""
        stride = 32
        """We need to get the boundingbox whose size is 227 * 227. When we trained the alexnet,
        we also resize the size of the input image to 227 * 227 in caffe"""
        cellSize = 227
        # print(featureMap.shape[0])
        # featureMap = tf.nn.softmax(featureMap)
        # print(featureMap[0][0][0])
        # print(featureMap[0][0][1])
        for x in range(featureMap.shape[0]):
            for y in range(featureMap.shape[1]):
                prob = softmax(featureMap[x][y][0], featureMap[x][y][1])
                if prob > 0.4:
                    print("success")
                    boundingBox.append(
                        [float(stride * y) / scale, float(stride * x) / scale, float(stride * y + cellSize - 1) / scale,
                         float(stride * x + cellSize - 1) / scale, 1])
        return boundingBox
    
    
    
    def nms_average(boxes, groupThresh = 2, overlapThresh=0.2):
        rects = []
    
        for i in range(len(boxes)):
            if boxes[i][4] > 0.2:
                """The box in here, we record the left-bottom coordinates(y, x) and the height and width"""
                rects.append([boxes[i, 0], boxes[i, 1], boxes[i, 2] - boxes[i, 0], boxes[i, 3] - boxes[i, 1]])
    
        rects, weights = cv2.groupRectangles(rects, groupThresh, overlapThresh)
    
        rectangles = []
        for i in range(len(rects)):
            testRect = Rect(Point(rects[i, 0], rects[i, 1]), Point(rects[i, 0] + rects[i, 2], rects[i, 1] + rects[i, 3]))
            rectangles.append(testRect)
        clusters = []
        for rect in rectangles:
            matched = 0
            for cluster in clusters:
                if (rect_merge(rect, cluster, 0.2)):
                    matched = 1
                    cluster.left = (cluster.left + rect.left) / 2
                    cluster.right = (cluster.right + rect.right) / 2
                    cluster.bottom = (cluster.bottom + rect.bottom) / 2
                    cluster.top = (cluster.top + rect.top) / 2
            if (not matched):
                clusters.append(rect)
    
        result_boxes = []
        for i in range(len(clusters)):
            result_boxes.append([clusters[i].left, clusters[i].bottom, clusters[i].right, clusters[i].top, 1])
    
        return result_boxes
    
    
    def print_tensor_info(tensor):
        print("tensor name:", tensor.op.name, "-tensor shape:", tensor.get_shape().as_list())
    
    
    def read_single_tfrecord(addr, _batch_size, shape):
        filename_queue = tf.train.string_input_producer([addr], shuffle=True)
    
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
    
        features = tf.parse_single_example(serialized_example,
                                           features={
                                            'image': tf.FixedLenFeature([], tf.string),
                                            'label': tf.FixedLenFeature([], tf.int64)})
        img = tf.decode_raw(features['image'], tf.uint8)
        label = tf.cast(features['label'], tf.int32)
        img = tf.reshape(img, [shape, shape, 3])
        # img = augmentation(img)
        img=(tf.cast(img, tf.float32)-127.5)/128
        min_after_dequeue = 10000
        batch_size = _batch_size
        capacity = min_after_dequeue + 10 * batch_size
        image_batch, label_batch = tf.train.shuffle_batch([img, label],
                                                            batch_size=batch_size,
                                                            capacity=capacity,
                                                            min_after_dequeue=min_after_dequeue,
                                                            num_threads=4)
    
        label_batch = tf.reshape(label_batch, [batch_size])
    
        return image_batch, label_batch
    
    
    def Network(images, is_train):
        # input 227*227
        with tf.variable_scope('vgg_16'):
            with slim.arg_scope([slim.conv2d, slim.max_pool2d], padding='SAME'):
                conv1 = slim.conv2d(images, 96, [11, 11], stride=[4, 4], scope='conv_1')  # 57*57*96 (SAME padding)
                print_tensor_info(conv1)
                pool1 = slim.max_pool2d(conv1, [3, 3], stride=[2, 2], scope='pool_1')  # 29*29*96
                print_tensor_info(pool1)
    
                conv2 = slim.conv2d(pool1, 256, [5, 5], stride=[1, 1], scope='conv_2')
                print_tensor_info(conv2)
                pool2 = slim.max_pool2d(conv2, [3, 3], stride=[2, 2], scope='pool_2')
                print_tensor_info(pool2)
    
                conv3 = slim.conv2d(pool2, 384, [3, 3], stride=[1, 1], scope='conv_3')
                print_tensor_info(conv3)
    
                conv4 = slim.conv2d(conv3, 384, [3, 3], stride=[1, 1], scope='conv_4')
                print_tensor_info(conv4)
    
                conv5 = slim.conv2d(conv4, 256, [3, 3], stride=[1, 1], scope='conv_5')
                pool5 = slim.max_pool2d(conv5, [3, 3], stride=[2, 2], scope='pool_5')
                print_tensor_info(pool5)
    
                conv6 = slim.conv2d(pool5, 256, [3, 3], stride=[1, 1], scope='conv_6')
                print_tensor_info(conv6)
    
                output = slim.conv2d(conv6, 2, [8, 8], stride=[8, 8], scope='output')
                print_tensor_info(output)
                if is_train:
                    output = tf.reshape(output, [-1, 2])
                    print_tensor_info(output)
                else:
                    output = tf.squeeze(output, axis=0)
                    output = tf.nn.softmax(output)
                    print_tensor_info(output)
                return output
    
    
    
    def train():
        image = tf.placeholder(tf.float32, [batch_size, img_size, img_size, 3], name='image')
        label = tf.placeholder(tf.int32, [batch_size], name='label')
        logit = Network(image, True)  # [batch, 2]
        train_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logit, labels=label))
    
        train_images, train_labels = read_single_tfrecord(addr, batch_size, img_size)
    
        with tf.name_scope('loss'):
            tf.summary.scalar('train_loss', train_loss)
    
        global_step = tf.Variable(name='global_step', initial_value=0, trainable=False)
        inc_op = tf.assign_add(global_step, 1, name='increment_global_step')
    
        scale = int(128.0/batch_size)
        _lr_steps = [scale*s for s in lr_steps]
        _lr_values = [v/scale for v in lr_values]
        lr = tf.train.piecewise_constant(global_step, boundaries=_lr_steps, values=_lr_values, name='lr_schedule')
    
        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
        with tf.control_dependencies(update_ops):
            train_op = tf.train.MomentumOptimizer(learning_rate=lr, momentum=0.9).minimize(train_loss)
    
        with tf.name_scope('accuracy'):
            # label = tf.one_hot(label, 2)
            train_accuracy = tf.reduce_mean(
                tf.cast(tf.equal(tf.to_int32(tf.argmax(tf.nn.softmax(logit), axis=1)), label), tf.float32))
            tf.summary.scalar('train_accuracy', train_accuracy)
    
        saver = tf.train.Saver(max_to_keep=5)
        merged = tf.summary.merge_all()
    
        with tf.Session() as sess:
            sess.run((tf.global_variables_initializer(),
                      tf.local_variables_initializer()))
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    
            writer_train = tf.summary.FileWriter("/home/sxj/DL/insightface/alex_model/%s" % (model_name), sess.graph)
    
            try:
    
                for i in range(1, train_step):
                    image_batch, label_batch = sess.run([train_images, train_labels])
                    sess.run([train_op, inc_op], feed_dict={image: image_batch, label: label_batch})
                    if (i % 100 == 0):
                        summary = sess.run(merged, feed_dict={image: image_batch, label: label_batch})
                        writer_train.add_summary(summary, i)
                    if (i % 1000 == 0):
                        print('step', i)
                        print('train_accuracy',
                              sess.run(train_accuracy, feed_dict={image: image_batch, label: label_batch}))
                        print('train_loss', sess.run(train_loss,
                                                     {image: image_batch, label: label_batch}))
                        if (i % 10000 == 0):
                            saver.save(sess, os.path.join(model_path, model_name), global_step=i)
    
            except tf.errors.OutOfRangeError:
                print("finished")
            finally:
                coord.request_stop()
                writer_train.close()
    
    
    def face_detection(imgFile):
        scales = []
        factor = 0.793700526
    
        img = cv2.imread(imgFile)
    
        # image = tf.placeholder(tf.float32, name='image')
        largest = min(2, 4000 / max(img.shape[0:2]))
        scale = largest
        minD = largest * min(img.shape[0:2])
        while minD >= 227:
            scales.append(scale)
            scale *= factor
            minD *= factor
        total_boxes = []
    
        for scale in scales:
            image = tf.placeholder(tf.float32, name='image')
            # cv2.resize expects (width, height); img.shape is (height, width, channels)
            scale_img = cv2.resize(img, (int(img.shape[1] * scale), int(img.shape[0] * scale)))
    
            cv2.imwrite(r"/home/sxj/DL/face_detect/scale_img.jpg", scale_img)
    
            im = cv2.imread(r"/home/sxj/DL/face_detect/scale_img.jpg")
    
            image_reshape = tf.reshape(image, [1, im.shape[0], im.shape[1], 3])
            logit = Network(image_reshape, False)
            with tf.Session() as sess:
                sess.run((tf.global_variables_initializer(),
                          tf.local_variables_initializer()))
    
                saver = tf.train.Saver()
                saver.restore(sess, "/home/sxj/DL/insightface/alex_model/Alex-80000")
                # NOTE: training normalized inputs with (x - 127.5) / 128; feeding raw
                # pixels here is a mismatch that may hurt accuracy
                heatmap = sess.run(logit, feed_dict={image: im})

                boxes = generateBoundingBox(heatmap, scale)
    
                if (boxes):
                    total_boxes.extend(boxes)
    
            tf.reset_default_graph()
        print(total_boxes)

        boxes_nms = np.array(total_boxes)
        true_boxes = nms_average(boxes_nms, 1, 0.2)

        if true_boxes:
            (x1, y1, x2, y2) = true_boxes[0][:-1]
            cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0))
            # cv2's display is unreliable on my machine, so use matplotlib
            plt.imshow(img)
            plt.show()
    
    
    def run_benchmark():
        with tf.Graph().as_default():
            image_size = 227
            # generate random images from a Gaussian distribution
            images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3],
                                                  dtype=tf.float32, stddev=0.1))
            Network(images, True)  # Network takes (images, is_train)
            init = tf.global_variables_initializer()
            sess = tf.Session()
            sess.run(init)
    
    
    if __name__ == "__main__":
        train()
        # img = "/home/sxj/DL/face_detect/c.jpg"
        # face_detection(img)

    Training and testing code (a second, cleaned-up version):

    import os
    import sys
    import numpy as np
    import math
    import cv2
    import random
    import tensorflow as tf
    import matplotlib.pyplot as plt
    
    # cv2's image display is broken on my machine, so matplotlib is used to show images
    
    slim = tf.contrib.slim
    
    
    batch_size = 32
    img_size = 227
    num_batches = 100
    train_step = 1000001
    model_path = "/home/sxj/DL/insightface/alex_model"
    model_name = 'Alex'
    addr = '/home/sxj/DL/face_detect/data/train.tfrecord'  # the file written by the conversion script above
    
    
    lr_steps = [40000, 60000, 80000]
    lr_values = [0.004, 0.002, 0.0012, 0.0004]
    
    
    class Point(object):
        def __init__(self, x, y):
            self.x = x
            self.y = y
    
    
    class Rect(object):
        def __init__(self, p1, p2):
            """Store the top, bottom, left, right values for points
            p1, p2 are the left-top and right-bottom points of the rectangle"""
            self.left = min(p1.x, p2.x)
            self.right = max(p1.x, p2.x)
            self.bottom = min(p1.y, p2.y)
            self.top = max(p1.y, p2.y)
    
        def __str__(self):
            return "Rect[%d, %d, %d, %d]" %(self.left, self.top, self.right, self.bottom)
    
    
    def calcDistance(x1, y1, x2, y2):
        dist = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
        return dist
    
    
    def range_overlap(a_min, a_max, b_min, b_max):
        """Judge whether there is intersection on one dimension"""
        return (a_min <= b_max) and (a_max >= b_min)
    
    
    def rect_overlaps(r1, r2):
        """Judge whether the two rectangles have intersection"""
        return range_overlap(r1.left, r1.right, r2.left, r2.right) and range_overlap(r1.bottom, r1.top, r2.bottom, r2.top)
    
    
    def rect_merge(r1, r2, mergeThresh):
        """Calculate the merge area of two rectangles"""
        if rect_overlaps(r1, r2):
            SI = abs(min(r1.right, r2.right) - max(r1.left, r2.left)) * abs(min(r1.top, r2.top) - max(r1.bottom, r2.bottom))
            SA = abs(r1.right - r1.left) * abs(r1.top - r1.bottom)
            SB = abs(r2.right - r2.left) * abs(r2.top - r2.bottom)
            S = SA + SB - SI
    
            ratio = float(SI) / float(S)
    
            if ratio > mergeThresh:
                return 1
        return 0
    
    
    def softmax(a, b):
        a0 = math.exp(a)
        a1 = math.exp(b)
        return a0/(a0 + a1)
    
    
    def generateBoundingBox(featureMap, scale):
        boundingBox = []
        """We can calculate the stride from the architecture of the alexnet"""
        stride = 32
        """We need to get the boundingbox whose size is 227 * 227. When we trained the alexnet,
        we also resize the size of the input image to 227 * 227 in caffe"""
        cellSize = 227
    
        for x in range(featureMap.shape[0]):
            for y in range(featureMap.shape[1]):
                if featureMap[x][y] > 0.8:
                    boundingBox.append(
                        [float(stride * y) / scale, float(stride * x) / scale, float(stride * y + cellSize - 1) / scale,
                         float(stride * x + cellSize - 1) / scale, featureMap[x][y]])
    
        return boundingBox
    
    
    def nms_average(boxes, groupThresh = 2, overlapThresh=0.2):
        rects = []
    
        for i in range(len(boxes)):
            if boxes[i][4] > 0.2:
                """The box in here, we record the left-bottom coordinates(y, x) and the height and width"""
                rects.append([boxes[i, 0], boxes[i, 1], boxes[i, 2] - boxes[i, 0], boxes[i, 3] - boxes[i, 1]])
    
        rects, weights = cv2.groupRectangles(rects, groupThresh, overlapThresh)
    
        rectangles = []
        for i in range(len(rects)):
            testRect = Rect(Point(rects[i, 0], rects[i, 1]), Point(rects[i, 0] + rects[i, 2], rects[i, 1] + rects[i, 3]))
            rectangles.append(testRect)
        clusters = []
        for rect in rectangles:
            matched = 0
            for cluster in clusters:
                if (rect_merge(rect, cluster, 0.2)):
                    matched = 1
                    cluster.left = (cluster.left + rect.left) / 2
                    cluster.right = (cluster.right + rect.right) / 2
                    cluster.bottom = (cluster.bottom + rect.bottom) / 2
                    cluster.top = (cluster.top + rect.top) / 2
            if (not matched):
                clusters.append(rect)
    
        result_boxes = []
        for i in range(len(clusters)):
            result_boxes.append([clusters[i].left, clusters[i].bottom, clusters[i].right, clusters[i].top, 1])
    
        return result_boxes
    
    
    def print_tensor_info(tensor):
        print("tensor name:", tensor.op.name, "-tensor shape:", tensor.get_shape().as_list())
    
    
    def read_single_tfrecord(addr, _batch_size, shape):
        filename_queue = tf.train.string_input_producer([addr], shuffle=True)
    
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
    
        features = tf.parse_single_example(serialized_example,
                                           features={
                                            'image': tf.FixedLenFeature([], tf.string),
                                            'label': tf.FixedLenFeature([], tf.int64)})
        img = tf.decode_raw(features['image'], tf.uint8)
        label = tf.cast(features['label'], tf.int32)
        img = tf.reshape(img, [shape, shape, 3])
        # img = augmentation(img)
        img=(tf.cast(img, tf.float32)-127.5)/128
        min_after_dequeue = 10000
        batch_size = _batch_size
        capacity = min_after_dequeue + 10 * batch_size
        image_batch, label_batch = tf.train.shuffle_batch([img, label],
                                                            batch_size=batch_size,
                                                            capacity=capacity,
                                                            min_after_dequeue=min_after_dequeue,
                                                            num_threads=4)
    
        label_batch = tf.reshape(label_batch, [batch_size])
    
        return image_batch, label_batch
    
    
    def Network(images, is_train):
        # input 227*227
        with tf.variable_scope('vgg_16'):
            with slim.arg_scope([slim.conv2d, slim.max_pool2d], padding='VALID'):
                conv1 = slim.conv2d(images, 96, [11, 11], stride=[4, 4], scope='conv_1')  # 55*55*96
                print_tensor_info(conv1)
                pool1 = slim.max_pool2d(conv1, [3, 3], stride=[2, 2], scope='pool_1')  # 27*27*96
                print_tensor_info(pool1)
    
                conv2 = slim.conv2d(pool1, 256, [5, 5], stride=[1, 1], scope='conv_2')
                print_tensor_info(conv2)
                pool2 = slim.max_pool2d(conv2, [3, 3], stride=[2, 2], scope='pool_2')
                print_tensor_info(pool2)
    
                conv3 = slim.conv2d(pool2, 384, [3, 3], stride=[1, 1], scope='conv_3')
                print_tensor_info(conv3)
    
                conv4 = slim.conv2d(conv3, 384, [3, 3], stride=[1, 1], scope='conv_4')
                print_tensor_info(conv4)
    
                conv5 = slim.conv2d(conv4, 256, [3, 3], stride=[1, 1], scope='conv_5')
                pool5 = slim.max_pool2d(conv5, [3, 3], stride=[2, 2], scope='pool_5')
                print_tensor_info(pool5)
    
                conv6 = slim.conv2d(pool5, 256, [2, 2], stride=[1, 1], scope='conv_6')
                print_tensor_info(conv6)
    
                output = slim.conv2d(conv6, 2, [1, 1], stride=[1, 1], scope='output')
                print_tensor_info(output)
                if is_train:
                    output = tf.reshape(output, [-1, 2])
                    print_tensor_info(output)
                else:
                    output = tf.squeeze(output, axis=0)
                    output = tf.nn.softmax(output)
                    print_tensor_info(output)
                return output
    
    
    def train():
        image = tf.placeholder(tf.float32, [batch_size, img_size, img_size, 3], name='image')
        label = tf.placeholder(tf.int32, [batch_size], name='label')
        logit = Network(image, True)  # [batch, 2]
        train_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logit, labels=label))
    
        train_images, train_labels = read_single_tfrecord(addr, batch_size, img_size)
    
        with tf.name_scope('loss'):
            tf.summary.scalar('train_loss', train_loss)
    
        global_step = tf.Variable(name='global_step', initial_value=0, trainable=False)
        inc_op = tf.assign_add(global_step, 1, name='increment_global_step')
    
        scale = int(128.0/batch_size)
        _lr_steps = [scale*s for s in lr_steps]
        _lr_values = [v/scale for v in lr_values]
        lr = tf.train.piecewise_constant(global_step, boundaries=_lr_steps, values=_lr_values, name='lr_schedule')
    
        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
        with tf.control_dependencies(update_ops):
            train_op = tf.train.MomentumOptimizer(learning_rate=lr, momentum=0.9).minimize(train_loss)
    
        with tf.name_scope('accuracy'):
            # label = tf.one_hot(label, 2)
            train_accuracy = tf.reduce_mean(
                tf.cast(tf.equal(tf.to_int32(tf.argmax(tf.nn.softmax(logit), axis=1)), label), tf.float32))
            tf.summary.scalar('train_accuracy', train_accuracy)
    
        saver = tf.train.Saver(max_to_keep=5)
        merged = tf.summary.merge_all()
    
        with tf.Session() as sess:
            sess.run((tf.global_variables_initializer(),
                      tf.local_variables_initializer()))
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    
            saver.restore(sess, '/home/sxj/DL/insightface/alex_model/Alex-40000')
    
            writer_train = tf.summary.FileWriter("/home/sxj/DL/insightface/alex_model/%s" % (model_name), sess.graph)
    
            try:
    
                for i in range(1, train_step):
                    image_batch, label_batch = sess.run([train_images, train_labels])
                    sess.run([train_op, inc_op], feed_dict={image: image_batch, label: label_batch})
                    if (i % 100 == 0):
                        summary = sess.run(merged, feed_dict={image: image_batch, label: label_batch})
                        writer_train.add_summary(summary, i)
                    if (i % 1000 == 0):
                        print('step', i)
                        print('train_accuracy',
                              sess.run(train_accuracy, feed_dict={image: image_batch, label: label_batch}))
                        print('train_loss', sess.run(train_loss,
                                                     {image: image_batch, label: label_batch}))
                        if (i % 10000 == 0):
                            saver.save(sess, os.path.join(model_path, model_name), global_step=i)
    
            except tf.errors.OutOfRangeError:
                print("finished")
            finally:
                coord.request_stop()
                writer_train.close()
    
    
    
    def face_detection(imgFile):
        scales = []
        factor = 0.793700526
    
        img = cv2.imread(imgFile)
    
        # image = tf.placeholder(tf.float32, name='image')
        largest = min(2, 4000 / max(img.shape[0:2]))
        scale = largest
        minD = largest * min(img.shape[0:2])
        while minD >= 227:
            scales.append(scale)
            scale *= factor
            minD *= factor
        total_boxes = []
    
        for scale in scales:
            print("scale:", scale)
            image = tf.placeholder(tf.float32, name='image')
            # cv2.resize expects (width, height); img.shape is (height, width, channels)
            scale_img = cv2.resize(img, (int(img.shape[1] * scale), int(img.shape[0] * scale)))
    
            image_reshape = tf.reshape(image, [1, scale_img.shape[0], scale_img.shape[1], 3])
            logit = Network(image_reshape, False)
            with tf.Session() as sess:
                sess.run((tf.global_variables_initializer(),
                          tf.local_variables_initializer()))
    
                saver = tf.train.Saver()
                saver.restore(sess, "/home/sxj/DL/insightface/alex_model/Alex-50000")
                # NOTE: training normalized inputs with (x - 127.5) / 128; feeding raw
                # pixels here is a mismatch that may hurt accuracy
                heatmap = sess.run(logit, feed_dict={image: scale_img})

                boxes = generateBoundingBox(heatmap[:, :, 0], scale)
    
                if boxes:
                    total_boxes.extend(boxes)
    
            tf.reset_default_graph()
        print(total_boxes)

        boxes_nms = np.array(total_boxes)
        true_boxes = nms_average(boxes_nms, 1, 0.2)

        if true_boxes:
            (x1, y1, x2, y2) = true_boxes[0][:-1]
            cv2.rectangle(img, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0))
            plt.imshow(img)
            plt.show()
    
    
    
    if __name__ == "__main__":
        train()
        # img = "/home/sxj/DL/face_detect/tmp9055.jpg"
        # face_detection(img)

    The training process looks like this:

    Test results:

    The TensorFlow implementation may still have flaws; the actual results are not great and it is still being optimized. Suggestions are welcome.

    Note: I am currently learning AI. This example implements face detection by following a video course plus hands-on practice, and is only a record of my own learning.
