Deep-Learning-Based Face Recognition System (Caffe + OpenCV + Dlib) [Part 3]: Feature Extraction with the VGG Network


    Preface

    The deep-learning-based face recognition system uses five open-source libraries: OpenCV (computer vision), Caffe (deep learning), Dlib (machine learning), libfacedetection (face detection) and cuDNN (GPU acceleration).
    It also uses one open-source deep-learning model: the VGG model.
    The end result is very satisfying: recognizing one face takes 0.039 s and, most importantly, the accuracy is high!
    CPU: Intel i5-4590
    GPU: GTX 980
    OS: Windows 10
    OpenCV version: 3.1 (the exact version does not matter)
    Caffe version: Microsoft Caffe (Microsoft's build of Caffe; easy to set up, highly recommended)
    Dlib version: 19.0 (the exact version does not matter either)
    CUDA version: 7.5
    cuDNN version: 4
    libfacedetection: a release from June onwards (this one does matter; the 64-bit build only appeared after June)
    This series is written entirely in C++. If you run into any problems, feel free to leave a comment under the post so we can learn from each other.
    ====================================================================

    This is the third post in the series. It shows how to use the VGG network model together with Caffe's MemoryData layer to extract features from an OpenCV Mat.

    Approach

    The VGG network model was proposed by the Visual Geometry Group at the University of Oxford and reaches about 97% accuracy on the LFW database. The network consists of five groups of convolutional layers (conv1 through conv5), two fully connected feature layers (fc6, fc7) and one fully connected classification layer (fc8); for the details, read its prototxt file. The model and configuration files can be downloaded here:
    http://www.robots.ox.ac.uk/~vgg/software/vgg_face/
    Back to Caffe. Extracting image features in Caffe itself is easy: it ships with extract_features.exe, which reads data in lmdb or leveldb format. For how to use that tool, see this post of mine:
    http://blog.csdn.net/mr_curry/article/details/52097529
    In our own program, however, we want something more flexible, so that approach is not practical. Among Caffe's data layers there is type: MemoryData, which we can use to extract features directly from a Mat.
    Note: you first need to set up the Caffe property sheet as described in the first post of this series:
    http://blog.csdn.net/mr_curry/article/details/52443126

    Implementation

    First, open VGG_FACE_deploy.prototxt and look at the structure of the VGG network.
    (screenshot: the layer definitions in VGG_FACE_deploy.prototxt)
    Interestingly, the MemoryData layer needs the image mean, but the official site does not provide a mean file. We can pass the per-channel mean values in directly instead:

        mean_value:129.1863
        mean_value:104.7624
        mean_value:93.5940

    We also need to modify its data layer (you can replace the data layer of the downloaded prototxt with the block below):

       layer {
      name: "data"
      type: "MemoryData"
      top: "data"
      top: "label"
      transform_param {
        mirror: false
        crop_size: 224
        mean_value:129.1863
        mean_value:104.7624
        mean_value:93.5940
      }
      memory_data_param {
        batch_size: 1
        channels:3
        height:224
    width:224
      }
    }
    

    To avoid overwriting the original file, save it as vgg_extract_feature_memorydata.prototxt.
    (screenshot: the new vgg_extract_feature_memorydata.prototxt)
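    Note that memory_data_param fixes the network input to a single 3-channel 224x224 image and crop_size is also 224, so whatever Mat we later push into this network should be a 3-channel image of (at least) 224x224. Below is a minimal preprocessing sketch in plain OpenCV; the helper name PrepareFaceROI is only an illustration, not part of the series code:

    #include <opencv2/opencv.hpp>

    // Resize an arbitrary face ROI to the 224x224, 3-channel BGR input the data layer expects.
    cv::Mat PrepareFaceROI(const cv::Mat& face)
    {
        cv::Mat input;
        cv::resize(face, input, cv::Size(224, 224));
        if (input.channels() == 1)
            cv::cvtColor(input, input, cv::COLOR_GRAY2BGR);// make sure the network gets 3 channels
        return input;
    }

    Calling something like this on the face region before handing it to the network keeps the MemoryData layer's size checks happy.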
    Good, now we can start coding. Add the Caffe property sheet to the project:
    (screenshot: the Caffe property sheet added in Visual Studio)
    Then create caffe_net_memorylayer.h, ExtractFeature_.h and ExtractFeature_.cpp and start writing.
    caffe_net_memorylayer.h:

    #include "caffe/layers/input_layer.hpp"  
    #include "caffe/layers/inner_product_layer.hpp"  
    #include "caffe/layers/dropout_layer.hpp"  
    #include "caffe/layers/conv_layer.hpp"  
    #include "caffe/layers/relu_layer.hpp"  
    #include <iostream> 
    #include "caffe/caffe.hpp"
    #include <opencv.hpp>
    #include <caffe/layers/memory_data_layer.hpp>
    #include "caffe/layers/pooling_layer.hpp"  
    #include "caffe/layers/lrn_layer.hpp"  
    #include "caffe/layers/softmax_layer.hpp"  
    // Globals used by ExtractFeature_.cpp: the MemoryData input layer and the net itself.
    // They must be defined before any feature extraction; include this header in one .cpp only.
    caffe::MemoryDataLayer<float> *memory_layer;
    caffe::Net<float>* net;

    ExtractFeature_.h:

    #include <opencv.hpp>
    using namespace cv;
    using namespace std;
    
    std::vector<float> ExtractFeature(Mat FaceROI);// takes a face image (Mat) and returns its feature vector
    void Caffe_Predefine();

    ExtractFeature_.cpp:

    #include <ExtractFeature_.h>
    #include <caffe_net_memorylayer.h>
    namespace caffe
    {
        extern INSTANTIATE_CLASS(InputLayer);
        extern INSTANTIATE_CLASS(InnerProductLayer);
        extern INSTANTIATE_CLASS(DropoutLayer);
        extern INSTANTIATE_CLASS(ConvolutionLayer);
        REGISTER_LAYER_CLASS(Convolution);
        extern INSTANTIATE_CLASS(ReLULayer);
        REGISTER_LAYER_CLASS(ReLU);
        extern INSTANTIATE_CLASS(PoolingLayer);
        REGISTER_LAYER_CLASS(Pooling);
        extern INSTANTIATE_CLASS(LRNLayer);
        REGISTER_LAYER_CLASS(LRN);
        extern INSTANTIATE_CLASS(SoftmaxLayer);
        REGISTER_LAYER_CLASS(Softmax);
        extern INSTANTIATE_CLASS(MemoryDataLayer);
    }
    template <typename Dtype>
    caffe::Net<Dtype>* Net_Init_Load(std::string param_file, std::string pretrained_param_file, caffe::Phase phase)
    {
        // Build the net from the prototxt passed in and load the pretrained weights.
        caffe::Net<Dtype>* net(new caffe::Net<Dtype>(param_file, phase));
        net->CopyTrainedLayersFrom(pretrained_param_file);
        return net;
    }
    
    void Caffe_Predefine()// must be called once at startup, before any feature extraction
    {
        caffe::Caffe::set_mode(caffe::Caffe::GPU);
        net = Net_Init_Load<float>("vgg_extract_feature_memorydata.prototxt", "VGG_FACE.caffemodel", caffe::TEST);
        memory_layer = (caffe::MemoryDataLayer<float> *)net->layers()[0].get();
    }
    
    std::vector<float> ExtractFeature(Mat FaceROI)
    {
        caffe::Caffe::set_mode(caffe::Caffe::GPU);
        std::vector<Mat> test;
        std::vector<int> testLabel;
        std::vector<float> test_vector;
        test.push_back(FaceROI);
        testLabel.push_back(0);
        memory_layer->AddMatVector(test, testLabel);// memory_layer and net must be defined as global variables
        test.clear(); testLabel.clear();
        std::vector<caffe::Blob<float>*> input_vec;
        net->Forward(input_vec);
        boost::shared_ptr<caffe::Blob<float>> fc8 = net->blob_by_name("fc8");
        int test_num = 0;
        while (test_num < 2622)
        {
            test_vector.push_back(fc8->data_at(0, test_num++, 0, 0));// for an fc blob the spatial indices are (0, 0)
        }
        return test_vector;
    }

    ============= Note: the block above can also be written like this: ==============
    (the first and one-past-the-end addresses of the output are known directly, so we can construct the vector from them)

            float* begin = nullptr;
            float* end = nullptr;
            begin = fc8->mutable_cpu_data();
            end = begin + fc8->channels();
            CHECK(begin != nullptr);
            CHECK(end != nullptr);
            std::vector<float> FaceVector{ begin, end };
            return FaceVector;// returning the local directly enables NRVO; std::move is unnecessary

    Pay special attention to this part:

    namespace caffe
    {
        extern INSTANTIATE_CLASS(InputLayer);
        extern INSTANTIATE_CLASS(InnerProductLayer);
        extern INSTANTIATE_CLASS(DropoutLayer);
        extern INSTANTIATE_CLASS(ConvolutionLayer);
        REGISTER_LAYER_CLASS(Convolution);
        extern INSTANTIATE_CLASS(ReLULayer);
        REGISTER_LAYER_CLASS(ReLU);
        extern INSTANTIATE_CLASS(PoolingLayer);
        REGISTER_LAYER_CLASS(Pooling);
        extern INSTANTIATE_CLASS(LRNLayer);
        REGISTER_LAYER_CLASS(LRN);
        extern INSTANTIATE_CLASS(SoftmaxLayer);
        REGISTER_LAYER_CLASS(Softmax);
        extern INSTANTIATE_CLASS(MemoryDataLayer);
    }

    Why add these? Because during extraction I found that, without them, some layers end up unregistered. I helped someone resolve exactly this problem on the Microsoft/caffe GitHub repository. Here is what the issue looks like:
    (screenshot: the unregistered-layer issue reported on Microsoft/caffe)
    With the code above in place these layers are effectively registered, so the problem no longer occurs.
    During extraction I take the features of the fc8 layer, which are 2622-dimensional. Of course, the last layer is already the classification output, so it is usually better to take the 4096-dimensional fc7 features instead; a sketch of that change is shown below.
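    For reference, switching to fc7 only changes the blob name and the vector length. A hedged sketch of how the tail of ExtractFeature() could look (same globals as above, reading the blob through cpu_data() rather than data_at()):

        // After net->Forward(input_vec): read the 4096-D fc7 output instead of fc8.
        boost::shared_ptr<caffe::Blob<float>> fc7 = net->blob_by_name("fc7");
        const float* begin = fc7->cpu_data();       // first of the 4096 values
        const float* end = begin + fc7->channels(); // fc7 has 4096 output channels
        return std::vector<float>(begin, end);

    Keep in mind that relu7 and drop7 run in place, so the values you read from the fc7 blob are post-ReLU (dropout does nothing at test time).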
    Now, this function:

    void Caffe_Predefine()// must be called once at startup, before any feature extraction
    {
        caffe::Caffe::set_mode(caffe::Caffe::GPU);
        net = Net_Init_Load<float>("vgg_extract_feature_memorydata.prototxt", "VGG_FACE.caffemodel", caffe::TEST);
        memory_layer = (caffe::MemoryDataLayer<float> *)net->layers()[0].get();
    }

    is an initialization function that passes in the VGG network model and the feature-extraction prototxt, so obviously, before extracting any features you must first call:

    Caffe_Predefine();

    Once this has run, these global variables remain usable for the rest of the program.
    Now let's try the feature-extraction interface. Create a main.cpp and call it:

    #include <ExtractFeature_.h>
    int main()
    {
        Caffe_Predefine();
        Mat lena = imread("lena.jpg");
        if (!lena.empty())
        {
            ExtractFeature(lena);
        }
    
    }

    Since what we get back is a vector<float>, we can print its elements one by one to have a look. You could of course do this inside ExtractFeature() itself, but here we do it in main().
    Let's see:

    #include <ExtractFeature_.h>
    int main()
    {
        Caffe_Predefine();
        Mat lena = imread("lena.jpg");
        if (!lena.empty())
        {
            size_t i = 0;
            vector<float> print = ExtractFeature(lena);
            while (i < print.size())
            {
                cout << print[i++] << endl;
            }
            imshow("Extract feature", lena);// only display the image when it loaded successfully
            waitKey(0);
        }
    }

    For this image, the extracted features are just a long list of numbers:
    (screenshot: the printed 2622 feature values)
    Extracting the features of a 224*224 image takes 0.019 s, so the GPU acceleration clearly pays off, and this is only a GTX 980; I can only wonder how fast a Titan X would be.
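    If you want to reproduce that timing on your own machine, a simple tick-count timer around the call is enough. A minimal sketch reusing the interface above (lena.jpg is just a stand-in image; the first forward pass may be a bit slower while CUDA warms up):

    #include <ExtractFeature_.h>
    #include <iostream>
    int main()
    {
        Caffe_Predefine();
        Mat face = imread("lena.jpg");
        if (face.empty()) return -1;
        resize(face, face, Size(224, 224));// match the network input size

        double t0 = (double)getTickCount();
        vector<float> feature = ExtractFeature(face);
        double seconds = ((double)getTickCount() - t0) / getTickFrequency();

        cout << "Extracted a " << feature.size() << "-D feature in " << seconds << " s" << endl;
        return 0;
    }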

    Appendix: the full network definition (prototxt). Note the difference between layer (the current format, used here) and layers (the old format):

    name: "VGG_FACE_16_layer"
    layer {
      name: "data"
      type: "MemoryData"
      top: "data"
      top: "label"
      transform_param {
        mirror: false
        crop_size: 224
        mean_value:129.1863
        mean_value:104.7624
        mean_value:93.5940
      }
      memory_data_param {
        batch_size: 1
        channels:3
        height:224
    width:224
      }
    }
    layer {
      bottom: "data"
      top: "conv1_1"
      name: "conv1_1"
      type: "Convolution"
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv1_1"
      top: "conv1_1"
      name: "relu1_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv1_1"
      top: "conv1_2"
      name: "conv1_2"
      type: "Convolution"
      convolution_param {
        num_output: 64
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv1_2"
      top: "conv1_2"
      name: "relu1_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv1_2"
      top: "pool1"
      name: "pool1"
      type: "Pooling"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      bottom: "pool1"
      top: "conv2_1"
      name: "conv2_1"
      type: "Convolution"
      convolution_param {
        num_output: 128
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv2_1"
      top: "conv2_1"
      name: "relu2_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv2_1"
      top: "conv2_2"
      name: "conv2_2"
      type: "Convolution"
      convolution_param {
        num_output: 128
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv2_2"
      top: "conv2_2"
      name: "relu2_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv2_2"
      top: "pool2"
      name: "pool2"
      type: "Pooling"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      bottom: "pool2"
      top: "conv3_1"
      name: "conv3_1"
      type: "Convolution"
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv3_1"
      top: "conv3_1"
      name: "relu3_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv3_1"
      top: "conv3_2"
      name: "conv3_2"
      type: "Convolution"
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv3_2"
      top: "conv3_2"
      name: "relu3_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv3_2"
      top: "conv3_3"
      name: "conv3_3"
      type: "Convolution"
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv3_3"
      top: "conv3_3"
      name: "relu3_3"
      type: "ReLU"
    }
    layer {
      bottom: "conv3_3"
      top: "pool3"
      name: "pool3"
      type: "Pooling"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      bottom: "pool3"
      top: "conv4_1"
      name: "conv4_1"
      type: "Convolution"
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv4_1"
      top: "conv4_1"
      name: "relu4_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv4_1"
      top: "conv4_2"
      name: "conv4_2"
      type: "Convolution"
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv4_2"
      top: "conv4_2"
      name: "relu4_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv4_2"
      top: "conv4_3"
      name: "conv4_3"
      type: "Convolution"
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv4_3"
      top: "conv4_3"
      name: "relu4_3"
      type: "ReLU"
    }
    layer {
      bottom: "conv4_3"
      top: "pool4"
      name: "pool4"
      type: "Pooling"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      bottom: "pool4"
      top: "conv5_1"
      name: "conv5_1"
      type: "Convolution"
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv5_1"
      top: "conv5_1"
      name: "relu5_1"
      type: "ReLU"
    }
    layer {
      bottom: "conv5_1"
      top: "conv5_2"
      name: "conv5_2"
      type: "Convolution"
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv5_2"
      top: "conv5_2"
      name: "relu5_2"
      type: "ReLU"
    }
    layer {
      bottom: "conv5_2"
      top: "conv5_3"
      name: "conv5_3"
      type: "Convolution"
      convolution_param {
        num_output: 512
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      bottom: "conv5_3"
      top: "conv5_3"
      name: "relu5_3"
      type: "ReLU"
    }
    layer {
      bottom: "conv5_3"
      top: "pool5"
      name: "pool5"
      type: "Pooling"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      bottom: "pool5"
      top: "fc6"
      name: "fc6"
      type: "InnerProduct"
      inner_product_param {
        num_output: 4096
      }
    }
    layer {
      bottom: "fc6"
      top: "fc6"
      name: "relu6"
      type: "ReLU"
    }
    layer {
      bottom: "fc6"
      top: "fc6"
      name: "drop6"
      type: "Dropout"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      bottom: "fc6"
      top: "fc7"
      name: "fc7"
      type: "InnerProduct"
      inner_product_param {
        num_output: 4096
      }
    }
    layer {
      bottom: "fc7"
      top: "fc7"
      name: "relu7"
      type: "ReLU"
    }
    layer {
      bottom: "fc7"
      top: "fc7"
      name: "drop7"
      type: "Dropout"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      bottom: "fc7"
      top: "fc8"
      name: "fc8"
      type: "InnerProduct"
      inner_product_param {
        num_output: 2622
      }
    }
    layer {
      bottom: "fc8"
      top: "prob"
      name: "prob"
      type: "Softmax"
    }
    

    =================================================================

    This concludes Part 3 of the Deep-Learning-Based Face Recognition System series (Caffe + OpenCV + Dlib): extracting Mat features with Caffe's MemoryData layer and the VGG network model. If anything in the code gives you trouble, just leave a comment under the post and we can work it out together.
