Caffe and py-faster-rcnn: day-to-day usage notes

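A running list of problems I hit in day-to-day use, with their fixes. Topics include:
```
{
    confusing points in Caffe and their explanations;
    cases where train/inference fails to run;
    small tweaks to Caffe's basic tools, such as plotting loss curves;
    tips for debugging Python code with VS Code;
    tips for tuning py-faster-rcnn on your own dataset;
    py-faster-rcnn pitfalls caused by the numpy version, custom datasets and so on, with fixes;
    pitfalls in py-faster-rcnn's own details;
    tips for debugging matcaffe;
    protobuf version pitfalls;
    ...
}
```
Updated continuously.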

last update: 2018.05.13 18:59 GTC

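Accuracy layer and Softmax layer in the test/deploy phase

Take AlexNet as an example. Why does the Accuracy layer take the fc8 and label blobs as input, while at deploy time the fc8 blob goes through a Softmax layer before being output as the result?

Explanation: fc8 gives a qualitative comparison. It holds the network's relative confidence for each class (a "relative probability").
Softmax normalizes these scores so that they are non-negative and sum to 1, i.e. they form a proper probability distribution.
Computing accuracy only requires the class with the highest confidence, found with argmax. Softmax preserves the ordering of the scores, so the strict probabilities (the prob blob produced by softmax) are unnecessary and the relative confidence vector (fc8) is enough.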
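The argument above can be checked numerically. The following is a minimal sketch using NumPy rather than Caffe itself; the fc8 values are made up for illustration.

```python
import numpy as np

def softmax(x):
    # Subtracting the max is for numerical stability and does not change the result.
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Hypothetical fc8 output for a 5-class problem (made-up numbers).
fc8 = np.array([1.2, -0.3, 4.5, 0.0, 2.1])
prob = softmax(fc8)

# Softmax is monotonic, so the top-1 class is the same before and after it.
top1_from_fc8 = int(np.argmax(fc8))
top1_from_prob = int(np.argmax(prob))
print(top1_from_fc8, top1_from_prob, prob.sum())
```

Both argmax calls pick the same class, which is why the Accuracy layer can work directly on fc8.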

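Parsing py-faster-rcnn training logs and plotting loss curves

1. Prepare the files

Copy the following from caffe/tools/extra:
```
parse_log.sh
extract_seconds.py
plot_training_log.py.example (drop the .example suffix after copying)
```
into the directory that holds the training log.

2. Modify the code

extract_seconds.py, line 38: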
```
        #if line.find('Solving') != -1:
        if line.find('Initializing') != -1:
```
parse_log.sh, line 29:
```
# grep '] Solving ' $1 > aux3.txt
grep 'net.cpp:' $1 > aux3.txt
```
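With these changes, ordinary Caffe logs still parse correctly, and py-faster-rcnn training logs parse as well instead of failing with 'Start time not found'.

3. Customization

For example, the network has several losses and each needs its own plot. Modify the part of parse_log.sh that extracts the loss and other values with grep and awk; by default it only extracts the training loss. Mine looks like this: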
```
# grep '] Solving ' $1 > aux.txt
grep 'Initializing solver' $1 > aux.txt
grep ', loss = ' $1 >> aux.txt
grep 'Iteration ' aux.txt | sed  's/.*Iteration \([[:digit:]]*\).*/\1/g' > aux0.txt
grep ', loss = ' $1 | awk '{print $9}' > aux1.txt
grep ', lr = ' $1 | awk '{print $9}' > aux2.txt
## extract the 4 additional losses
grep ' loss_bbox = ' $1 | awk '{print $11}' > aux4.txt
grep ' loss_cls = ' $1 | awk '{print $11}' > aux5.txt
grep ' rpn_cls_loss = ' $1 | awk '{print $11}' > aux6.txt
grep ' rpn_loss_bbox = ' $1 | awk '{print $11}' > aux7.txt

# Extracting elapsed seconds
$DIR/extract_seconds.py aux.txt aux3.txt

# Generating. Add column names for the new columns
echo '#Iters Seconds TrainingLoss loss_bbox loss_cls rpn_cls_loss rpn_loss_bbox LearningRate'> $LOG.train
# paste aux0.txt aux3.txt aux1.txt aux2.txt | column -t >> $LOG.train
paste aux0.txt aux3.txt aux1.txt aux4.txt aux5.txt aux6.txt aux7.txt aux2.txt | column -t >> $LOG.train
rm aux.txt aux0.txt aux1.txt aux2.txt  aux3.txt aux4.txt aux5.txt aux6.txt aux7.txt
```
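Then modify plot_training_log.py. Its main job is to call parse_log.sh, which extracts the data into xxxx.log.train, where xxxx.log is the training log you specify. The rest of the script felt redundant to me, so I dropped it and did the plotting myself. Changed to: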
```
#!/usr/bin/env python
# coding:utf-8

"""
The default template is overly complicated. This is a simpler version:
1. Call the shell script, which uses grep and friends to produce xxx.log.train,
   a file with all the columns already extracted
2. Plot with matplotlib/ggplot/visdom or a similar tool
"""
import inspect
import os
import matplotlib.pyplot as plt

def get_parsed_data(log_fname):
    """
    Run the shell script that parses xx.log into xx.log.train, then read
    xx.log.train and return its column names and columns.
    parse_frcnn_log.sh and xx.log must be in the same directory.
    """
    os.system('%s %s' % (get_log_parsing_script(), log_fname))

    # Read the data from xxx.log.train
    data_file_name = log_fname + '.train'
    with open(data_file_name) as fin:
        lines = [line.rstrip() for line in fin]
    column_names = lines[0].split()
    n = len(column_names)
    data = [[] for _ in range(n)]
    for line in lines[1:]:
        t = line.split()
        for i in range(n):
            # Convert to float so matplotlib plots numbers, not category labels
            data[i].append(float(t[i]))

    return column_names, data


def get_log_parsing_script():
    dirname = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
    return dirname + '/parse_frcnn_log.sh'


def plot_frcnn_data(log_fname):
    column_names, data = get_parsed_data(log_fname)

    # My own plotting code: the 5 loss-vs-iteration curves share one figure.
    # You can also plot them separately.
    # ax = plt.subplot2grid(shape, loc)
    ax0 = plt.subplot2grid((2,4), (0,0), colspan=2, rowspan=2)
    ax1 = plt.subplot2grid((2,4), (0,2))
    ax2 = plt.subplot2grid((2,4), (0,3))
    ax3 = plt.subplot2grid((2,4), (1,2))
    ax4 = plt.subplot2grid((2,4), (1,3))

    ax0.plot(data[0], data[2])

    ax1.plot(data[0], data[3])
    ax1.set_title(column_names[3]+' vs. Iters')

    ax2.plot(data[0], data[4])
    ax2.set_title(column_names[4]+' vs. Iters')

    ax3.plot(data[0], data[5])
    ax3.set_title(column_names[5]+' vs. Iters')

    ax4.plot(data[0], data[6])
    ax4.set_title(column_names[6]+' vs. Iters')

    plt.show()

if __name__ == '__main__':
    log_fname = 'faster_rcnn_end2end_VGG16_.txt.2018-03-14_01-12-12.log'
    plot_frcnn_data(log_fname)
```
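As an alternative to the grep/awk pipeline, the same values can be pulled out of the log in pure Python. This is only a sketch, not part of Caffe's tools: the function name parse_losses and the two regular expressions are my own, written against the log line format shown in the solver output later in these notes.

```python
import re

# Matches e.g. "... solver.cpp:229] Iteration 0, loss = 16.1743"
ITER_RE = re.compile(r'Iteration (\d+), loss = (\S+)')
# Matches e.g. "... Train net output #0: loss_bbox = 0.169268 (* 1 = ...)"
OUTPUT_RE = re.compile(r'Train net output #\d+: (\w+) = (\S+)')

def parse_losses(lines):
    """Return a list of dicts, one per logged iteration (hypothetical helper)."""
    rows = []
    for line in lines:
        m = ITER_RE.search(line)
        if m:
            rows.append({'iter': int(m.group(1)), 'loss': float(m.group(2))})
            continue
        m = OUTPUT_RE.search(line)
        if m and rows:
            # Attach each named loss to the most recent "Iteration N" row.
            rows[-1][m.group(1)] = float(m.group(2))
    return rows

sample = [
    'I0408 14:59:36.903856 16481 solver.cpp:229] Iteration 0, loss = 16.1743',
    'I0408 14:59:36.903894 16481 solver.cpp:245]     Train net output #0: loss_bbox = 0.169268 (* 1 = 0.169268 loss)',
    'I0408 14:59:36.903903 16481 solver.cpp:245]     Train net output #1: loss_cls = 17.8601 (* 1 = 17.8601 loss)',
    'I0408 14:59:36.903920 16481 sgd_solver.cpp:106] Iteration 0, lr = 0.001',
]
rows = parse_losses(sample)
print(rows)
```

Keying each loss by name avoids depending on awk column positions, which shift if the log prefix changes.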
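Plotting the accuracy curve on the validation set

Tutorials for classification tasks all tell you to plot an accuracy-iteration curve on the validation set: if it keeps rising, training is going well, and the curve should not fluctuate much.
For object detection, however, both the papers by well-known researchers and the beginner tutorials seem to deliberately skip this debugging method. I trained an RPN network, saved a model every 500 iterations, evaluated each on the validation set to get AP (Average Precision), and plotted the AP-Iteration curve.
What I actually saw: the curve fluctuates a lot. AP can be 16% at one checkpoint and drop to 8% at the next, and this happens on both my own dataset and VOC2007. My guess is that the curves look too ugly, so nobody includes them in published work.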

Error when using Caffe's built-in network drawing tool

Command: python draw_net.py ~/work/caffe/models/bvlc_googlenet/train_val.prototxt ~/work/caffe/models/bvlc_googlenet/train_val.png --rankdir=LR

Error: 'google.protobuf.pyext._message.RepeatedScalarConta' object has no attribute '_values'

Fix: remove the "._values" on lines 94, 96 and 98 of python/caffe/draw.py. The code only seems to need the length of the array, and it worked for me after removing them.
ref: https://github.com/NVIDIA/DIGITS/issues/591

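During training: check failure stack trace

Some blog posts attribute this to a wrong path.
My fix: do not launch the Python code from a shell script; run the Python training script directly. (I hit this while modifying py-faster-rcnn. Whether it applies to training with the C++ tools, I do not know.)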

Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs 0)

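I hit this message at the end of network loading while helping a friend (A-Mao) run the code of RCF, an edge detection algorithm, on Linux Mint. Note that the error code is 1 rather than 11, which is the less common case. Mint is a distribution derived from Ubuntu.
Fix:
```
rm -rf ~/.nv
```
Then run the code again.

Reference: BVLC/caffe#5564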

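Errors when importing caffe from pycaffe

Many people set the PYTHONPATH environment variable. I do not recommend it. Many Caffe-based projects ship their own modified copy of Caffe, so you end up with several versions of Caffe on one system. With PYTHONPATH, how do you tell them apart? Conflicts are easy to run into.
Fix: at the top of any script that needs to import caffe:
```
pycaffe_dir = '/home/chris/work/caffe-BVLC/python'
import sys
sys.path.insert(0, pycaffe_dir)
```
Or simply put this in a dedicated file for setting up the pycaffe path, init_paths.py:
```
pycaffe_dir = '/home/chris/work/caffe-BVLC/python'
import sys
sys.path.insert(0, pycaffe_dir)
```
Then, in every script that needs caffe, import init_paths before importing caffe.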
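A slightly more defensive variant of init_paths.py could look like the sketch below. The helper name add_pycaffe_path is hypothetical; it only wraps the sys.path.insert call above, fails early when the directory does not exist, and avoids duplicate entries.

```python
import os
import sys

def add_pycaffe_path(pycaffe_dir):
    """Put pycaffe_dir at the front of sys.path (hypothetical helper)."""
    pycaffe_dir = os.path.abspath(pycaffe_dir)
    if not os.path.isdir(pycaffe_dir):
        raise IOError('pycaffe directory not found: %s' % pycaffe_dir)
    # Remove earlier occurrences so this copy of caffe wins over any other.
    while pycaffe_dir in sys.path:
        sys.path.remove(pycaffe_dir)
    sys.path.insert(0, pycaffe_dir)
    return pycaffe_dir

# Usage, with the path from the notes above:
# add_pycaffe_path('/home/chris/work/caffe-BVLC/python')
# import caffe
```

Failing early on a wrong path makes it obvious which copy of Caffe a project is supposed to pick up.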

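Debugging Python code in VS Code

This is not limited to pycaffe; data processing and similar work done in Python also needs debugging. First install the Python extension.

Official Python debugging documentation: https://code.visualstudio.com/docs/python/debugging

Debugging a Python script that takes command-line arguments

In the Python debug configuration file launch.json, add the args field with the values you need.

The script uses relative paths but lives in a subdirectory rather than at the top level of the opened folder, so paths are not found when debugging.

Change the cwd field in launch.json. cwd stands for current working directory.

A concrete example:
```
...
    "version": "0.2.0",
    "configurations": [

        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "stopOnEntry": true,
            "pythonPath": "${config:python.pythonPath}",  
            "program": "${file}",
            "cwd": "${workspaceFolder}/vis/plot_loss",      // set the debugger's working directory to the vis/plot_loss subdirectory
            "env": {},
            "envFile": "${workspaceFolder}/.env",
            "debugOptions": [
                "RedirectOutput"
            ],
            "args": [       // program arguments; split a "--key val" pair into "--key", "val"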
                "6",
                "loss_vs_iter.png",
                "coolnet.log"
            ]
        },
...
```
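To see how the args list reaches the program, here is a sketch of a script that would accept those three values. The argument names and the --dpi option are assumptions made for illustration; the notes above do not show the real script's interface.

```python
import argparse

def parse_args(argv=None):
    # Hypothetical interface, chosen to match the three positional values in launch.json.
    parser = argparse.ArgumentParser(description='Plot a loss curve from a training log')
    parser.add_argument('chart_type', type=int)
    parser.add_argument('output_png')
    parser.add_argument('log_file')
    # An optional "--key val" pair arrives as two separate list elements.
    parser.add_argument('--dpi', type=int, default=100)
    return parser.parse_args(argv)

# Equivalent to "args": ["6", "loss_vs_iter.png", "coolnet.log"] in launch.json
args = parse_args(['6', 'loss_vs_iter.png', 'coolnet.log'])
# Equivalent to "args": ["--dpi", "150", "6", "loss_vs_iter.png", "coolnet.log"]
args_with_option = parse_args(['--dpi', '150', '6', 'loss_vs_iter.png', 'coolnet.log'])
print(args, args_with_option)
```

Each element of the JSON list becomes one entry of sys.argv, which is why "--key val" has to be split.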
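Turning off log output in the terminal

When running inference with an already trained caffemodel, I do not want the log messages, only the results I care about. For example, when running the faster-rcnn test I only want the AP over the whole dataset, not all the network construction messages.

Fix for pycaffe: in the import section of the script being run, before the first import of caffe (direct or indirect), use these three lines: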
```
import os
os.environ['GLOG_minloglevel'] = '3'
import caffe
```
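An indirect import of caffe is, for example, import fast_rcnn.test_net, where fast_rcnn/test_net.py itself contains import caffe. GLOG_minloglevel follows glog's severity levels (0 INFO, 1 WARNING, 2 ERROR, 3 FATAL), so '3' keeps only fatal messages.

Fix for C++: I have not tried it; a write-up by another user: http://blog.csdn.net/xiaoheiblack/article/details/54969642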

AttributeError: 'module' object has no attribute 'text_format'

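Adding the line import google.protobuf.text_format to ./lib/fast_rcnn/train.py fixes the problem.
The cause is that the protobuf pip package was upgraded to 2.6; version 2.5 did not need this explicit import. Alternatively, pin the protobuf pip package to 2.5: sudo pip install protobuf==2.5.0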

Default values of common parameters in common Caffe layers

lr_mult: 1
decay_mult: 1
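These multipliers scale the solver's global settings per parameter blob: the effective learning rate is base_lr * lr_mult and the effective weight decay is weight_decay * decay_mult. A minimal sketch of that arithmetic, with made-up solver values and a hypothetical helper name:

```python
def effective_rates(base_lr, weight_decay, lr_mult=1.0, decay_mult=1.0):
    """Per-blob learning rate and weight decay after applying the multipliers."""
    return base_lr * lr_mult, weight_decay * decay_mult

# Made-up solver values for illustration.
default_rates = effective_rates(0.001, 0.0005)            # both multipliers left at 1
bias_rates = effective_rates(0.001, 0.0005, 2.0, 0.0)     # a setting often used for bias blobs
frozen_rates = effective_rates(0.001, 0.0005, 0.0, 0.0)   # lr_mult 0 freezes the blob
print(default_rates, bias_rates, frozen_rates)
```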

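Test accuracy suddenly drops to 0

I was training an RPN detection network on my own dataset. After training finished, I ran a test (computing AP) on the caffemodel saved every 500 iterations. For the first 6000 iterations (2 epochs) AP stayed between 5% and 20%; at iteration 6500 it suddenly dropped to 0. The learning rate was never lowered during training.

Cause: exploding gradients. Some iteration produced an abnormally large gradient, with a correspondingly abnormal loss value.

Fix: increase the effective batch size to smooth the gradients, by adding iter_size: 2 to the solver. Others mention gradient clipping (the solver's clip_gradients option), but a good threshold seems hard to choose, so I did not pursue it.

Reference: https://stats.stackexchange.com/questions/255105/why-is-the-validation-accuracy-fluctuating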
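For reference, gradient clipping by global L2 norm can be sketched as below. This is a NumPy illustration of the general technique, which as far as I know is what the solver option does; it is not Caffe's actual code, and the function name and numbers are made up.

```python
import numpy as np

def clip_gradients(grads, clip_value):
    """Rescale all gradients together if their global L2 norm exceeds clip_value."""
    l2norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if l2norm > clip_value:
        scale = clip_value / l2norm
        grads = [g * scale for g in grads]
    return grads, l2norm

# Two made-up gradient blobs whose global norm is sqrt(9 + 16 + 144) = 13.
grads = [np.array([3.0, 4.0]), np.array([12.0])]
clipped, norm = clip_gradients(grads, clip_value=6.5)
untouched, _ = clip_gradients(grads, clip_value=100.0)
print(norm, clipped)
```

The difficulty mentioned above is visible here: clip_value has to be chosen relative to the typical gradient norm of the specific network, which is usually only known after logging it for a while.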

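Adjusting the size of the reference windows

With the anchor mechanism, the scales and aspect ratios specified for the anchor boxes generate reference windows on rpn_conv; multiplying these windows by base_size enlarges (maps) them to reference windows on the network input image.
When the objects to detect are small (say 26 pixels on average) but the anchor scales are large (using [8, 16, 32]), the reference windows are large (considering only the 1:1 square, with base size = 16 the window sizes are [128, 256, 512]), so their IoU with the gt_boxes is either 0 or as small as about 0.01. Such an overlap is far below the preset 0.7 threshold and does not make a good positive sample. However, the RPN also takes, for each gt_box, the reference window with the highest overlap as a positive sample: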
```
# fg label: for each gt, anchor with highest overlap
labels[gt_argmax_overlaps] = 1
```
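In other words, a reference window whose IoU with the gt is only 0.01 still gets pulled in as a positive sample. It is picking the tallest among the short: clearly Garbage In, Garbage Out, and it cannot lead to a good training result.
In this situation, even though the Debug mode of anchor_target_layer.py shows a positive-to-negative ratio close to 1:1, the "positive" samples may in fact all be negatives.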
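The mismatch is easy to quantify. The sketch below computes the best-case IoU between a square anchor and a square 26-pixel object, assuming their centers coincide (real anchors sit on a stride grid, so actual values are lower). The helper name and the smaller scales (1, 2, 4) are my own choices for comparison, not settings from py-faster-rcnn.

```python
def centered_square_iou(side_a, side_b):
    """IoU of two squares sharing the same center (best case for alignment)."""
    inter = min(side_a, side_b) ** 2
    union = side_a ** 2 + side_b ** 2 - inter
    return float(inter) / union

base_size = 16
gt_side = 26  # average object size mentioned above

ious = {}
for scale in (8, 16, 32, 1, 2, 4):
    anchor_side = base_size * scale
    ious[anchor_side] = centered_square_iou(anchor_side, gt_side)
    print(anchor_side, ious[anchor_side])
```

Printing the table for a few candidate scales is a quick way to check whether any anchor can reach a reasonable overlap with the typical object size before starting a long training run.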

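Fixing "KeyError: 'max_overlaps'" when training faster-rcnn

There are several possible causes; the most likely is a stale cache file, i.e. the annotations used for training were updated, or a different dataset was used for training.
Empty the py-faster-rcnn/data/cache directory, and empty py-faster-rcnn/data/VOCdevkit2007/annotations_cache.

Sometimes this cache mechanism feels counterproductive, since it hides errors.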
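Since this has to be done every time the annotations change, it can be scripted. A minimal sketch, assuming it is run from the py-faster-rcnn checkout; the function clear_cache_dirs is a hypothetical helper, not part of py-faster-rcnn.

```python
import os
import shutil

def clear_cache_dirs(cache_dirs):
    """Delete everything inside each existing directory, keeping the directory itself."""
    removed = []
    for d in cache_dirs:
        if not os.path.isdir(d):
            continue  # nothing cached yet
        for name in os.listdir(d):
            path = os.path.join(d, name)
            if os.path.isdir(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
            removed.append(path)
    return removed

# Paths taken from the notes above, relative to the py-faster-rcnn checkout:
# clear_cache_dirs(['data/cache', 'data/VOCdevkit2007/annotations_cache'])
```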

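Removing sources of randomness so that the same configuration gives the same training result

Strict bit-for-bit reproducibility is unnecessary; here it is enough that the logged loss values agree to 2 decimal places between runs.
The motivation: when training on my own dataset for tens of thousands of iterations, the resulting AP is fairly stable, but it takes a long time. With only a few thousand iterations, e.g. 3 epochs or 9000 iterations, testing on the full test set every 500 iterations and plotting the AP-Iteration curve, the curve turns out to be quite unstable: AP at the same iteration can differ by 10 points between runs.
To avoid this, fix the order in which samples appear and fix the model used to initialize the network, which removes most of the variation.

Fixing the sample order:

Two places in lib/roi_data_layer/layer.py: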
```
        # self._perm = np.random.permutation(np.arange(len(self._roidb)))
        self._perm = np.arange(len(self._roidb))
```
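Fixing the network initialization model:

copy_from() copies weights from layers with the same name. Layers that are missing from the pretrained model are initialized instead; each initialization draws from the same distribution, but the actual values differ.
Create an RPN/FRCNN network, save it immediately without training, and use that saved model for initialization from then on.
mk_init_model.py: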
```
# coding:utf-8
import _init_paths
import caffe

from fast_rcnn.config import cfg, cfg_from_file

if __name__ == '__main__':
    cfg_file = 'experiments/cfgs/myconfig.yml'  
    cfg_from_file(cfg_file)   # important: without it the net cannot even be built, since creating the first (data) layer depends on various cfg values
    
    prototxt = 'models/mydb/rpn/zf/train.pt'
    pretrained_model = 'data/imagenet_models/ZF.v2.caffemodel'
    save_pth = 'data/imagenet_models/rpn_zf_init.caffemodel'
    
    net = caffe.Net(prototxt, caffe.TRAIN)
    net.copy_from(pretrained_model)
    net.save(save_pth)
```
TypeError: 'numpy.float64' object cannot be interpreted as an index

Caused by a newer numpy version. I prefer fixing the code to downgrading numpy. (The casts below use int: np.int was only an alias of the built-in int and has been removed from recent NumPy releases.)
1. /home/xxx/py-faster-rcnn/lib/roi_data_layer/minibatch.py
```
Line 26: fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
Change to: fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(int)
```
2. /home/xxx/py-faster-rcnn/lib/datasets/ds_utils.py
```
Line 12: hashes = np.round(boxes * scale).dot(v)
Change to: hashes = np.round(boxes * scale).dot(v).astype(int)
```
3. /home/xxx/py-faster-rcnn/lib/fast_rcnn/test.py
```
Line 129: hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v)
Change to: hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v).astype(int)
```
4. /home/xxx/py-faster-rcnn/lib/rpn/proposal_target_layer.py
```
Line 60: fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
Change to: fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(int)
```
TypeError: slice indices must be integers or None or have an index method

Again a numpy version issue, and again I recommend fixing the code:

Edit /home/lzx/py-faster-rcnn/lib/rpn/proposal_target_layer.py and go to line 123. Original content:
```
    for ind in inds:
        cls = clss[ind]
        start = 4 * cls
        end = start + 4
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights
```
Change it to:
```
    for ind in inds:
        ind = int(ind)
        cls = clss[ind]
        start = int(4 * cls)
        end = int(start + 4)
        bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
        bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
    return bbox_targets, bbox_inside_weights
```
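Both TypeErrors come from the same root cause: a NumPy float ends up where an integer index, size or slice bound is expected. A small standalone sketch of the pattern and the fix, with made-up values standing in for cfg.TRAIN.FG_FRACTION and the class ids:

```python
import numpy as np

rois_per_image = 128
fg_fraction = 0.25  # stands in for cfg.TRAIN.FG_FRACTION

# np.round returns a NumPy float, which recent NumPy refuses to use as a slice bound.
fg_float = np.round(fg_fraction * rois_per_image)
fg_int = int(fg_float)

inds = np.arange(200)
picked = inds[:fg_int]  # fine: the slice bound is a Python int

# Same idea for the slice in proposal_target_layer.py: class ids stored as floats.
cls = np.float64(3.0)   # made-up class id
start = int(4 * cls)
end = start + 4
row = np.zeros(84)
row[start:end] = 1.0
print(fg_int, picked.shape, start, end)
```

Casting at the point where the value is computed, rather than at every use, keeps the change small.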
ValueError: operands could not be broadcast together with shapes (84,1024) (8,1)

Full error location:
```
File "/home/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 73, in snapshot  
    self.bbox_stds[:, np.newaxis])  
ValueError: operands could not be broadcast together with shapes (84,1024) (8,1)
```
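Symptom: the error is raised when saving the caffemodel file.
Fix: the number of classes in train.prototxt had not been updated. 84 = 4 × 21 corresponds to the original 20-class Pascal VOC setup (plus background), and 8 = 4 × 2 corresponds to my own single-class dataset (e.g. pedestrian or face detection) plus background.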
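The shapes in the message can be reproduced with NumPy alone. In this sketch the 1024 columns and the class counts come from the error above; the helper bbox_pred_rows is hypothetical and only encodes the "4 coordinates per class" rule.

```python
import numpy as np

def bbox_pred_rows(num_classes):
    """Rows of the bbox_pred weight blob: 4 box coordinates per class, background included."""
    return 4 * num_classes

weights = np.zeros((bbox_pred_rows(21), 1024))  # prototxt still set up for 21 classes
stds = np.ones(bbox_pred_rows(2))               # dataset has 2 classes (1 object class + background)

try:
    weights * stds[:, np.newaxis]
    failed = False
except ValueError as e:
    failed = True
    print(e)
```

Once the prototxt uses the same class count as the dataset, both arrays have 8 rows and the multiplication in snapshot() broadcasts as intended.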

bbox_transform.py:48: RuntimeWarning: overflow encountered in exp

Full output:
```
I0408 14:59:36.903856 16481 solver.cpp:229] Iteration 0, loss = 16.1743
I0408 14:59:36.903894 16481 solver.cpp:245]     Train net output #0: loss_bbox = 0.169268 (* 1 = 0.169268 loss)
I0408 14:59:36.903903 16481 solver.cpp:245]     Train net output #1: loss_cls = 17.8601 (* 1 = 17.8601 loss)
I0408 14:59:36.903909 16481 solver.cpp:245]     Train net output #2: rpn_cls_loss = 7.35363 (* 1 = 7.35363 loss)
I0408 14:59:36.903914 16481 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.271347 (* 1 = 0.271347 loss)
I0408 14:59:36.903920 16481 sgd_solver.cpp:106] Iteration 0, lr = 0.001
/opt/shiyan/py-faster-rcnn/lib/fast_rcnn/bbox_transform.py:58: RuntimeWarning: overflow encountered in exp
  pred_w = np.exp(dw) * widths[:, np.newaxis]
/opt/shiyan/py-faster-rcnn/lib/fast_rcnn/bbox_transform.py:58: RuntimeWarning: overflow encountered in multiply
  pred_w = np.exp(dw) * widths[:, np.newaxis]
/opt/shiyan/py-faster-rcnn/lib/fast_rcnn/bbox_transform.py:59: RuntimeWarning: overflow encountered in exp
  pred_h = np.exp(dh) * heights[:, np.newaxis]
/opt/shiyan/py-faster-rcnn/lib/fast_rcnn/bbox_transform.py:59: RuntimeWarning: overflow encountered in multiply
  pred_h = np.exp(dh) * heights[:, np.newaxis]
('anchor_target_layer: avg_num_pos=', 9, 'avg_num_neg=', 246)
('proposal_target_layer: avg_num_pos=', 8, 'avg_num_pos=', 67, 'ratio: 0.132')
[1]    16481 floating point exception  tools/train_net.py --gpu 0 --solver models/bdcirpn/frcnn/res101_np/solver.pt 
```
Following the related GitHub issue, I used the suggestion from meetshah1995's comment, namely:

Add to config.py:
```
__C.BBOX_XFORM_CLIP = np.log(1000. / 16.)
```
and add these lines just before pred_ctr_x is computed in bbox_transform.py:
```
    # Prevent sending too large values into np.exp()
    dw = np.minimum(dw, cfg.BBOX_XFORM_CLIP)
    dh = np.minimum(dh, cfg.BBOX_XFORM_CLIP)
```
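The effect of the clip can be seen in isolation. This is a standalone NumPy sketch with made-up deltas and widths, not the actual bbox_transform.py code:

```python
import warnings
import numpy as np

BBOX_XFORM_CLIP = np.log(1000. / 16.)

dw = np.array([0.1, 5.0, 800.0])        # made-up deltas; the last one overflows exp()
widths = np.array([32.0, 32.0, 32.0])

with warnings.catch_warnings():
    warnings.simplefilter('ignore')     # silence the overflow warning for this demo
    unclipped = np.exp(dw) * widths     # last entry becomes inf

# With the clip, exp() never sees a value above log(1000/16),
# so a predicted width is at most 1000/16 times the anchor width.
clipped = np.exp(np.minimum(dw, BBOX_XFORM_CLIP)) * widths
print(unclipped, clipped)
```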
Re-running training after this change, the output became:
```
I0408 15:02:49.634021 17637 solver.cpp:229] Iteration 0, loss = 16.4431
I0408 15:02:49.634066 17637 solver.cpp:245]     Train net output #0: loss_bbox = 0.172947 (* 1 = 0.172947 loss)
I0408 15:02:49.634074 17637 solver.cpp:245]     Train net output #1: loss_cls = 18.3941 (* 1 = 18.3941 loss)
I0408 15:02:49.634080 17637 solver.cpp:245]     Train net output #2: rpn_cls_loss = 7.35363 (* 1 = 7.35363 loss)
I0408 15:02:49.634088 17637 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.271347 (* 1 = 0.271347 loss)
I0408 15:02:49.634109 17637 sgd_solver.cpp:106] Iteration 0, lr = 0.001
[1]    17637 floating point exception  tools/train_net.py --gpu 0 --solver models/bdcirpn/frcnn/res101_np/solver.pt 
```
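After some debugging I found the problem: the set of proposals was empty. A big pitfall: with a pile of proposals whose side length is 1, nothing can survive _filter_boxes(). Fix: replace the original: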
```
# proposal_layer.py
        keep = _filter_boxes(proposals, min_size * im_info[2])
        proposals = proposals[keep, :]
        scores = scores[keep]
```
with:
```
# proposal_layer.py
        keep = _filter_boxes(proposals, min_size * im_info[2])
        if len(keep)!=0:
            proposals = proposals[keep, :]
            scores = scores[keep]
```
However, although training no longer aborts, the loss goes straight to nan, so the change above is clearly not enough by itself. Fix: lower the learning rate.

"/usr/bin/ld: cannot find -lboost_python"

I hit this when compiling Caffe.
Fix:
```
sudo apt install libboost-all-dev
```
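MatCaffe fails to create the network and crashes for no apparent reason

This usually comes up when debugging an old version of matcaffe. For example, you like the algorithm from some paper and its authors released code based on matcaffe, but the Caffe version is very old and the code is full of bugs: hard-coded paths that the README never mentions, incompatibilities in the network prototxt files, and so on.
Here is my debugging trick. If matcaffe fails as soon as the network is created, try building the same network through the Python interface. The error output then shows up in the terminal, whereas MATLAB shows nothing except a dialog saying a problem occurred and asking whether to send a report.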
```
# fuck_matcaffe.py
import sys, os
sys.path.insert(0, '/opt/work/caffe-BVLC-cvprw15')
import caffe

prototxt = 'deploy.prototxt'
caffemodel = 'mynet.caffemodel'
net = caffe.Net(prototxt, caffemodel, caffe.TEST)
```
Run:
```
python fuck_matcaffe.py
```
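Building Caffe with CMake

I will skip the rant about official Caffe, which cannot even get its installation procedure right.
On Ubuntu 16.04, when building with CMake, edit CAFFE_ROOT/cmake/Dependencies.cmake to add regex to the Boost components: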
```
find_package(Boost 1.54 REQUIRED COMPONENTS system thread filesystem regex)
```
Otherwise the build stops at 92% with:

../lib/libcaffe.so.1.0.0: undefined reference to `boost::re_detail::cpp_regex_traits_implementation::transform_primary(char const, char const) const'

and, if needed, add the path of your self-built OpenCV:
```
list(APPEND CMAKE_PREFIX_PATH "/opt/usr/opencv-git-3.4")
```
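Generating a prototxt with PyCaffe, but decimals like 0.01 come out inexact?

When generating a prototxt with pycaffe I used floating-point values, e.g. initializing conv1 from a Gaussian with std=0.01. In the generated prototxt, 0.01 had become an "inexact" number like 0.009999999235. This happens with protobuf 3.5.2 and goes away with protobuf 3.3.0 (which prints an exact 0.01). So sudo pip install protobuf==3.3.0 would seem to solve it?

I tried something else: load a prototxt file and save it as a new file. The original prototxt has normal values like 0.01, yet the re-saved prototxt is still inexact. Cause: a mismatch between protobuf's front end and back end.

protobuf itself is written in C++. The protobuf installed by pip is the Python interface. That interface can use either a pure-Python implementation or a C++ implementation of the data types, and C++ is the default back end. So switching the back end to python fixes it. How? Through an environment variable: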
```
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
python my_code.py
```
See my comment in the official issue.
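The underlying reason such digits can appear at all is that these prototxt fields are stored as 32-bit floats, and 0.01 has no exact 32-bit representation; the back ends presumably differ only in how many digits they print. A minimal sketch using the standard library only (the helper as_float32 is my own):

```python
import struct

def as_float32(x):
    """Round-trip a Python float through a 4-byte IEEE 754 float."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

std = as_float32(0.01)
short_form = '%.6g' % std   # few significant digits: the rounding error is hidden
long_form = '%.12g' % std   # many significant digits: the rounding error shows up
print(repr(std), short_form, long_form)
```

So a printer that emits only as many digits as a float32 can carry shows 0.01, while one that prints more digits exposes the stored value.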
Original post: https://www.cnblogs.com/zjutzz/p/8577231.html
