1. First, check whether the dataset contains any images that are not in JPG format.
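2. sub_batch = batch / subdivisions. The weights are not updated after each sub_batch; the update happens once, after the whole batch has been computed. This lowers the GPU memory requirement.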
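The accumulate-then-update idea in item 2 can be illustrated with a toy example. This is a minimal sketch, not darknet's code: the `toy_model` struct and all function names are hypothetical, and the "network" is a single weight fitted with squared error. It only shows that splitting a batch into sub-batches leaves the update unchanged while fewer samples are processed at a time.

```c
#include <stddef.h>

/* Hypothetical toy model: one weight w, prediction w * x, loss 0.5 * (w*x - y)^2. */
typedef struct {
    float weight;
    float grad_accum;   /* gradient summed over the sub-batches seen so far */
} toy_model;

/* Process one sub-batch: add its gradient to the accumulator, do NOT update the weight. */
void accumulate_sub_batch(toy_model *m, const float *x, const float *y, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float err = m->weight * x[i] - y[i];
        m->grad_accum += err * x[i];
    }
}

/* After the whole batch: apply a single averaged update and clear the accumulator. */
void apply_batch_update(toy_model *m, float learning_rate, size_t batch)
{
    m->weight -= learning_rate * m->grad_accum / (float)batch;
    m->grad_accum = 0.0f;
}

/* One iteration: `batch` samples processed as `subdivisions` sub-batches.
   Assumes batch is divisible by subdivisions. */
void train_one_batch(toy_model *m, const float *x, const float *y,
                     size_t batch, size_t subdivisions, float learning_rate)
{
    size_t sub_batch = batch / subdivisions;
    for (size_t s = 0; s < subdivisions; ++s) {
        accumulate_sub_batch(m, x + s * sub_batch, y + s * sub_batch, sub_batch);
    }
    apply_batch_update(m, learning_rate, batch);
}
```

With batch = 4, calling `train_one_batch` with subdivisions = 1 or subdivisions = 2 yields the same weight; only the number of samples held in flight at once differs.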
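3. At first the confidence threshold for the test set was set too low (0.25), so some low-confidence boxes (for example 0.25 and 0.37) were not suppressed. A value of around 0.5 is a common setting.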
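The effect of the threshold in item 3 can be sketched as a simple filter. The `box_det` struct and `filter_by_confidence` below are hypothetical names for illustration and are not darknet's own detection types; the sketch only shows why boxes at 0.25 and 0.37 survive a 0.25 threshold but not a 0.5 one.

```c
#include <stddef.h>

/* Hypothetical detection record (not darknet's `detection` struct). */
typedef struct {
    float x, y, w, h;     /* box geometry */
    float confidence;     /* detection confidence in [0, 1] */
} box_det;

/* Keep only boxes whose confidence is at least `thresh`.
   Compacts the array in place and returns the number of boxes kept. */
size_t filter_by_confidence(box_det *dets, size_t n, float thresh)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; ++i) {
        if (dets[i].confidence >= thresh) {
            dets[kept++] = dets[i];
        }
    }
    return kept;
}
```

For four boxes with confidences 0.25, 0.37, 0.81 and 0.62, a threshold of 0.25 keeps all four, while 0.5 keeps only the last two.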
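4. Understanding the command-line training call (the console command invokes the darknet framework; other frameworks are invoked in the same way):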
darknet.exe detector train data/img.data yolo-obj.cfg darknet53.conv.74 -map
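First, darknet.exe locates the program entry point, that is, the main function.
What follows is the argument list (argc, argv). argv[0] is the path of the running executable as it was invoked, and argv[1] == detector. The main function of darknet checks strcmp(argv[1], "detector") == 0; when this holds it calls run_detector(argc, argv). Inside run_detector the second argument, argv[2], is checked; for training, argv[2] == train, so train_detector() is called.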
For example, in darknet.exe detector train data/img.data yolo-obj.cfg darknet53.conv.74 -map the arguments map as follows:
data/img.data == datacfg
yolo-obj.cfg == cfg
darknet53.conv.74 == weights
-map == calc_map
int calc_map = find_arg(argc, argv, "-map");
if (0 == strcmp(argv[2], "test")) test_detector(datacfg, cfg, weights, filename, thresh, hier_thresh, dont_show, ext_output, save_labels, outfile, letter_box);
else if (0 == strcmp(argv[2], "train")) train_detector(datacfg, cfg, weights, gpus, ngpus, clear, dont_show, calc_map, mjpeg_port, show_imgs);
else if (0 == strcmp(argv[2], "valid")) validate_detector(datacfg, cfg, weights, outfile);
else if (0 == strcmp(argv[2], "recall")) validate_detector_recall(datacfg, cfg, weights);
else if (0 == strcmp(argv[2], "map")) validate_detector_map(datacfg, cfg, weights, thresh, iou_thresh, map_points, letter_box, NULL);
else if (0 == strcmp(argv[2], "calc_anchors")) calc_anchors(datacfg, num_of_clusters, width, height, show);
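The argument handling described above can be reproduced in isolation. The following is a minimal sketch, not the framework's implementation: `detector_args`, `has_flag` and `parse_detector_args` are hypothetical names, the positional indices argv[3] to argv[5] are taken from the order of the example command, and `has_flag` is a simplified stand-in for find_arg whose real behaviour may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the parsed command line. */
typedef struct {
    const char *mode;     /* argv[2]: "train", "test", "valid", ... */
    const char *datacfg;  /* argv[3]: e.g. data/img.data */
    const char *cfg;      /* argv[4]: e.g. yolo-obj.cfg */
    const char *weights;  /* argv[5]: e.g. darknet53.conv.74, NULL if absent */
    int calc_map;         /* 1 if "-map" appears anywhere on the command line */
} detector_args;

/* Simplified stand-in for find_arg: returns 1 if `flag` is present, else 0.
   It leaves argv untouched. */
int has_flag(int argc, char **argv, const char *flag)
{
    for (int i = 0; i < argc; ++i) {
        if (argv[i] != NULL && strcmp(argv[i], flag) == 0) return 1;
    }
    return 0;
}

/* Fill `out` from a "detector" command line.
   Returns 0 on success, -1 if the arguments do not match that shape. */
int parse_detector_args(int argc, char **argv, detector_args *out)
{
    if (argc < 5 || strcmp(argv[1], "detector") != 0) return -1;
    out->mode = argv[2];
    out->datacfg = argv[3];
    out->cfg = argv[4];
    /* Treat argv[5] as the weights file unless it looks like a flag. */
    out->weights = (argc > 5 && argv[5][0] != '-') ? argv[5] : NULL;
    out->calc_map = has_flag(argc, argv, "-map");
    return 0;
}
```

Applied to the example command, this yields mode = train, datacfg = data/img.data, cfg = yolo-obj.cfg, weights = darknet53.conv.74 and calc_map = 1.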
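When calc_map is set, training runs the following block, which handles the mAP calculation. The number of iterations between mAP calculations can be changed here: edit the code, recompile, then train again.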
int calc_map_for_each = 4 * train_images_num / (net.batch * net.subdivisions);  // calculate mAP for each 4 Epochs
calc_map_for_each = fmax(calc_map_for_each, 100);
int next_map_calc = iter_map + calc_map_for_each;
next_map_calc = fmax(next_map_calc, net.burn_in);
next_map_calc = fmax(next_map_calc, 400);
if (calc_map) {
    printf(" (next mAP calculation at %d iterations) ", next_map_calc);
    if (mean_average_precision > 0) printf(" Last accuracy mAP@0.5 = %2.2f %%, best = %2.2f %% ", mean_average_precision * 100, best_map * 100);
}
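To see what the arithmetic above produces for concrete numbers, the same steps can be wrapped in small functions. This is a sketch under assumptions: `max_int`, `map_interval` and `next_map_iteration` are hypothetical helpers, and `images_per_iteration` stands for the product net.batch * net.subdivisions from the excerpt.

```c
/* Larger of two ints; replaces fmax so the sketch does not depend on math.h. */
int max_int(int a, int b)
{
    return a > b ? a : b;
}

/* Iterations between two mAP calculations: `epochs_between_map` epochs,
   but never fewer than 100 iterations. */
int map_interval(int train_images_num, int images_per_iteration, int epochs_between_map)
{
    int interval = epochs_between_map * train_images_num / images_per_iteration;
    return max_int(interval, 100);
}

/* Iteration at which the next mAP calculation happens: not before the end
   of burn-in and not before iteration 400. */
int next_map_iteration(int iter_map, int interval, int burn_in)
{
    int next = iter_map + interval;
    next = max_int(next, burn_in);
    return max_int(next, 400);
}
```

For 2000 training images and 64 images per iteration the interval is 4 * 2000 / 64 = 125 iterations. Starting from iter_map = 0 with burn_in = 1000, the first calculation is pushed to iteration 1000; from iter_map = 1000 the next one falls at 1125. Changing the factor 4 (here `epochs_between_map`) is the edit that item 4 refers to.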
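5. The burn_in parameter
During the iterations before burn_in, the framework ramps the learning rate up from small to large; after that, the learning rate decreases.
For a small dataset, reduce burn_in somewhat to speed up convergence. A rough guide is 10 to 20 times imgs/batch_size.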
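The warm-up behaviour and the rule of thumb in item 5 can be sketched as follows. All names here are hypothetical, and the polynomial ramp base_lr * (iteration / burn_in)^power is an assumption about the shape of the warm-up, not something taken from these notes; the decay after burn_in depends on the configured policy and is left out, so the sketch simply returns the base rate there.

```c
/* x raised to a non-negative integer power, without relying on math.h. */
float int_pow(float x, int n)
{
    float r = 1.0f;
    for (int i = 0; i < n; ++i) r *= x;
    return r;
}

/* Warm-up learning rate (assumed polynomial ramp).
   Before burn_in: base_lr * (iteration / burn_in)^power, rising from 0.
   From burn_in onward: base_lr (the later decay schedule is not modelled). */
float burn_in_lr(float base_lr, int iteration, int burn_in, int power)
{
    if (burn_in > 0 && iteration < burn_in) {
        return base_lr * int_pow((float)iteration / (float)burn_in, power);
    }
    return base_lr;
}

/* Rule of thumb from the notes: burn_in of 10 to 20 times imgs / batch_size. */
void suggested_burn_in(int num_images, int batch_size, int *low, int *high)
{
    int iters_per_epoch = num_images / batch_size;
    *low = 10 * iters_per_epoch;
    *high = 20 * iters_per_epoch;
}
```

For 640 images and a batch size of 64, `suggested_burn_in` gives a range of 100 to 200 iterations. With burn_in = 1000 and power = 4, the rate at iteration 500 is still well below the rate at iteration 900, which is why a long burn_in slows early progress on a small dataset.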