• darknet-yolov3 burn_in learning_rate policy


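    In darknet-yolov3, learning_rate is a hyperparameter; tuning it can help the model converge to a better state.

    In the cfg file it appears as shown in the figure below:

    The value I set here is arbitrary.

    Next, a few words about burn_in and policy.

    In the code, the two are handled as follows: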

    float get_current_rate(network *net)
    {
        size_t batch_num = get_current_batch(net);
        int i;
        float rate;
        if (batch_num < net->burn_in)  // while batch_num < burn_in, return the warm-up rate below
          return net->learning_rate * pow((float)batch_num / net->burn_in, net->power);
        switch (net->policy) {  // once batch_num >= burn_in, apply the configured policy; the stock cfg uses STEPS
            case CONSTANT:
                return net->learning_rate;
            case STEP:
                return net->learning_rate * pow(net->scale, batch_num/net->step);
            case STEPS:
                rate = net->learning_rate;
                for(i = 0; i < net->num_steps; ++i){
                    if(net->steps[i] > batch_num) return rate;
                    rate *= net->scales[i];
                }
                return rate;
            case EXP:
                return net->learning_rate * pow(net->gamma, batch_num);
            case POLY:
                return net->learning_rate * pow(1 - (float)batch_num / net->max_batches, net->power);
            case RANDOM:
                return net->learning_rate * pow(rand_uniform(0,1), net->power);
            case SIG:
                return net->learning_rate * (1./(1.+exp(net->gamma*(batch_num - net->step))));
            default:
                fprintf(stderr, "Policy is weird!\n");
                return net->learning_rate;
        }
    }

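    I made some adjustments here.

    Reason: I found that the learning rate I configured and the learning rate at the end of burn_in always differed considerably, which made the loss stall or oscillate sharply.

    Approach: make the starting learning rate of the steps policy equal to the learning rate at the end of burn_in.

    The implementation is as follows: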

    float last_rate;  /* most recent warm-up rate; stays 0 until the burn_in branch has run */
    float get_current_rate(network *net)
    {
        size_t batch_num = get_current_batch(net);
        int i;
        float rate;
        if (batch_num < net->burn_in)
        {
          /* Added: remember the warm-up rate so the STEPS case can start from it. */
          last_rate = net->learning_rate * pow((float)batch_num / net->burn_in, net->power);
          return last_rate;
        }
        switch (net->policy) {
            case CONSTANT:
                return net->learning_rate;
            case STEP:
                return net->learning_rate * pow(net->scale, batch_num/net->step);
            case STEPS:
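                /* Changed: start from the last warm-up rate instead of net->learning_rate.
                   Fall back to net->learning_rate when no warm-up rate was recorded
                   (burn_in unset, or training resumed past burn_in); otherwise rate would be 0. */
                rate = (last_rate > 0) ? last_rate : net->learning_rate;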
                for(i = 0; i < net->num_steps; ++i){
                    if(net->steps[i] > batch_num) return rate;
                    rate *= net->scales[i];
                }
                return rate;
            case EXP:
                return net->learning_rate * pow(net->gamma, batch_num);
            case POLY:
                return net->learning_rate * pow(1 - (float)batch_num / net->max_batches, net->power);
            case RANDOM:
                return net->learning_rate * pow(rand_uniform(0,1), net->power);
            case SIG:
                return net->learning_rate * (1./(1.+exp(net->gamma*(batch_num - net->step))));
            default:
                fprintf(stderr, "Policy is weird!\n");
                return net->learning_rate;
        }
    }
  • Original article: https://www.cnblogs.com/zhibei/p/12165360.html