• AIR32F103(四) 27倍频216MHz,CoreMark跑分测试


    目录

    27倍频运行216MHz主频

    合宙开发团队10月11日的提交中开源了AIR32F103的PLL倍频调节的代码, 使得在 Linux 下通过 GCC Arm 工具链也能编译运行216MHz.

    代码示例

    示例代码位于 Examples/NonFreeRTOS/RCC 下: https://gitee.com/iosetting/air32f103-template/tree/master/Examples/NonFreeRTOS/RCC

    编译时的注意事项

    编译时需要注意避免编译器对AIR_RCC_PLLConfig()这个函数的优化.

    这个函数的源代码如下, 可以看到其中会对特定地址(例如 0x40016C00)进行连续的写操作, 编译时如果优化参数不是-O0, 就大概率会将这些写操作合并或调换位置.

    uint32_t AIR_RCC_PLLConfig(uint32_t RCC_PLLSource, uint32_t RCC_PLLMul, uint8_t Latency)
    { 
      uint32_t sramsize = 0;
      uint32_t pllmul = 0;
      FunctionalState pwr_gating_state = 0;
      /* Check the parameters */
      assert_param(IS_RCC_PLL_SOURCE(RCC_PLLSource));
      assert_param(IS_RCC_PLL_MUL(RCC_PLLMul));
      
      *(uint32_t *)(0x400210F0) = BIT(0);//开启sys_cfg门控
      *(uint32_t *)(0x40016C00) = 0xa7d93a86;//解一、二、三级锁
      *(uint32_t *)(0x40016C00) = 0xab12dfcd;
      *(uint32_t *)(0x40016C00) = 0xcded3526;
      sramsize = *(uint32_t *)(0x40016C18);
      *(uint32_t *)(0x40016C18) = 0x200183FF;//配置sram大小, 将BOOT使用对sram打开 
      *(uint32_t *)(0x4002228C) = 0xa5a5a5a5;//QSPI解锁
      
      SysFreq_Set(RCC_PLLMul,Latency ,0,1);
      RCC->CFGR = (RCC->CFGR & ~0x00030000) | RCC_PLLSource;
      
      //恢复配置前状态
      *(uint32_t *)(0x40016C18) = sramsize;
      *(uint32_t *)(0x400210F0) = 0;//开启sys_cfg门控
      *(uint32_t *)(0x40016C00) = ~0xa7d93a86;//加一、二、三级锁
      *(uint32_t *)(0x40016C00) = ~0xab12dfcd;
      *(uint32_t *)(0x40016C00) = ~0xcded3526;
      *(uint32_t *)(0x4002228C) = ~0xa5a5a5a5;//QSPI解锁
      return 1;
    }
    

    解决的方法一, 是通过调整编译参数

    • 在Keil5下, 可以对 air32f10x_rcc_ex.c 这个文件右键单独设置 AC6 编译选项. AC5可以使用 注解, AC6不再支持文件内部单个函数的优化设置
    • 在GCC Arm中, 可以通过 Makefile 对 air32f10x_rcc_ex.c 设置单独的-O0参数, 也可以在代码中增加屏障避免优化(例如在两行代码之间增加__NOP();), 还可以通过int foo(int i) __attribute__((optimize("-O3")));这样的形式, 参考GNU GCC文档

    因此将库函数修改为

    __attribute__((optimize("-O0"))) uint32_t AIR_RCC_PLLConfig(uint32_t RCC_PLLSource, uint32_t RCC_PLLMul, uint8_t Latency)
    

    CoreMark跑分结果

    示例中的 CoreMark_256MHz 项目, 可以将AIR32F103运行在最高256MHz主频下, 运行CoreMark性能测试. 以下是分别在 256MHz, 216MHz, 72MHz 不同编译器版本下的测试结果.

    32倍频, 256MHz

    编译器 GCC11.2.1

    SYSCLK: 256000000, HCLK: 256000000, PCLK1: 128000000, PCLK2: 256000000, ADCCLK: 128000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 17054␊
    Total time (secs): 17.054000␊
    Iterations/Sec   : 586.372698␊
    Iterations       : 10000␊
    Compiler version : GCC11.2.1 20220111␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 586.372698 / GCC11.2.1 20220111 -O3 / STACK␊
    IR32F103 CoreMark␊
    

    编译器 GCC11.3.1, 256MHz

    SYSCLK: 256000000, HCLK: 256000000, PCLK1: 128000000, PCLK2: 256000000, ADCCLK: 128000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 17054␊
    Total time (secs): 17.054000␊
    Iterations/Sec   : 586.372698␊
    Iterations       : 10000␊
    Compiler version : GCC11.3.1 20220712␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 586.372698 / GCC11.3.1 20220712 -O3 / STACK␊
    

    编译器 GCC12.2.0 256MHz

    SYSCLK: 256000000, HCLK: 256000000, PCLK1: 128000000, PCLK2: 256000000, ADCCLK: 128000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 15822␊
    Total time (secs): 15.822000␊
    Iterations/Sec   : 632.031349␊
    Iterations       : 10000␊
    Compiler version : GCC12.2.0␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 632.031349 / GCC12.2.0 -O3 / STACK␊
    

    27倍频, 216MHz

    GCC11.2.1 216MHz

    SYSCLK: 216000000, HCLK: 216000000, PCLK1: 108000000, PCLK2: 216000000, ADCCLK: 108000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 20213␊
    Total time (secs): 20.213000␊
    Iterations/Sec   : 494.731114␊
    Iterations       : 10000␊
    Compiler version : GCC11.2.1 20220111␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 494.731114 / GCC11.2.1 20220111 -O3 / STACK␊
    

    GCC11.3.1 216MHz

    SYSCLK: 216000000, HCLK: 216000000, PCLK1: 108000000, PCLK2: 216000000, ADCCLK: 108000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 20213␊
    Total time (secs): 20.213000␊
    Iterations/Sec   : 494.731114␊
    Iterations       : 10000␊
    Compiler version : GCC11.3.1 20220712␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 494.731114 / GCC11.3.1 20220712 -O3 / STACK␊
    

    GCC12.2.0 216MHz

    SYSCLK: 216000000, HCLK: 216000000, PCLK1: 108000000, PCLK2: 216000000, ADCCLK: 108000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 18753␊
    Total time (secs): 18.753000␊
    Iterations/Sec   : 533.248014␊
    Iterations       : 10000␊
    Compiler version : GCC12.2.0␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 533.248014 / GCC12.2.0 -O3 / STACK␊
    

    编译器 AC6 (GCCClang 15.0.0) 216MHz

    SYSCLK: 216.0Mhz, HCLK: 216.0Mhz, PCLK1: 108.0Mhz, PCLK2: 216.0Mhz, ADCCLK: 108.0Mhz
    2K performance run parameters for coremark.
    CoreMark Size    : 666
    Total ticks      : 16328
    Total time (secs): 16.328000
    Iterations/Sec   : 612.444880
    Iterations       : 10000
    Compiler version : GCCClang 15.0.0
    Compiler flags   : -O3
    Memory location  : STACK
    seedcrc          : 0xe9f5
    [0]crclist       : 0xe714
    [0]crcmatrix     : 0x1fd7
    [0]crcstate      : 0x8e3a
    [0]crcfinal      : 0x988c
    Correct operation validated. See readme.txt for run and reporting rules.
    CoreMark 1.0 : 612.444880 / GCCClang 15.0.0 -O3 / STACK
    

    9倍频, 72MHz

    编译器 GCC11.2.1 72MHz

    SYSCLK: 72000000, HCLK: 72000000, PCLK1: 36000000, PCLK2: 72000000, ADCCLK: 36000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 60677␊
    Total time (secs): 60.677000␊
    Iterations/Sec   : 164.807093␊
    Iterations       : 10000␊
    Compiler version : GCC11.2.1 20220111␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 164.807093 / GCC11.2.1 20220111 -O3 / STACK␊
    

    编译器 GCC11.3.1 72MHz

    SYSCLK: 72000000, HCLK: 72000000, PCLK1: 36000000, PCLK2: 72000000, ADCCLK: 36000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 60677␊
    Total time (secs): 60.677000␊
    Iterations/Sec   : 164.807093␊
    Iterations       : 10000␊
    Compiler version : GCC11.3.1 20220712␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 164.807093 / GCC11.3.1 20220712 -O3 / STACK␊
    

    编译器 GCC12.2.0 72MHz

    SYSCLK: 72000000, HCLK: 72000000, PCLK1: 36000000, PCLK2: 72000000, ADCCLK: 36000000␊
    2K performance run parameters for coremark.␊
    CoreMark Size    : 666␊
    Total ticks      : 56293␊
    Total time (secs): 56.293000␊
    Iterations/Sec   : 177.641980␊
    Iterations       : 10000␊
    Compiler version : GCC12.2.0␊
    Compiler flags   : -O3␊
    Memory location  : STACK␊
    seedcrc          : 0xe9f5␊
    [0]crclist       : 0xe714␊
    [0]crcmatrix     : 0x1fd7␊
    [0]crcstate      : 0x8e3a␊
    [0]crcfinal      : 0x988c␊
    Correct operation validated. See readme.txt for run and reporting rules.␊
    CoreMark 1.0 : 177.641980 / GCC12.2.0 -O3 / STACK␊
    

    总结

    可以看到, GCC11.2和GCC11.3是一样的, GCC12.2生成的二进制执行性能提升了接近8%, 但是性能最好的还是AC6, 比GCC12.2性能高了接近15%.

  • 相关阅读:
    three.js 居中-模型
    three.js 打包为一个组-几个单独的模型
    ABP 菜单和权限
    set
    P2429 制杖题
    对线性筛的新理解
    P2817 宋荣子的城堡
    P2651 添加括号III
    P2858 [USACO06FEB]奶牛零食Treats for the Cows
    P1005 矩阵取数游戏
  • 原文地址:https://www.cnblogs.com/milton/p/16830703.html
Copyright © 2020-2023  润新知