• Common performance testing methods for ARM platforms


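    Testing disk read/write speed:

    With hdparm installed, you can measure the eMMC read speed; the dd command after it writes a 50 MiB file to estimate the write speed. Because this dd invocation does not force a flush to the device, the write figure may be inflated by the page cache: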

    dolphin@localhost:~$ sudo apt-get install hdparm
    dolphin@localhost:/dev$ sudo hdparm -Tt /dev/mmcblk1
    /dev/mmcblk1:
     Timing cached reads:   1202 MB in  2.00 seconds = 601.20 MB/sec
     Timing buffered disk reads: 340 MB in  3.01 seconds = 113.01 MB/sec
    dolphin@localhost:~/data$ dd count=50 bs=1M if=/dev/zero of=~/data/test.img
    50+0 records in
    50+0 records out
    52428800 bytes (52 MB, 50 MiB) copied, 0.620573 s, 84.5 MB/s
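
The throughput that dd prints can be reproduced from the byte count and elapsed time in the output above. This is a minimal sketch; the function name `throughput_mb_per_s` is made up for illustration, and it assumes dd's "MB" means decimal megabytes (10^6 bytes), which is consistent with the "52 MB, 50 MiB" shown.

```python
def throughput_mb_per_s(num_bytes, seconds):
    """Return throughput in decimal megabytes (10**6 bytes) per second."""
    return num_bytes / seconds / 1_000_000


# Figures taken from the dd output above: 50 blocks of 1 MiB in 0.620573 s.
emmc_write = throughput_mb_per_s(50 * 1024 * 1024, 0.620573)
print(f"{emmc_write:.1f} MB/s")
```

Rounded to one decimal place, this agrees with the 84.5 MB/s reported by dd.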
    

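    Testing CPU performance

    sysbench is an open-source, multi-threaded benchmark tool that can test CPU, memory, thread, I/O, and database performance. Install it with apt install sysbench.
    In the CPU test, sysbench searches for prime numbers up to a specified limit, checking each candidate by dividing it by every integer from 2 up to its square root, and reports how long this takes.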
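
The trial-division check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not sysbench's own code: the function name `count_primes` is hypothetical, and starting the search at 3 is an assumption about how the workload is framed.

```python
import math


def count_primes(max_prime):
    """Count primes in [3, max_prime) by trial division up to the square root."""
    count = 0
    for candidate in range(3, max_prime):
        limit = math.isqrt(candidate)
        # A candidate is prime if no integer in [2, sqrt(candidate)] divides it.
        if all(candidate % divisor for divisor in range(2, limit + 1)):
            count += 1
    return count


print(count_primes(100))
```

A larger limit such as 20000 simply makes each pass more expensive, which is what the `--cpu-max-prime` option controls.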

    sysbench --threads=4 --cpu-max-prime=20000 cpu run
    

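    As the output shows, on the RK3399 the run lasted about 10 s (sysbench's default time limit, with an average execution time of 9.9957 s per thread) and reached 1927.70 events per second: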

    Threads started!
    
    CPU speed:
        events per second:  1927.70
    
    General statistics:
        total time:                          10.0027s
        total number of events:              19301
    
    Latency (ms):
             min:                                  1.42
             avg:                                  2.07
             max:                                 22.63
             95th percentile:                      3.62
             sum:                              39982.94
    
    Threads fairness:
        events (avg/stddev):           4825.2500/2066.85
        execution time (avg/stddev):   9.9957/0.00
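
The statistics in this report are related by simple arithmetic, which helps when reading them. The sketch below recomputes three of them from the totals printed above; the variable names are illustrative only, and the interpretation of "sum" as the total latency across all threads is an assumption that the numbers happen to support.

```python
threads = 4
total_events = 19301        # "total number of events"
latency_sum_ms = 39982.94   # "sum" under Latency (ms)

# Average latency per event.
avg_latency_ms = latency_sum_ms / total_events

# Average number of events handled by each thread.
events_per_thread = total_events / threads

# Average busy time per thread, converted from milliseconds to seconds.
exec_time_per_thread_s = latency_sum_ms / threads / 1000

print(f"{avg_latency_ms:.2f} ms, {events_per_thread} events, {exec_time_per_thread_s:.4f} s")
```

These values line up with the reported avg latency (2.07), events avg (4825.25), and execution time avg (9.9957).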
    

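    Another benchmark program is nbench, which tests memory, integer, and floating-point performance on a single core. It consists of the following 10 tests; see the nbench wiki entry for further details.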

    1. Numeric sort - Sorts an array of long integers.
    2. String sort - Sorts an array of strings of arbitrary length.
    3. Bitfield - Executes a variety of bit manipulation functions.
    4. Emulated floating-point - A small software floating-point package.
    5. Fourier coefficients - A numerical analysis routine for calculating series approximations of waveforms.
    6. Assignment algorithm - A well-known task allocation algorithm.
    7. Huffman compression - A well-known text and graphics compression algorithm.
    8. IDEA encryption - A relatively new block cipher algorithm.
    9. Neural Net - A small but functional back-propagation network simulator.
    10. LU Decomposition - A robust algorithm for solving linear equations.

    Fetch the source code, then build and run it.

    wget http://www.math.utah.edu/~mayer/linux/nbench-byte-2.2.3.tar.gz
    tar -xvzf nbench-byte-2.2.3.tar.gz
    cd nbench-byte-2.2.3
    make
    ./nbench
    

    As shown below, the 6-core RK3399 scores 79.409 (integer) and 44.893 (floating-point) in nbench:

    pi@NanoPi-NEO4:~/nbench-byte-2.2.3$ ./nbench
    
    BYTEmark* Native Mode Benchmark ver. 2 (10/95)
    Index-split by Andrew D. Balsa (11/97)
    Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
    
    TEST                : Iterations/sec.  : Old Index   : New Index
                        :                  : Pentium 90* : AMD K6/233*
    --------------------:------------------:-------------:------------
    NUMERIC SORT        :          1242.3  :      31.86  :      10.46
    STRING SORT         :          389.92  :     174.23  :      26.97
    BITFIELD            :      2.2521e+08  :      38.63  :       8.07
    FP EMULATION        :          369.93  :     177.51  :      40.96
    FOURIER             :           23017  :      26.18  :      14.70
    ASSIGNMENT          :          21.776  :      82.86  :      21.49
    IDEA                :          6673.9  :     102.07  :      30.31
    HUFFMAN             :          2230.3  :      61.85  :      19.75
    NEURAL NET          :           39.53  :      63.50  :      26.71
    LU DECOMPOSITION    :          1050.7  :      54.43  :      39.31
    ==========================ORIGINAL BYTEMARK RESULTS==========================
    INTEGER INDEX       : 79.409
    FLOATING-POINT INDEX: 44.893
    Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
    ==============================LINUX DATA BELOW===============================
    CPU                 : 6 CPU
    L2 Cache            :
    OS                  : Linux 4.4.143
    C compiler          : gcc version 7.3.0 (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04)
    libc                : static
    MEMORY INDEX        : 16.723
    INTEGER INDEX       : 22.505
    FLOATING-POINT INDEX: 24.899
    Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
    * Trademarks are property of their respective holder.
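
The two ORIGINAL BYTEMARK index lines in this output appear to be geometric means of the per-test "Old Index" column. The sketch below checks that against the RK3399 numbers printed above. The grouping is an assumption here: seven tests are treated as integer tests (numeric sort, string sort, bitfield, FP emulation, assignment, IDEA, Huffman) and three as floating-point tests (Fourier, neural net, LU decomposition).

```python
import math


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid large products."""
    return math.exp(math.fsum(math.log(v) for v in values) / len(values))


# "Old Index" column from the RK3399 run above.
integer_old_index = [31.86, 174.23, 38.63, 177.51, 82.86, 102.07, 61.85]
float_old_index = [26.18, 63.50, 54.43]

integer_index = geometric_mean(integer_old_index)
float_index = geometric_mean(float_old_index)

print(f"integer index ~ {integer_index:.2f}")
print(f"floating-point index ~ {float_index:.2f}")
```

Because the table values are rounded to two decimals, the recomputed indexes only match the reported 79.409 and 44.893 approximately.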
    

    Below are the results on the S5P6818 running a 32-bit system:

    pi@NanoPi-Fire3:~/work/nbench-byte-2.2.3$ ./nbench
    
    BYTEmark* Native Mode Benchmark ver. 2 (10/95)
    Index-split by Andrew D. Balsa (11/97)
    Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
    
    TEST                : Iterations/sec.  : Old Index   : New Index
                        :                  : Pentium 90* : AMD K6/233*
    --------------------:------------------:-------------:------------
    NUMERIC SORT        :          660.92  :      16.95  :       5.57
    STRING SORT         :          88.288  :      39.45  :       6.11
    BITFIELD            :      1.9179e+08  :      32.90  :       6.87
    FP EMULATION        :          102.92  :      49.38  :      11.40
    FOURIER             :           10112  :      11.50  :       6.46
    ASSIGNMENT          :          12.921  :      49.17  :      12.75
    IDEA                :          3181.4  :      48.66  :      14.45
    HUFFMAN             :          1202.3  :      33.34  :      10.65
    NEURAL NET          :          13.628  :      21.89  :       9.21
    LU DECOMPOSITION    :          459.08  :      23.78  :      17.17
    ==========================ORIGINAL BYTEMARK RESULTS==========================
    INTEGER INDEX       : 36.521
    FLOATING-POINT INDEX: 18.158
    Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
    ==============================LINUX DATA BELOW===============================
    CPU                 : 8 CPU
    L2 Cache            :
    OS                  : Linux 4.4.49-s5p6818
    C compiler          : gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.11)
    libc                : libc-2.23.so
    MEMORY INDEX        : 8.119
    INTEGER INDEX       : 9.939
    FLOATING-POINT INDEX: 10.071
    Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
    * Trademarks are property of their respective holder.
    

    The corresponding results on the same platform running a 64-bit system:

    BYTEmark* Native Mode Benchmark ver. 2 (10/95)
    Index-split by Andrew D. Balsa (11/97)
    Linux/Unix* port by Uwe F. Mayer (12/96,11/97)
    
    TEST                : Iterations/sec.  : Old Index   : New Index
                        :                  : Pentium 90* : AMD K6/233*
    --------------------:------------------:-------------:------------
    NUMERIC SORT        :          800.68  :      20.53  :       6.74
    STRING SORT         :          159.48  :      71.26  :      11.03
    BITFIELD            :        2.22e+08  :      38.08  :       7.95
    FP EMULATION        :           219.6  :     105.37  :      24.32
    FOURIER             :           11728  :      13.34  :       7.49
    ASSIGNMENT          :           11.77  :      44.79  :      11.62
    IDEA                :          3420.5  :      52.32  :      15.53
    HUFFMAN             :          1133.7  :      31.44  :      10.04
    NEURAL NET          :          15.737  :      25.28  :      10.63
    LU DECOMPOSITION    :          532.56  :      27.59  :      19.92
    ==========================ORIGINAL BYTEMARK RESULTS==========================
    INTEGER INDEX       : 45.950
    FLOATING-POINT INDEX: 21.031
    Baseline (MSDOS*)   : Pentium* 90, 256 KB L2-cache, Watcom* compiler 10.0
    ==============================LINUX DATA BELOW===============================
    CPU                 : 8 CPU
    L2 Cache            :
    OS                  : Linux 4.4.49-s5p6818
    C compiler          : gcc version 7.3.0 (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04)
    libc                : libc-2.27.so
    MEMORY INDEX        : 10.064
    INTEGER INDEX       : 12.645
    FLOATING-POINT INDEX: 11.664
    Baseline (LINUX)    : AMD K6/233*, 512 KB L2-cache, gcc 2.7.2.3, libc-5.4.38
    * Trademarks are property of their respective holder.
    

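    The results compare as follows. Moving from the 32-bit to the 64-bit system, the integer index improved by 26% and the floating-point index by 16%: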

    Test                32-bit      64-bit      Difference
    NUMERIC SORT        660.92      800.68      21%
    STRING SORT         88.288      159.48      81%
    BITFIELD            1.92E+08    2.22E+08    16%
    FP EMULATION        102.92      219.6       113%
    FOURIER             10112       11728       16%
    ASSIGNMENT          12.921      11.77       -9%
    IDEA                3181.4      3420.5      8%
    HUFFMAN             1202.3      1133.7      -6%
    NEURAL NET          13.628      15.737      15%
    LU DECOMPOSITION    459.08      532.56      16%
    INTEGER             36.521      45.95       26%
    FLOATING-POINT      18.158      21.031      16%
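
The difference column is the relative change from the 32-bit result to the 64-bit result. A minimal sketch of that calculation, using a few rows from the table (the helper name `percent_change` is illustrative):

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# (32-bit, 64-bit) pairs copied from the comparison table.
results = {
    "NUMERIC SORT": (660.92, 800.68),
    "FP EMULATION": (102.92, 219.6),
    "ASSIGNMENT": (12.921, 11.77),
    "INTEGER": (36.521, 45.95),
    "FLOATING-POINT": (18.158, 21.031),
}

for name, (r32, r64) in results.items():
    print(f"{name:<16}{percent_change(r32, r64):+.0f}%")
```

Negative values, as for ASSIGNMENT, mean the 64-bit system was slower on that test.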

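    Testing DDR performance

    On the board, create a RAM disk (tmpfs) to hold frequently accessed caches and small files that need fast access. It can of course also be used to test read/write speed: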

    dolphin@localhost:/$ sudo mkdir /ram
    dolphin@localhost:/$ sudo mount -t tmpfs -o size=100m,mode=0777 tmpfs /ram
    dolphin@localhost:/$ cd ram
    dolphin@localhost:/ram$ dd count=50 bs=1M if=/dev/zero of=/ram/test.img
    50+0 records in
    50+0 records out
    52428800 bytes (52 MB, 50 MiB) copied, 0.134943 s, 389 MB/s
    

    You can also test with sysbench, for example with 4 threads and a 64 KB block size, transferring 10 GB of data in memory:

    sysbench --threads=4  --memory-block-size=64k --memory-total-size=10G memory run
    10240.00 MiB transferred (3538.21 MiB/sec)
    
    
    General statistics:
        total time:                          2.8907s
        total number of events:              163840
    
    Latency (ms):
             min:                                  0.01
             avg:                                  0.07
             max:                                 11.02
             95th percentile:                      0.10
             sum:                              11207.68
    
    Threads fairness:
        events (avg/stddev):           40960.0000/0.00
        execution time (avg/stddev):   2.8019/0.02
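
The event count in this report follows directly from the two size options: each event transfers one block. The sketch below works that out, assuming sysbench reads `64k` and `10G` as binary units (KiB and GiB), which matches the "10240.00 MiB transferred" line in the output.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

block_size = 64 * KIB    # --memory-block-size=64k
total_size = 10 * GIB    # --memory-total-size=10G
threads = 4              # --threads=4

# One event per block transferred.
total_events = total_size // block_size
events_per_thread = total_events / threads
transferred_mib = total_size / MIB

print(total_events, events_per_thread, transferred_mib)
```

The results correspond to the reported 163840 total events, 40960 events per thread, and 10240 MiB transferred.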
    




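    Author: shaniadolphin
    Link: https://www.jianshu.com/p/7a0dc79ced11
    Source: Jianshu
    Copyright on Jianshu belongs to the author. For reproduction in any form, contact the author for authorization and cite the source.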
  • Original article: https://www.cnblogs.com/idyllcheung/p/11282495.html