• 用linux perf命令来分析程序的cpu cache miss现象


    #include <stdio.h>
    #include <unistd.h>
     
    int main(int argc, char **argv)
    {
        int a[1000][1000];
        if(1 == argc)
        {
            for(int i = 0; i < 1000; ++i)
            {
                    for(int j = 0; j < 1000; ++j)
                    {
                            a[i][j] = 0;
                    }
            }
        }
        else
        {
            for(int i = 0; i < 1000; ++i)
            {
                    for(int j = 0; j < 1000; ++j)
                    {
                            a[j][i] = 0;
                    }
            }
        }
     
        return 0;
    }
     

     上面有两个小程序片段, 哪段效率高? 显然, 第一段效率高, 为什么呢? 因为在C/C++中,数组是按行存储的,程序的按行访问可以充分利用程序的局部性原理(空间局部性), 用time命令来看看结果:

    [root@bogon c++]# g++ miss.c  -o miss
    [root@bogon c++]# ./miss 
    [root@bogon c++]# time ./miss 
    
    real    0m0.009s
    user    0m0.009s
    sys     0m0.000s
    [root@bogon c++]# time ./miss 1
    
    real    0m0.013s
    user    0m0.013s
    sys     0m0.000s
    [root@bogon c++]# time ./miss 
    
    real    0m0.010s
    user    0m0.010s
    sys     0m0.000s
    [root@bogon c++]# time ./miss 1
    
    real    0m0.013s
    user    0m0.013s
    sys     0m0.000s
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss
    
     Performance counter stats for './miss':
    
                88,780      L1-dcache-load-misses                                       
    
           0.009002291 seconds time elapsed
    
           0.009174000 seconds user
           0.000000000 seconds sys
    
    
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss 1
    
     Performance counter stats for './miss 1':
    
             1,015,683      L1-dcache-load-misses                                       
    
           0.012000335 seconds time elapsed
    
           0.006059000 seconds user
           0.006059000 seconds sys
    
    
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss 1
    
     Performance counter stats for './miss 1':
    
             1,015,363      L1-dcache-load-misses                                       
    
           0.012145156 seconds time elapsed
    
           0.006134000 seconds user
           0.006134000 seconds sys
    
    
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss 0
    
     Performance counter stats for './miss 0':
    
             1,011,740      L1-dcache-load-misses                                       
    
           0.012363858 seconds time elapsed
    
           0.012484000 seconds user
           0.000000000 seconds sys
    
    
    [root@bogon c++]# perf stat -e L1-dcache-load-misses ./miss 0
    
     Performance counter stats for './miss 0':
    
             1,015,347      L1-dcache-load-misses                                       
    
           0.012348778 seconds time elapsed
    
           0.006237000 seconds user
           0.006237000 seconds sys
  • 相关阅读:
    IDEA使用总结1-Github下载代码和上传代码到Git
    Mysql-安装指南
    Dubbo安装及其实战1
    分布式设计(学习内容目录--后端架构师)
    elasticsearch安装指导(new)
    浅谈TCP/IP(new 常见面试问题)
    浅谈常用的设计模式(new)
    浅谈Elasicsearch
    浅谈数据库分库分表(new)
    JAVA 电子书下载地址
  • 原文地址:https://www.cnblogs.com/dream397/p/14694998.html
Copyright © 2020-2023  润新知