• 一次内核 crash 的排查记录


    一次内核 crash 的排查记录

    使用的发行版本是 CentOS,内核版本是 3.10.0,在正常运行的情况下内核发生了崩溃,还好有 vmcore 生成。

    准备排查环境

    1. crash
    2. 内核调试信息rpm,下载的两个 rpm 版本必须和内核版本一致

    排查

    准备好生成的 vmcore

    1. 进入 crash
      [zzz@localhost kernel-debug]# crash ../vmcore /usr/lib/debug/lib/modules/3.10.0-327.el7.x86_64/vmlinux
      
    2. 可以看到直接原因是访问了空指针 (0000000000000008)
          KERNEL: /usr/lib/debug/lib/modules/3.10.0-327.el7.x86_64/vmlinux
          DUMPFILE: ../vmcore  [PARTIAL DUMP]
              CPUS: 8
              DATE: Wed Apr 29 19:40:42 2020
          UPTIME: 335 days, 01:46:01
      LOAD AVERAGE: 23.98, 26.19, 15.75
          TASKS: 688
          NODENAME: localhost.localdomain
          RELEASE: 3.10.0-327.el7.x86_64
          VERSION: #1 SMP Thu Nov 19 22:10:57 UTC 2015
          MACHINE: x86_64  (3408 Mhz)
          MEMORY: 15.6 GB
          PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000008"
              PID: 76
          COMMAND: "kswapd0"
              TASK: ffff88044beba280  [THREAD_INFO: ffff88044bef0000]
              CPU: 2
          STATE: TASK_RUNNING (PANIC)
      
    3. 观察堆栈,看代码层面大概是哪里产生的问题
      crash> bt
      PID: 76     TASK: ffff88044beba280  CPU: 2   COMMAND: "kswapd0"
      #0 [ffff88044bef3610] machine_kexec at ffffffff81051beb
      #1 [ffff88044bef3670] crash_kexec at ffffffff810f2542
      #2 [ffff88044bef3740] oops_end at ffffffff8163e1a8
      #3 [ffff88044bef3768] no_context at ffffffff8162e2b8
      #4 [ffff88044bef37b8] __bad_area_nosemaphore at ffffffff8162e34e
      #5 [ffff88044bef3800] bad_area_nosemaphore at ffffffff8162e4b8
      #6 [ffff88044bef3810] __do_page_fault at ffffffff81640fce
      #7 [ffff88044bef3868] do_page_fault at ffffffff81641113
      #8 [ffff88044bef3890] page_fault at ffffffff8163d408
          [exception RIP: down_read_trylock+9]
          RIP: ffffffff810aa989  RSP: ffff88044bef3940  RFLAGS: 00010202
          RAX: 0000000000000000  RBX: ffff8801b4f9ff80  RCX: 0000000000000000
          RDX: 0000000000000000  RSI: 0000000000000001  RDI: 0000000000000008
          RBP: ffff88044bef3940   R8: 0000000000000000   R9: 0000000000017bc0
          R10: ffff880465fd8000  R11: 0000000000000000  R12: ffff8801b4f9ff81
          R13: ffffea00047dfbc0  R14: 0000000000000008  R15: 0000000000000001
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
      #9 [ffff88044bef3948] page_lock_anon_vma_read at ffffffff811a2e65
      #10 [ffff88044bef3978] page_referenced at ffffffff811a30e7
      #11 [ffff88044bef39f0] shrink_page_list at ffffffff8117d264
      #12 [ffff88044bef3b28] shrink_inactive_list at ffffffff8117df3a
      #13 [ffff88044bef3bf0] shrink_lruvec at ffffffff8117ea05
      #14 [ffff88044bef3cf0] shrink_zone at ffffffff8117ee66
      #15 [ffff88044bef3d48] balance_pgdat at ffffffff8118010c
      #16 [ffff88044bef3e20] kswapd at ffffffff811803d3
      #17 [ffff88044bef3ec8] kthread at ffffffff810a5aef
      #18 [ffff88044bef3f50] ret_from_fork at ffffffff81645858
      
      异常发生在 down_read_trylock 函数内,后面发生了 page fault,先反汇编看一下 RIP 内地址(ffffffff810aa989)的内容:
      crash> dis -l ffffffff810aa989
          /usr/src/debug/kernel-3.10.0-327.el7/linux-3.10.0-327.el7.x86_64/arch/x86/include/asm/rwsem.h: 83
          0xffffffff810aa989 <down_read_trylock+9>:       mov    (%rdi),%rax
      
      打开这个内核版本的 arch/x86/include/asm/rwsem.h,可以使用网址 https://elixir.bootlin.com/linux/v3.10/source/arch/x86/include/asm/rwsem.h 打开,代码如下
      /*
      * trylock for reading -- returns 1 if successful, 0 if contention
      */
      static inline int __down_read_trylock(struct rw_semaphore *sem)
      {
          long result, tmp;
          asm volatile("# beginning __down_read_trylock
      	"
                  "  mov          %0,%1
      	"
                  "1:
      	"
                  "  mov          %1,%2
      	"
                  "  add          %3,%2
      	"
                  "  jle	     2f
      	"
                  LOCK_PREFIX "  cmpxchg  %2,%0
      	"
                  "  jnz	     1b
      	"
                  "2:
      	"
                  "# ending __down_read_trylock
      	"
                  : "+m" (sem->count), "=&a" (result), "=&r" (tmp)
                  : "i" (RWSEM_ACTIVE_READ_BIAS)
                  : "memory", "cc");
          return result >= 0 ? 1 : 0;
      }
      
      指针 sem 也就是寄存器 RAX 的值为 0000000000000008,地址解引用失败,访问空指针,引发异常。观察堆栈调用情况,都是内存管理相关操作,其中还有一个 kswapd 调用,到此,推测是内存爆了导致的
    4. 查看当时内存使用情况,观察内存使用到了 98%,而交换空间也被大量使用,符合上面函数调用的推导
      crash> kmem -i
                       PAGES        TOTAL      PERCENTAGE
          TOTAL MEM  3962164      15.1 GB         ----
               FREE    46698     182.4 MB    1% of TOTAL MEM
               USED  3915466      14.9 GB   98% of TOTAL MEM
             SHARED   224712     877.8 MB    5% of TOTAL MEM
            BUFFERS        0            0    0% of TOTAL MEM
             CACHED   555017       2.1 GB   14% of TOTAL MEM
               SLAB   136079     531.6 MB    3% of TOTAL MEM
      
         TOTAL HUGE        0            0         ----
          HUGE FREE        0            0    0% of TOTAL HUGE
      
         TOTAL SWAP  4194303        16 GB         ----
          SWAP USED  3042976      11.6 GB   72% of TOTAL SWAP
          SWAP FREE  1151327       4.4 GB   27% of TOTAL SWAP
      
       COMMIT LIMIT  6175385      23.6 GB         ----
          COMMITTED  9409769      35.9 GB  152% of TOTAL LIMIT
      
    5. 查看当时进程使用情况(节选),几乎全部是页面交换进程在运行
      crash> ps
         PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
      >     0      0   0  ffffffff81951440  RU   0.0       0      0  [swapper/0]
      >     0      0   1  ffff88044f942e00  RU   0.0       0      0  [swapper/1]
            0      0   2  ffff88044f943980  RU   0.0       0      0  [swapper/2]
      >     0      0   3  ffff88044f944500  RU   0.0       0      0  [swapper/3]
      >     0      0   4  ffff88044f945080  RU   0.0       0      0  [swapper/4]
            0      0   5  ffff88044f945c00  RU   0.0       0      0  [swapper/5]
      >     0      0   6  ffff88044f946780  RU   0.0       0      0  [swapper/6]
      >     0      0   7  ffff88044f947300  RU   0.0       0      0  [swapper/7]
            1      0   7  ffff88044f848000  IN   0.0  189172   3080  systemd
            2      0   0  ffff88044f848b80  IN   0.0       0      0  [kthreadd]
            3      2   0  ffff88044f849700  IN   0.0       0      0  [ksoftirqd/0]
            5      2   0  ffff88044f84ae00  IN   0.0       0      0  [kworker/0:0H]
            7      2   0  ffff88044f84c500  IN   0.0       0      0  [migration/0]
            8      2   7  ffff88044f84d080  IN   0.0       0      0  [rcu_bh]
      

    到此,问题可以锁定为是在大量内存页交换导致的问题,但是具体的代码逻辑并不能确定是为何处。而当时crash的时间也是在急需内存的操作下发生了,需要避免的地方也只有关闭一些服务释放出内存,在该操作完成后再启动服务。

  • 相关阅读:
    用AB对Webservice做压力测试
    web压力测试工具(小而精)
    让IIS支持10万并发
    数据库之间数据转换最快方法
    C#中执行Dos命令
    .NET 4.0中使用内存映射文件实现进程通讯
    HTML解析利器HtmlAgilityPack
    C#快速找出磁盘内的所有文件
    ORM for Net主流框架汇总与效率测试
    新浪微博基于MySQL的分布式数据库实践
  • 原文地址:https://www.cnblogs.com/shuqin/p/12859779.html
Copyright © 2020-2023  润新知