• Triggering OOM through memcg


    When an application requests memory and the system is short of memory, or the memory cgroup's configured limit has been reached, the kernel triggers memory reclaim. If the requested amount still cannot be satisfied, it finally triggers OOM: it picks a process, kills it, and frees the memory that process occupied.

    The experiments below walk through triggering OOM via a memory cgroup.

    Kernel: linux-5.14
    Distribution: ubuntu20.04 (uses cgroup v1 by default)
    Distribution: ubuntu22.04 (uses cgroup v2 by default)

    Author: pengdonglin137@163.com

    The OOM code lives in mm/oom_kill.c.

    cgroup v1

    Create a subdirectory under the memory cgroup mount point; the kernel automatically populates it with a set of predefined files. The following nodes are used below:

    memory.oom_control     # enable/disable the OOM killer for this memcg and inspect its OOM state
    memory.limit_in_bytes  # set the memory limit of the memcg
    memory.usage_in_bytes  # read the memcg's current memory usage
    cgroup.procs           # move a process into this memcg
    
    • Turn off swap, so that anonymous memory cannot be swapped out during reclaim and skew the test
    swapon -s
    swapoff <swap_file>
    
    • Create an oom_test control group under the memory cgroup root
    cd /sys/fs/cgroup/memory
    mkdir oom_test
    
    • Open three terminals and move all three terminal processes into the oom_test control group
      Run the following command in each of the three terminals:
    echo $$ > /sys/fs/cgroup/memory/oom_test/cgroup.procs
    
    • Set the memory limit
    echo 80M > memory.limit_in_bytes
    
    • Run a different command in each of the three terminals
    1. Terminal 1: run alloc_memory_1, which allocates 10MB of memory each time Enter is pressed

    2. Terminal 2: run alloc_memory_2, which also allocates 10MB each time Enter is pressed

    3. Terminal 3: run print_counter.sh, which prints one log line per second to show it is still alive
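
    The sources of alloc_memory_1 and alloc_memory_2 are not shown in the article; a minimal Python stand-in, assuming the same behavior (hold on to 10MB more anonymous memory per Enter press), might look like:

```python
import sys

CHUNK = 10 * 1024 * 1024  # 10 MiB per round, as in the experiment

def alloc_chunk(size=CHUNK):
    # bytearray() allocates anonymous memory; writing one byte per 4 KiB
    # page makes sure every page is actually resident, so the memcg really
    # gets charged and memory.usage_in_bytes goes up.
    buf = bytearray(size)
    for off in range(0, size, 4096):
        buf[off] = 1
    return buf

if __name__ == "__main__":
    held = []
    for _ in sys.stdin:  # one allocation per Enter press
        held.append(alloc_chunk())
        print(f"holding {len(held) * CHUNK >> 20} MiB", flush=True)
```

    Because the chunks are kept in the `held` list, the memory is never freed, which is what eventually pushes the group past its limit.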

    • Watch the memory usage from another terminal
    watch -n1 cat /sys/fs/cgroup/memory/oom_test/memory.usage_in_bytes
    
    • Alternately press Enter in terminals 1 and 2
      Watch the oom_test group's usage and see which process gets killed.

    • Result
      When Enter was pressed in terminal 2 and the usage exceeded the configured limit, the process alloc_memory_1 in terminal 1 was killed and the group's memory usage dropped immediately, while the processes in
      terminals 2 and 3 stayed alive. The kernel printed the following log:

    Kernel log of the OOM kill:
    [24445.896298] alloc_memory_2 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
    [24445.896317] CPU: 7 PID: 343405 Comm: alloc_memory_2 Not tainted 5.13.0-40-generic #45~20.04.1-Ubuntu
    [24445.896320] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/29/2019
    [24445.896323] Call Trace:
    [24445.896326]  <TASK>
    [24445.896333]  dump_stack+0x7d/0x9c
    [24445.896342]  dump_header+0x4f/0x1f6
    [24445.896346]  oom_kill_process.cold+0xb/0x10
    [24445.896349]  out_of_memory+0x1cf/0x520
    [24445.896355]  mem_cgroup_out_of_memory+0xe4/0x100
    [24445.896359]  try_charge+0x708/0x760
    [24445.896362]  ? __alloc_pages+0x17b/0x320
    [24445.896367]  __mem_cgroup_charge+0x3e/0xb0
    [24445.896369]  mem_cgroup_charge+0x32/0x90
    [24445.896371]  do_anonymous_page+0x111/0x3b0
    [24445.896375]  __handle_mm_fault+0x8a4/0x8e0
    [24445.896377]  handle_mm_fault+0xda/0x2b0
    [24445.896379]  do_user_addr_fault+0x1bb/0x650
    [24445.896383]  exc_page_fault+0x7d/0x170
    [24445.896387]  ? asm_exc_page_fault+0x8/0x30
    [24445.896391]  asm_exc_page_fault+0x1e/0x30
    [24445.896395] RIP: 0033:0x7fb227245b0b
    [24445.896400] Code: 47 20 c5 fe 7f 44 17 c0 c5 fe 7f 47 40 c5 fe 7f 44 17 a0 c5 fe 7f 47 60 c5 fe 7f 44 17 80 48 01 fa 48 83 e2 80 48 39 d1 74 ba <c5> fd 7f 01 c5 fd 7f 41 20 c5 fd 7f 41 40 c5 fd 7f 41 60 48 81 c1
    [24445.896402] RSP: 002b:00007ffd5e199d78 EFLAGS: 00010206
    [24445.896405] RAX: 00007fb2248b6010 RBX: 000055b6e053d1c0 RCX: 00007fb225120000
    [24445.896407] RDX: 00007fb2252b6000 RSI: 0000000000000000 RDI: 00007fb2248b6010
    [24445.896408] RBP: 00007ffd5e199d90 R08: 00007fb2248b6010 R09: 0000000000000000
    [24445.896409] R10: 0000000000000022 R11: 0000000000000246 R12: 000055b6e053d0a0
    [24445.896410] R13: 00007ffd5e199e80 R14: 0000000000000000 R15: 0000000000000000
    [24445.896413]  </TASK>
    [24445.896414] memory: usage 81920kB, limit 81920kB, failcnt 78
    [24445.896416] memory+swap: usage 81920kB, limit 9007199254740988kB, failcnt 0
    [24445.896417] kmem: usage 544kB, limit 9007199254740988kB, failcnt 0
    [24445.896419] Memory cgroup stats for /oom_test:
    [24445.896482] anon 83329024
                   file 0
                   kernel_stack 65536
                   pagetables 315392
                   percpu 0
                   sock 0
                   shmem 0
                   file_mapped 0
                   file_dirty 0
                   file_writeback 0
                   swapcached 0
                   anon_thp 0
                   file_thp 0
                   shmem_thp 0
                   inactive_anon 83230720
                   active_anon 20480
                   inactive_file 0
                   active_file 0
                   unevictable 0
                   slab_reclaimable 0
                   slab_unreclaimable 91824
                   slab 91824
                   workingset_refault_anon 0
                   workingset_refault_file 0
                   workingset_activate_anon 0
                   workingset_activate_file 0
                   workingset_restore_anon 0
                   workingset_restore_file 0
                   workingset_nodereclaim 0
                   pgfault 53271
                   pgmajfault 0
                   pgrefill 0
                   pgscan 0
                   pgsteal 0
                   pgactivate 3
                   pgdeactivate 0
                   pglazyfree 0
                   pglazyfreed 0
                   thp_fault_alloc 0
                   thp_collapse_alloc 0
    [24445.896491] Tasks state (memory values in pages):
    [24445.896492] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
    [24445.896494] [ 332334]     0 332334     2755     1352    61440        0             0 bash
    [24445.896498] [ 336721]     0 336721     2753     1309    65536        0             0 bash
    [24445.896500] [ 336822]     0 336822     2753     1350    53248        0             0 bash
    [24445.896502] [ 343342]     0 343342    10868    10511   126976        0             0 alloc_memory_1
    [24445.896504] [ 343405]     0 343405    10868    10128   122880        0             0 alloc_memory_2
    [24445.896507] [ 343519]     0 343519     2408      902    57344        0             0 print_counter.s
    [24445.896509] [ 347570]     0 347570     2021      145    49152        0             0 sleep
    [24445.896511] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/oom_test,task_memcg=/oom_test,task=alloc_memory_1,pid=343342,uid=0
    [24445.896549] Memory cgroup out of memory: Killed process 343342 (alloc_memory_1) total-vm:43472kB, anon-rss:40932kB, file-rss:1112kB, shmem-rss:0kB, UID:0 pgtables:124kB oom_score_adj:0
    [24445.897661] oom_reaper: reaped process 343342 (alloc_memory_1), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
    
    • Inspect the memory.oom_control node
    root@ubuntu:/sys/fs/cgroup/memory/oom_test# cat memory.oom_control
    oom_kill_disable 0  # the OOM kill feature of this control group is enabled
    under_oom 0         # counts whether the group is under OOM; the kernel updates it via mem_cgroup_mark_under_oom and mem_cgroup_unmark_under_oom
    oom_kill 1          # one process has been killed so far
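
    These three fields are easy to consume from a script; a small illustrative parser (the helper name is mine, not a kernel interface):

```python
def parse_oom_control(text):
    """Parse the `key value` lines of cgroup v1 memory.oom_control."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = int(value)
    return fields

sample = "oom_kill_disable 0\nunder_oom 0\noom_kill 1\n"
print(parse_oom_control(sample))  # {'oom_kill_disable': 0, 'under_oom': 0, 'oom_kill': 1}
```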
    
    • Disable the OOM feature of the oom_test control group
      Run the following command to disable OOM for oom_test:
    root@ubuntu:/sys/fs/cgroup/memory/oom_test# echo 1 > memory.oom_control
    root@ubuntu:/sys/fs/cgroup/memory/oom_test# cat memory.oom_control
    oom_kill_disable 1
    under_oom 0
    oom_kill 1
    

    Now restart alloc_memory_1 in terminal 1, keep pressing Enter there, and watch the control group's memory usage and the state of the alloc_memory_1 process.

    The result:
    Once the group's memory usage reaches the limit it stops growing, and alloc_memory_1 stops responding. ps shows that alloc_memory_1 has entered the D state, i.e. uninterruptible sleep.

    root@ubuntu:/home/pengdl/work/oom_test# ps -aux | grep alloc_me
    root      343405  0.0  0.2  43472 42096 pts/4    S+   00:36   0:00 ./alloc_memory_2
    root      369713  0.0  0.2  43472 41252 pts/2    D+   01:12   0:00 ./alloc_memory_1
    

    Now look at memory.oom_control:

    root@ubuntu:/sys/fs/cgroup/memory/oom_test# cat memory.oom_control
    oom_kill_disable 1
    under_oom 1
    oom_kill 1
    

    under_oom is 1 because alloc_memory_1 is blocked in the middle of OOM handling; the OOM has not been resolved yet.

    Kernel code path:

    handle_mm_fault
    	-> __set_current_state(TASK_RUNNING)
    	-> __handle_mm_fault(vma, address, flags)
    		-> handle_pte_fault
    			-> do_anonymous_page(vmf)
    				-> alloc_zeroed_user_highpage_movable
    				-> mem_cgroup_charge
    					-> __mem_cgroup_charge
    						-> try_charge
    							-> try_charge_memcg
    								-> mem_cgroup_oom
    									-> if memcg->oom_kill_disable is 1:
    										if (memcg->oom_kill_disable) {
    												if (!current->in_user_fault)
    													return OOM_SKIPPED;
    												css_get(&memcg->css);
    												current->memcg_in_oom = memcg;
    												current->memcg_oom_gfp_mask = mask;
    												current->memcg_oom_order = order;
    												return OOM_ASYNC;
    											}
    								-> return -ENOMEM;
    				-> return VM_FAULT_OOM
    	-> mem_cgroup_oom_synchronize
    		-> prepare_to_wait(&memcg_oom_waitq, &owait.wait, TASK_KILLABLE)
    

    mem_cgroup_oom_synchronize puts the current process into the TASK_KILLABLE state, a "medium" sleep: the combination (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE). It is not quite a full D state, because it can still respond to fatal signals.
    So at this point alloc_memory_1 can still be killed from terminal 1 with Ctrl-C.
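
    The state bits involved can be spelled out numerically (values as defined in include/linux/sched.h for Linux 5.14):

```python
# Task-state bits from include/linux/sched.h (Linux 5.14).
TASK_INTERRUPTIBLE   = 0x0001  # 'S' in ps
TASK_UNINTERRUPTIBLE = 0x0002  # 'D' in ps
TASK_WAKEKILL        = 0x0100  # wake the task on fatal signals only

# TASK_KILLABLE: sleeps like 'D' (ps still reports D), but a fatal
# signal such as SIGKILL or an unhandled SIGINT can still wake it.
TASK_KILLABLE = TASK_WAKEKILL | TASK_UNINTERRUPTIBLE
print(hex(TASK_KILLABLE))  # 0x102
```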

    • Continuing from the previous step, re-enable the OOM feature of oom_test
      Run the following commands to re-enable OOM for oom_test and watch what happens. Note that alloc_memory_1 is in the D state at this point:
    root@ubuntu:/sys/fs/cgroup/memory/oom_test# echo 0 > memory.oom_control
    root@ubuntu:/sys/fs/cgroup/memory/oom_test# cat memory.oom_control
    oom_kill_disable 0
    under_oom 0
    oom_kill 2
    

    As soon as OOM is re-enabled, the alloc_memory_2 process in terminal 2 is killed, under_oom in memory.oom_control drops back to 0, meaning the OOM handling has finished, and the alloc_memory_1 process in terminal 1 returns to the S state:

    root@ubuntu:/home/pengdl/work/oom_test# ps -aux | grep alloc_me
    root      369713  0.0  0.3  63960 62632 pts/2    S+   01:12   0:00 ./alloc_memory_1
    

    The reason: re-enabling OOM wakes up the processes sleeping on memcg_oom_waitq. When alloc_memory_1 wakes up and resumes, the previous page fault was never completed, so the PTE is still not a valid mapping and the same page
    fault fires again. Since OOM is now enabled, the OOM killer picks alloc_memory_2 and kills it to free memory.

    mem_cgroup_oom_control_write
    	-> memcg->oom_kill_disable = val
    	-> memcg_oom_recover
    		-> __wake_up(&memcg_oom_waitq, TASK_NORMAL, 0, memcg)
    
    Kernel log of alloc_memory_2 being killed:
    [27661.346219] alloc_memory_1 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
    [27661.346286] CPU: 2 PID: 369713 Comm: alloc_memory_1 Not tainted 5.13.0-40-generic #45~20.04.1-Ubuntu
    [27661.346289] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/29/2019
    [27661.346291] Call Trace:
    [27661.346294]  <TASK>
    [27661.346297]  dump_stack+0x7d/0x9c
    [27661.346303]  dump_header+0x4f/0x1f6
    [27661.346307]  oom_kill_process.cold+0xb/0x10
    [27661.346310]  out_of_memory+0x1cf/0x520
    [27661.346314]  mem_cgroup_out_of_memory+0xe4/0x100
    [27661.346318]  try_charge+0x708/0x760
    [27661.346320]  ? __alloc_pages+0x17b/0x320
    [27661.346323]  __mem_cgroup_charge+0x3e/0xb0
    [27661.346325]  mem_cgroup_charge+0x32/0x90
    [27661.346327]  do_anonymous_page+0x111/0x3b0
    [27661.346330]  ? psi_task_switch+0x121/0x260
    [27661.346334]  __handle_mm_fault+0x8a4/0x8e0
    [27661.346336]  handle_mm_fault+0xda/0x2b0
    [27661.346338]  do_user_addr_fault+0x1bb/0x650
    [27661.346342]  exc_page_fault+0x7d/0x170
    [27661.346344]  ? asm_exc_page_fault+0x8/0x30
    [27661.346348]  asm_exc_page_fault+0x1e/0x30
    [27661.346350] RIP: 0033:0x7f1e90080b0b
    [27661.346353] Code: 47 20 c5 fe 7f 44 17 c0 c5 fe 7f 47 40 c5 fe 7f 44 17 a0 c5 fe 7f 47 60 c5 fe 7f 44 17 80 48 01 fa 48 83 e2 80 48 39 d1 74 ba <c5> fd 7f 01 c5 fd 7f 41 20 c5 fd 7f 41 40 c5 fd 7f 41 60 48 81 c1
    [27661.346355] RSP: 002b:00007ffc5ff69658 EFLAGS: 00010206
    [27661.346357] RAX: 00007f1e8d6f1010 RBX: 000055c44f5c01c0 RCX: 00007f1e8e013000
    [27661.346359] RDX: 00007f1e8e0f1000 RSI: 0000000000000000 RDI: 00007f1e8d6f1010
    [27661.346360] RBP: 00007ffc5ff69670 R08: 00007f1e8d6f1010 R09: 0000000000000000
    [27661.346361] R10: 0000000000000022 R11: 0000000000000246 R12: 000055c44f5c00a0
    [27661.346362] R13: 00007ffc5ff69760 R14: 0000000000000000 R15: 0000000000000000
    [27661.346365]  </TASK>
    [27661.346366] memory: usage 81920kB, limit 81920kB, failcnt 354
    [27661.346367] memory+swap: usage 81920kB, limit 9007199254740988kB, failcnt 0
    [27661.346369] kmem: usage 336kB, limit 9007199254740988kB, failcnt 0
    [27661.346370] Memory cgroup stats for /oom_test:
    [27661.346408] anon 83542016
                   file 0
                   kernel_stack 32768
                   pagetables 229376
                   percpu 0
                   sock 0
                   shmem 0
                   file_mapped 0
                   file_dirty 0
                   file_writeback 0
                   swapcached 0
                   anon_thp 0
                   file_thp 0
                   shmem_thp 0
                   inactive_anon 83529728
                   active_anon 12288
                   inactive_file 0
                   active_file 0
                   unevictable 0
                   slab_reclaimable 0
                   slab_unreclaimable 42736
                   slab 42736
                   workingset_refault_anon 0
                   workingset_refault_file 0
                   workingset_activate_anon 0
                   workingset_activate_file 0
                   workingset_restore_anon 0
                   workingset_restore_file 0
                   workingset_nodereclaim 0
                   pgfault 144268
                   pgmajfault 0
                   pgrefill 0
                   pgscan 0
                   pgsteal 0
                   pgactivate 5
                   pgdeactivate 0
                   pglazyfree 0
                   pglazyfreed 0
                   thp_fault_alloc 0
                   thp_collapse_alloc 0
    [27661.346414] Tasks state (memory values in pages):
    [27661.346415] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
    [27661.346417] [ 332334]     0 332334     2755     1352    61440        0             0 bash
    [27661.346421] [ 336721]     0 336721     2753     1309    65536        0             0 bash
    [27661.346424] [ 336822]     0 336822     2753     1350    53248        0             0 bash
    [27661.346426] [ 343405]     0 343405    10868    10524   122880        0             0 alloc_memory_2
    [27661.346428] [ 369713]     0 369713    10868    10313   126976        0             0 alloc_memory_1
    [27661.346430] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/oom_test,task_memcg=/oom_test,task=alloc_memory_2,pid=343405,uid=0
    [27661.346452] Memory cgroup out of memory: Killed process 343405 (alloc_memory_2) total-vm:43472kB, anon-rss:40932kB, file-rss:1164kB, shmem-rss:0kB, UID:0 pgtables:120kB oom_score_adj:0
    

    cgroup v2

    cgroup v2 is also mounted under /sys/fs/cgroup/ by default. The files used here:

    cgroup.procs     # move a process into the control group
    memory.max       # set the control group's memory limit
    memory.current   # read the control group's memory consumption
    memory.oom.group # on OOM, whether to kill all processes in the control group
    

    cgroup v2 removes the file that toggles the OOM killer, so OOM is always enabled. It adds a new file, memory.oom.group, which controls whether every process in the control group is killed when an OOM occurs.
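
    Configuring this amounts to two writes under the group directory; a sketch (the helper names and the /sys/fs/cgroup/oom_test path are illustrative assumptions, not from the article):

```python
import os

def oom_group_writes(cg_dir, limit="80M", group_kill=True):
    """Build the (file, value) pairs that configure a cgroup v2 group
    for group-wide OOM kill; kept separate from the I/O for testability."""
    return [
        (os.path.join(cg_dir, "memory.max"), limit),
        (os.path.join(cg_dir, "memory.oom.group"), "1" if group_kill else "0"),
    ]

def apply_writes(writes):
    # Actually running this needs root and a mounted cgroup2 filesystem.
    for path, value in writes:
        with open(path, "w") as f:
            f.write(value)

for path, value in oom_group_writes("/sys/fs/cgroup/oom_test"):
    print(path, "<-", value)
```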

    • Create the oom_test control group

    • Repeat the same steps as before: disable swap globally, open three terminals, move the terminal processes into the oom_test control group, then run alloc_memory_1, alloc_memory_2, and print_counter.sh

    • Set the memory limit

    root@ubuntu2204:/sys/fs/cgroup/oom_test# echo 80M > memory.max
    root@ubuntu2204:/sys/fs/cgroup/oom_test# cat memory.max
    83886080
    
    • Alternately press Enter in terminals 1 and 2
      As before, pressing Enter in terminal 2 pushes the usage over max and triggers memory reclaim. With no swap and not enough page cache, the kernel falls back to OOM and picks the alloc_memory_1 process in terminal 1;
      after it is killed, the group's memory usage drops and the processes in terminals 2 and 3 keep running.
    Kernel log of alloc_memory_1 being killed:
    [ 7950.851327] alloc_memory_2 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
    [ 7950.851338] CPU: 2 PID: 470201 Comm: alloc_memory_2 Tainted: P           O      5.15.0-25-generic #25-Ubuntu
    [ 7950.851342] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/29/2019
    [ 7950.851344] Call Trace:
    [ 7950.851385]  <TASK>
    [ 7950.851408]  show_stack+0x52/0x58
    [ 7950.851436]  dump_stack_lvl+0x4a/0x5f
    [ 7950.851444]  dump_stack+0x10/0x12
    [ 7950.851446]  dump_header+0x53/0x224
    [ 7950.851451]  oom_kill_process.cold+0xb/0x10
    [ 7950.851454]  out_of_memory+0x106/0x2e0
    [ 7950.851483]  mem_cgroup_out_of_memory+0x13b/0x160
    [ 7950.851509]  try_charge_memcg+0x68a/0x740
    [ 7950.851512]  charge_memcg+0x45/0xb0
    [ 7950.851514]  __mem_cgroup_charge+0x2d/0x80
    [ 7950.851516]  do_anonymous_page+0x110/0x3b0
    [ 7950.851519]  handle_pte_fault+0x1fe/0x230
    [ 7950.851522]  __handle_mm_fault+0x3c7/0x700
    [ 7950.851525]  handle_mm_fault+0xd8/0x2c0
    [ 7950.851528]  do_user_addr_fault+0x1c5/0x670
    [ 7950.851532]  exc_page_fault+0x77/0x160
    [ 7950.851535]  ? asm_exc_page_fault+0x8/0x30
    [ 7950.851539]  asm_exc_page_fault+0x1e/0x30
    [ 7950.851541] RIP: 0033:0x7fb5c441c180
    [ 7950.851566] Code: 81 fa 80 00 00 00 76 d2 c5 fe 7f 40 40 c5 fe 7f 40 60 48 83 c7 80 48 81 fa 00 01 00 00 76 2b 48 8d 90 80 00 00 00 48 83 e2 c0 <c5> fd 7f 02 c5 fd 7f 42 20 c5 fd 7f 42 40 c5 fd 7f 42 60 48 83 ea
    [ 7950.851569] RSP: 002b:00007ffcc9e212f8 EFLAGS: 00010283
    [ 7950.851572] RAX: 00007fb5c1a74010 RBX: 0000000000000000 RCX: 00007fb5c4399bd7
    [ 7950.851574] RDX: 00007fb5c22d6000 RSI: 0000000000000000 RDI: 00007fb5c2473f90
    [ 7950.851575] RBP: 00007ffcc9e21310 R08: 00007fb5c1a74010 R09: 00007fb5c1a74010
    [ 7950.851576] R10: 0000000000000022 R11: 0000000000000246 R12: 00007ffcc9e21428
    [ 7950.851577] R13: 000055a56aa69189 R14: 0000000000000000 R15: 00007fb5c44ed040
    [ 7950.851581]  </TASK>
    [ 7950.851582] memory: usage 81920kB, limit 81920kB, failcnt 26
    [ 7950.851584] swap: usage 0kB, limit 9007199254740988kB, failcnt 0
    [ 7950.851586] Memory cgroup stats for /oom_test:
    [ 7950.851671] anon 83296256
                   file 0
                   kernel_stack 65536
                   pagetables 352256
                   percpu 0
                   sock 0
                   shmem 0
                   file_mapped 0
                   file_dirty 0
                   file_writeback 0
                   swapcached 0
                   anon_thp 0
                   file_thp 0
                   shmem_thp 0
                   inactive_anon 83263488
                   active_anon 16384
                   inactive_file 0
                   active_file 0
                   unevictable 0
                   slab_reclaimable 0
                   slab_unreclaimable 89112
                   slab 89112
                   workingset_refault_anon 0
                   workingset_refault_file 0
                   workingset_activate_anon 0
                   workingset_activate_file 0
                   workingset_restore_anon 0
                   workingset_restore_file 0
                   workingset_nodereclaim 0
                   pgfault 36929
                   pgmajfault 0
                   pgrefill 0
                   pgscan 0
                   pgsteal 0
                   pgactivate 3
                   pgdeactivate 0
                   pglazyfree 0
                   pglazyfreed 0
                   thp_fault_alloc 0
                   thp_collapse_alloc 0
    [ 7950.851676] Tasks state (memory values in pages):
    [ 7950.851677] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
    [ 7950.851678] [ 458753]     0 458753     5020     1481    73728        0             0 bash
    [ 7950.851683] [ 466556]     0 466556     5020     1490    65536        0             0 bash
    [ 7950.851686] [ 466666]     0 466666     5020     1477    69632        0             0 bash
    [ 7950.851688] [ 470004]     0 470004     4689     1074    69632        0             0 print_counter.s
    [ 7950.851691] [ 470201]     0 470201    10937    10162   122880        0             0 alloc_memory_2
    [ 7950.851693] [ 470338]     0 470338    10937    10533   126976        0             0 alloc_memory_1
    [ 7950.851696] [ 472314]     0 472314     4256      257    69632        0             0 sleep
    [ 7950.851698] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/oom_test,task_memcg=/oom_test,task=alloc_memory_1,pid=470338,uid=0
    [ 7950.851712] Memory cgroup out of memory: Killed process 470338 (alloc_memory_1) total-vm:43748kB, anon-rss:40980kB, file-rss:1152kB, shmem-rss:0kB, UID:0 pgtables:124kB oom_score_adj:0
    [ 7950.853548] oom_reaper: reaped process 470338 (alloc_memory_1), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
    
     
    • Enable oom.group and check whether all processes in the control group get killed

    Restart alloc_memory_1 in terminal 1, then run the following commands to enable oom.group:

    root@ubuntu2204:/sys/fs/cgroup/oom_test# cat memory.oom.group
    0
    root@ubuntu2204:/sys/fs/cgroup/oom_test# echo 1 > memory.oom.group
    root@ubuntu2204:/sys/fs/cgroup/oom_test# cat memory.oom.group
    1
    
    • Keep pressing Enter in terminal 1 and watch what happens

    Once the group's memory usage exceeds max, every process in the control group is killed:

    root@ubuntu2204:/sys/fs/cgroup/oom_test# cat cgroup.procs
    
    

    Not a single process is left in this control group. Now check the kernel log:

    Kernel log of all processes in the control group being killed:
    [ 8471.032495] alloc_memory_1 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
    [ 8471.032504] CPU: 2 PID: 479953 Comm: alloc_memory_1 Tainted: P           O      5.15.0-25-generic #25-Ubuntu
    [ 8471.032508] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/29/2019
    [ 8471.032510] Call Trace:
    [ 8471.032512]  <TASK>
    [ 8471.032515]  show_stack+0x52/0x58
    [ 8471.032523]  dump_stack_lvl+0x4a/0x5f
    [ 8471.032527]  dump_stack+0x10/0x12
    [ 8471.032530]  dump_header+0x53/0x224
    [ 8471.032534]  oom_kill_process.cold+0xb/0x10
    [ 8471.032538]  out_of_memory+0x106/0x2e0
    [ 8471.032543]  mem_cgroup_out_of_memory+0x13b/0x160
    [ 8471.032548]  try_charge_memcg+0x68a/0x740
    [ 8471.032551]  charge_memcg+0x45/0xb0
    [ 8471.032553]  __mem_cgroup_charge+0x2d/0x80
    [ 8471.032555]  do_anonymous_page+0x110/0x3b0
    [ 8471.032558]  handle_pte_fault+0x1fe/0x230
    [ 8471.032561]  __handle_mm_fault+0x3c7/0x700
    [ 8471.032564]  handle_mm_fault+0xd8/0x2c0
    [ 8471.032567]  do_user_addr_fault+0x1c5/0x670
    [ 8471.032570]  ? do_syscall_64+0x69/0xc0
    [ 8471.032575]  exc_page_fault+0x77/0x160
    [ 8471.032577]  ? asm_exc_page_fault+0x8/0x30
    [ 8471.032580]  asm_exc_page_fault+0x1e/0x30
    [ 8471.032583] RIP: 0033:0x7fbcda9f2180
    [ 8471.032615] Code: 81 fa 80 00 00 00 76 d2 c5 fe 7f 40 40 c5 fe 7f 40 60 48 83 c7 80 48 81 fa 00 01 00 00 76 2b 48 8d 90 80 00 00 00 48 83 e2 c0 <c5> fd 7f 02 c5 fd 7f 42 20 c5 fd 7f 42 40 c5 fd 7f 42 60 48 83 ea
    [ 8471.032617] RSP: 002b:00007ffced1f5408 EFLAGS: 00010283
    [ 8471.032620] RAX: 00007fbcd804a010 RBX: 0000000000000000 RCX: 00007fbcda96fbd7
    [ 8471.032622] RDX: 00007fbcd88a8000 RSI: 0000000000000000 RDI: 00007fbcd8a49f90
    [ 8471.032623] RBP: 00007ffced1f5420 R08: 00007fbcd804a010 R09: 00007fbcd804a010
    [ 8471.032624] R10: 0000000000000022 R11: 0000000000000246 R12: 00007ffced1f5538
    [ 8471.032626] R13: 00005649ef725189 R14: 0000000000000000 R15: 00007fbcdaac3040
    [ 8471.032629]  </TASK>
    [ 8471.032676] memory: usage 81920kB, limit 81920kB, failcnt 232
    [ 8471.032679] swap: usage 0kB, limit 9007199254740988kB, failcnt 0
    [ 8471.032681] Memory cgroup stats for /oom_test:
    [ 8471.032696] anon 83308544
                   file 0
                   kernel_stack 65536
                   pagetables 339968
                   percpu 0
                   sock 0
                   shmem 0
                   file_mapped 0
                   file_dirty 0
                   file_writeback 0
                   swapcached 0
                   anon_thp 0
                   file_thp 0
                   shmem_thp 0
                   inactive_anon 83243008
                   active_anon 16384
                   inactive_file 0
                   active_file 0
                   unevictable 0
                   slab_reclaimable 0
                   slab_unreclaimable 89112
                   slab 89112
                   workingset_refault_anon 0
                   workingset_refault_file 0
                   workingset_activate_anon 0
                   workingset_activate_file 0
                   workingset_restore_anon 0
                   workingset_restore_file 0
                   workingset_nodereclaim 0
                   pgfault 113787
                   pgmajfault 0
                   pgrefill 0
                   pgscan 0
                   pgsteal 0
                   pgactivate 5
                   pgdeactivate 0
                   pglazyfree 0
                   pglazyfreed 0
                   thp_fault_alloc 0
                   thp_collapse_alloc 0
    [ 8471.032700] Tasks state (memory values in pages):
    [ 8471.032701] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
    [ 8471.032702] [ 458753]     0 458753     5020     1483    73728        0             0 bash
    [ 8471.032707] [ 466556]     0 466556     5020     1490    65536        0             0 bash
    [ 8471.032710] [ 466666]     0 466666     5020     1477    69632        0             0 bash
    [ 8471.032712] [ 470004]     0 470004     4689     1074    69632        0             0 print_counter.s
    [ 8471.032715] [ 470201]     0 470201    10937    10558   122880        0             0 alloc_memory_2
    [ 8471.032718] [ 479953]     0 479953    10937    10171   122880        0             0 alloc_memory_1
    [ 8471.032720] [ 480081]     0 480081     4256      252    61440        0             0 sleep
    [ 8471.032723] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/oom_test,task_memcg=/oom_test,task=alloc_memory_2,pid=470201,uid=0
    [ 8471.032751] Memory cgroup out of memory: Killed process 470201 (alloc_memory_2) total-vm:43748kB, anon-rss:40976kB, file-rss:1256kB, shmem-rss:0kB, UID:0 pgtables:120kB oom_score_adj:0
    [ 8471.032837] Tasks in /oom_test are going to be killed due to memory.oom.group set
    [ 8471.032856] Memory cgroup out of memory: Killed process 458753 (bash) total-vm:20080kB, anon-rss:1836kB, file-rss:4096kB, shmem-rss:0kB, UID:0 pgtables:72kB oom_score_adj:0
    [ 8471.032891] Memory cgroup out of memory: Killed process 466556 (bash) total-vm:20080kB, anon-rss:1832kB, file-rss:4128kB, shmem-rss:0kB, UID:0 pgtables:64kB oom_score_adj:0
    [ 8471.032914] Memory cgroup out of memory: Killed process 466666 (bash) total-vm:20080kB, anon-rss:1824kB, file-rss:4084kB, shmem-rss:0kB, UID:0 pgtables:68kB oom_score_adj:0
    [ 8471.032939] Memory cgroup out of memory: Killed process 470004 (print_counter.s) total-vm:18756kB, anon-rss:476kB, file-rss:3820kB, shmem-rss:0kB, UID:0 pgtables:68kB oom_score_adj:0
    [ 8471.032974] Memory cgroup out of memory: Killed process 470201 (alloc_memory_2) total-vm:43748kB, anon-rss:40976kB, file-rss:1256kB, shmem-rss:0kB, UID:0 pgtables:120kB oom_score_adj:0
    [ 8471.032992] Memory cgroup out of memory: Killed process 479953 (alloc_memory_1) total-vm:43748kB, anon-rss:39392kB, file-rss:1292kB, shmem-rss:0kB, UID:0 pgtables:120kB oom_score_adj:0
    [ 8471.033016] Memory cgroup out of memory: Killed process 480081 (sleep) total-vm:17024kB, anon-rss:88kB, file-rss:920kB, shmem-rss:0kB, UID:0 pgtables:60kB oom_score_adj:0
    [ 8471.033243] oom_reaper: reaped process 480081 (sleep), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
    [ 8471.037155] oom_reaper: reaped process 479953 (alloc_memory_1), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
    [ 8471.037371] oom_reaper: reaped process 470201 (alloc_memory_2), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
    

    Code path:

    out_of_memory
    	-> select_bad_process
    	-> oom_kill_process
    		-> mem_cgroup_get_oom_group
    		-> __oom_kill_process
    		-> mem_cgroup_scan_tasks
    

    Done.

  • Original article: https://www.cnblogs.com/pengdonglin137/p/16213042.html