• Troubleshooting database access failures caused by the Linux OOM Killer


    Access to the database on the server was failing. /var/log/messages showed the following:

    Sep 22 16:08:21 safeserver kernel: java invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
    Sep 22 16:08:21 safeserver kernel: java cpuset=/ mems_allowed=0
    Sep 22 16:08:21 safeserver kernel: Pid: 14859, comm: java Not tainted 2.6.32-754.30.2.el6.x86_64 #1

    How does the OOM Killer work? How can it be configured so that this does not happen? And how do you investigate memory usage on Linux?

    First, look at the memory:
    $ free
                     total       used       free     shared    buffers     cached
    Mem:           4040360    4012200      28160          0     176628    3571348
    -/+ buffers/cache:         264224    3776136
    Swap:                0          0          0

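    Note that the free value of 28160 on the Mem: line is not the memory that is really free. The figures that matter are on the -/+ buffers/cache line, as explained below: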
    In this example the total amount of available memory is 4040360 KB. 264224 KB are used by processes and 3776136 KB are free for other applications. Do not get confused by the first line which shows that 28160KB are free! If you look at the usage figures you can see that most of the memory use is for buffers and cache. Linux always tries to use RAM to speed up disk operations by using available memory for buffers (file system metadata) and cache (pages with actual contents of files or block devices). This helps the system to run faster because disk information is already in memory which saves I/O operations. If space is needed by programs or applications like Oracle, then Linux will free up the buffers and cache to yield memory for the applications. If your system runs for a while you will usually see a small number under the field "free" on the first line.
    --from redhat
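
The arithmetic behind the -/+ buffers/cache line can be sketched as follows. This is a minimal illustration, assuming input in the format of /proc/meminfo; the function name real_free_kb is made up for this example and is not part of any tool.

```shell
# real_free_kb (hypothetical helper): read /proc/meminfo-style text on stdin
# and print MemFree + Buffers + Cached in kB. This is the "free" figure that
# older versions of free show on the "-/+ buffers/cache" line.
real_free_kb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { sum += $2 }
         END { printf "%.0f\n", sum }'
}
```

On a live system it would be called as `real_free_kb < /proc/meminfo`. With the numbers from the free output above (28160 + 176628 + 3571348) it yields 3776136 kB.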

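    It turned out that the server had no swap configured, which caused the OOM killer to fire frequently.

    So how do you check the swap configuration?

    Check whether swap is enabled: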
    cat /proc/swaps
    grep Swap /proc/meminfo
    swapon -s
    free -m
    vmstat
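
For use in a monitoring script, the same check could be reduced to a single verdict. The sketch below assumes /proc/meminfo-style input; swap_total_kb and swap_status are names chosen for this example only.

```shell
# swap_total_kb (hypothetical helper): read /proc/meminfo-style text on stdin
# and print the SwapTotal value in kB, or 0 if the field is missing.
swap_total_kb() {
    awk '$1 == "SwapTotal:" { print $2; found = 1 }
         END { if (!found) print 0 }'
}

# swap_status (hypothetical helper): print a one-line verdict for the same input.
swap_status() {
    total=$(swap_total_kb)
    if [ "$total" -gt 0 ]; then
        echo "swap enabled: ${total} kB"
    else
        echo "swap disabled"
    fi
}
```

Typical use would be `swap_status < /proc/meminfo`; on the server described here it would have reported that swap is disabled.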

    How large should swap be?

    https://access.redhat.com/solutions/15244

    For RHEL 6 and 7 the general recommendation is to make swap the same size as RAM (4 to 8 GB); see the link above for details.

    Enabling swap:

    Swap can be backed by a logical volume or by a file. The file approach is used below.

    [root@safedemo bin]# dd if=/dev/zero of=/swapfile bs=1G count=4
    4+0 records in
    4+0 records out
    4294967296 bytes (4.3 GB) copied, 37.4051 s, 115 MB/s
    [root@safedemo bin]# chmod 600 /swapfile
    [root@safedemo bin]# mkswap /swapfile
    mkswap: /swapfile: warning: don't erase bootbits sectors
            on whole disk. Use -f to force.
    Setting up swapspace version 1, size = 4194300 KiB
    no label, UUID=96e8b638-b36c-4660-8667-5654a92dc520
    [root@safedemo bin]# swapon /swapfile
    [root@safedemo bin]# vi /etc/fstab
    # add this line so the swap file is activated at boot:
    /swapfile    swap    swap   defaults 0 0
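
The steps above can be collected into a small dry-run helper that only prints the commands for a given path and size, so they can be reviewed before being run as root. This is a sketch: swapfile_commands is a hypothetical name, and it uses bs=1M rather than bs=1G, an assumption made here because dd allocates a buffer of the block size, which is unwelcome on a machine that is already short of memory.

```shell
# swapfile_commands PATH SIZE_GIB (hypothetical helper): print, but do not run,
# the commands for creating and enabling a swap file of SIZE_GIB GiB at PATH.
swapfile_commands() {
    path=$1
    size_gib=$2
    # count is the file size expressed in MiB blocks
    echo "dd if=/dev/zero of=${path} bs=1M count=$((size_gib * 1024))"
    echo "chmod 600 ${path}"
    echo "mkswap ${path}"
    echo "swapon ${path}"
    # persist the swap file across reboots
    echo "echo '${path} swap swap defaults 0 0' >> /etc/fstab"
}
```

Calling `swapfile_commands /swapfile 4` prints the five commands for the 4 GiB file used in this article.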

    Here is a small example that reproduces the OOM killer:

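    import java.util.ArrayList;
    import java.util.List;
    import java.util.Scanner;

    public class OOMTest {

        // 26,107,200 ints * 4 bytes each is roughly 100 MB per allocation.
        private static final int INTS_PER_BLOCK = 26107200;

        private static final Scanner scanner = new Scanner(System.in);

        public static void main(String[] args) {
            // Keep a reference to every block so the GC cannot reclaim it.
            List<int[]> blocks = new ArrayList<int[]>();

            try {
                for (int i = 0; i < 1000; i++) {
                    System.out.println("Please press any text to allocate ~100M memory:");
                    scanner.nextLine(); // wait for Enter before each allocation
                    System.out.println("new memory(~100M)");
                    blocks.add(new int[INTS_PER_BLOCK]);
                }
            } catch (Throwable t) {
                // An OutOfMemoryError raised by the JVM itself would end up here.
                t.printStackTrace();
            }
        }

    }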

    Run it:
    [root@safedemo bin]# java -Xmx2g OOMTest
    Picked up JAVA_TOOL_OPTIONS: -Dhttps.protocols=TLSv1.2
    Please press any text to allocate ~100M memory:

    new memory(~100M)
    Please press any text to allocate ~100M memory:

    new memory(~100M)
    Please press any text to allocate ~100M memory:

    new memory(~100M)
    Please press any text to allocate ~100M memory:

    new memory(~100M)
    Please press any text to allocate ~100M memory:

    new memory(~100M)
    Please press any text to allocate ~100M memory:

    new memory(~100M)
    Please press any text to allocate ~100M memory:

    new memory(~100M)
    Killed <- the process triggered the system OOM killer itself and ended up being the one that was killed.


    //check /var/log/messages.
    Sep 22 16:08:21 safeserver kernel: java invoked oom-killer: gfp_mask=0x280da, order=0, oom_adj=0, oom_score_adj=0
    Sep 22 16:08:21 safeserver kernel: java cpuset=/ mems_allowed=0
    Sep 22 16:08:21 safeserver kernel: Pid: 14859, comm: java Not tainted 2.6.32-754.30.2.el6.x86_64 #1
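    // 14859 is the task that triggered the OOM killer (it belongs to the OOMTest run above; the process that was killed, 14857, is reported further down)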
    ....
    Sep 22 16:08:21 safeserver kernel: Out of memory: Kill process 14857 (java) score 142 or sacrifice child
    Sep 22 16:08:21 safeserver kernel: Killed process 14857, UID 0, (java) total-vm:3191104kB, anon-rss:676096kB, file-rss:68kB
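
To find every victim in a long log, the "Killed process" lines can be filtered and reduced to the interesting fields. The following is a sketch that assumes the message layout shown above; oom_kills is a hypothetical name, and process names that contain spaces are not handled.

```shell
# oom_kills (hypothetical helper): read syslog text on stdin and print
# "pid name anon-rss" for every "Killed process" line from the OOM killer.
oom_kills() {
    awk '/Killed process [0-9]/ {
        pid = ""; name = ""; rss = ""
        for (i = 1; i <= NF; i++) {
            # the field after the word "process" is the pid, maybe with a comma
            if ($i == "process") { pid = $(i + 1); sub(/,$/, "", pid) }
            # the command name is the field wrapped in parentheses
            if ($i ~ /^\(.*\)$/) { name = $i; gsub(/[()]/, "", name) }
            if ($i ~ /^anon-rss:/) { rss = $i; sub(/^anon-rss:/, "", rss); sub(/,$/, "", rss) }
        }
        print pid, name, rss
    }'
}
```

It would typically be run as `oom_kills < /var/log/messages`. For the log excerpt above it prints `14857 java 676096kB`.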

    Can the OOM killer be disabled?
    // Disabling the OOM killer on Red Hat
    Red Hat Enterprise Linux 5, 6 and 7 do not have the ability to completely disable OOM-KILLER. Please see the following section for tuning OOM-KILLER operation within RHEL 5, RHEL 6 and RHEL 7.

    The answer: it cannot be disabled completely.


    You can keep the OOM killer away from a specific process by adjusting that process's score:
    There is also a special value of -17, which disables oom_killer for that process. In the example below, oom_score returns a value of 0, indicating that this process would not be killed.

        # cat /proc/12465/oom_score
        78           
        # echo -17 > /proc/12465/oom_adj           
        # cat /proc/12465/oom_score
        0
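
To inspect the current values for a process before changing them, something like the sketch below could be used. oom_info is a hypothetical helper, and its optional second argument exists only so that it can be tried against a directory of sample files instead of the real /proc. Newer kernels prefer /proc/PID/oom_score_adj (range -1000 to 1000) over oom_adj; writing -1000 there exempts the process in the same way that -17 does for oom_adj.

```shell
# oom_info PID [PROC_ROOT] (hypothetical helper): print the OOM score and the
# adjustment value of a process. PROC_ROOT defaults to /proc.
oom_info() {
    dir="${2:-/proc}/$1"
    echo "oom_score=$(cat "$dir/oom_score")"
    # prefer the newer oom_score_adj interface when the kernel provides it
    if [ -r "$dir/oom_score_adj" ]; then
        echo "oom_score_adj=$(cat "$dir/oom_score_adj")"
    else
        echo "oom_adj=$(cat "$dir/oom_adj")"
    fi
}
```

For the process in the example above it would be called as `oom_info 12465`.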


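    You can also tune the behavior through vm.overcommit_memory. If it is set to 2, an allocation fails with an error when there is not enough memory, which indirectly keeps the OOM killer in check (the official documentation notes that the OOM killer can still be triggered in some situations).
    In /etc/sysctl.conf: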
    vm.overcommit_memory = 2
    vm.overcommit_ratio = 100
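
With vm.overcommit_memory = 2 the kernel refuses allocations beyond a commit limit of roughly swap + RAM * overcommit_ratio / 100. The sketch below shows that arithmetic; commit_limit_kb is a hypothetical helper, huge pages are ignored, and the kernel's own figure (the CommitLimit field in /proc/meminfo) may differ slightly because of rounding.

```shell
# commit_limit_kb RAM_KB SWAP_KB RATIO (hypothetical helper): print the
# approximate commit limit in kB that applies when vm.overcommit_memory = 2.
commit_limit_kb() {
    awk -v ram="$1" -v swap="$2" -v ratio="$3" \
        'BEGIN { printf "%.0f\n", swap + ram * ratio / 100 }'
}
```

For this server (4040360 kB of RAM, the 4194300 kB swap file, ratio 100), `commit_limit_kb 4040360 4194300 100` gives 8234660 kB. The amount currently committed can be compared against the limit with `grep -i commit /proc/meminfo`.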







  • Original article: https://www.cnblogs.com/bjfarmer/p/13717598.html