Using cgroups to Control Memory Usage

Hone the gems of technique, practice the way of data, pursue outstanding value

[Author: 高健 @ 博客园 (cnblogs), luckyjackgao@gmail.com]

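To start, I followed an example found online and ran a hands-on experiment.

First, a download with no memory limit in place. Memory before the download: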

[root@cent6 Desktop]# free -m
             total       used       free     shared    buffers     cached
Mem:          2006        484       1522          0         29        175
-/+ buffers/cache:        279       1727
Swap:         4031          0       4031
[root@cent6 Desktop]#

Then download a file of about 700 MB:

wget http://centos.arcticnetwork.ca/6.4/isos/x86_64/CentOS-6.4-x86_64-LiveCD.iso

Then check memory usage:

[root@cent6 Desktop]# free -m
             total       used       free     shared    buffers     cached
Mem:          2006       1224        782          0         33        878
-/+ buffers/cache:        312       1694
Swap:         4031          0       4031
[root@cent6 Desktop]#

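Indeed, more than 700 MB of memory was consumed: used rose from 484 MB to 1224 MB, and most of the increase is page cache (cached went from 175 MB to 878 MB).

Next, I rebooted and applied a memory limit: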

[root@cent6 Desktop]# service cgconfig status
Stopped


# mount -t cgroup -o memory memcg /cgroup

# mkdir /cgroup/GroupA  
# echo 10M > /cgroup/GroupA/memory.limit_in_bytes  
# echo $$ > /cgroup/GroupA/tasks
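
The kernel accepts a K, M, or G suffix when a value is written to memory.limit_in_bytes and reports the limit back in bytes. As a rough sketch of that conversion (to_bytes is a hypothetical helper written for this note, not a cgroup tool):

```shell
# Convert a size such as 10M into bytes, mirroring how
# memory.limit_in_bytes interprets the k/K, m/M, g/G suffixes.
# to_bytes is a hypothetical helper name.
to_bytes() {
    local value="$1" number suffix
    number=${value%[kKmMgG]}      # strip one trailing suffix letter, if any
    suffix=${value#"$number"}     # whatever was stripped
    case $suffix in
        k|K) echo $(( number * 1024 )) ;;
        m|M) echo $(( number * 1024 * 1024 )) ;;
        g|G) echo $(( number * 1024 * 1024 * 1024 )) ;;
        "")  echo "$number" ;;
    esac
}

to_bytes 10M    # the limit written for GroupA above
```

For the 10M limit above this prints 10485760, which is the value that reading /cgroup/GroupA/memory.limit_in_bytes should return.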

Then check the memory state again:

[root@cent6 Desktop]# free -m
             total       used       free     shared    buffers     cached
Mem:          2006        481       1525          0         29        174
-/+ buffers/cache:        276       1729
Swap:         4031          0       4031
[root@cent6 Desktop]#

Download the same ~700 MB file again:

wget http://centos.arcticnetwork.ca/6.4/isos/x86_64/CentOS-6.4-x86_64-LiveCD.iso

Compare memory usage before and after:

[root@cent6 Desktop]# free -m
             total       used       free     shared    buffers     cached
Mem:          2006        512       1494          0         32        186
-/+ buffers/cache:        293       1713
Swap:         4031          0       4031
[root@cent6 Desktop]#

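From this, memory usage grew by roughly 1525 - 1494 = 31 MB. The figure observed with free is imprecise, though. free reports system-wide memory, and over a long-running program its free value keeps drifting downward. Because the current shell is limited to at most 10 MB, the baseline is small, so the longer the run, the larger the error.

Next, let's see whether PostgreSQL can be limited effectively in the same way.

Before that, let's use the system configuration files to control memory when the postgres user runs wget: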
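
Because free gives only a system-wide view, the memory controller's own counters are a more direct measure of what the group used. A minimal sketch, assuming the cgroup v1 files memory.usage_in_bytes, memory.max_usage_in_bytes and memory.failcnt under the /cgroup/GroupA directory created earlier (cg_mib is a hypothetical helper name):

```shell
# Print a cgroup v1 memory counter in MiB (integer division).
# $1 is the group directory, $2 the counter file name.
# cg_mib is a hypothetical helper name.
cg_mib() {
    local bytes
    bytes=$(cat "$1/$2") || return 1
    echo $(( bytes / 1024 / 1024 ))
}

group=/cgroup/GroupA    # group directory from the experiment above
if [ -r "$group/memory.usage_in_bytes" ]; then
    echo "current:    $(cg_mib "$group" memory.usage_in_bytes) MiB"
    echo "peak:       $(cg_mib "$group" memory.max_usage_in_bytes) MiB"
    echo "limit hits: $(cat "$group/memory.failcnt")"
fi
```

On a machine without that group the block prints nothing. Where the group exists, the peak value should stay at or below the configured limit, while failcnt counts how often the limit was reached.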

[postgres@cent6 Desktop]$ cat /etc/cgconfig.conf
#
#  Copyright IBM Corporation. 2007
#
#  Authors:    Balbir Singh <balbir@linux.vnet.ibm.com>
#  This program is free software; you can redistribute it and/or modify it
#  under the terms of version 2.1 of the GNU Lesser General Public License
#  as published by the Free Software Foundation.
#
#  This program is distributed in the hope that it would be useful, but
#  WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See man cgconfig.conf for further details.
#
# By default, mount all controllers to /cgroup/<controller>

mount {
    cpuset    = /cgroup/cpuset;
    cpu    = /cgroup/cpu;
    cpuacct    = /cgroup/cpuacct;
    memory    = /cgroup/memory;
    devices    = /cgroup/devices;
    freezer    = /cgroup/freezer;
    net_cls    = /cgroup/net_cls;
    blkio    = /cgroup/blkio;
}

group test1 {
    perm {
          task{
              uid=postgres;
              gid=postgres;
          }
          
          admin{
             uid=root;
             gid=root; 
          }

    } memory {
       memory.limit_in_bytes=30M;
    }
}

[postgres@cent6 Desktop]$

Another file, cgrules.conf, is also important:

[postgres@cent6 Desktop]$ cat /etc/cgrules.conf
# /etc/cgrules.conf
#
#Each line describes a rule for a user in the forms:
#
#<user>            <controllers>        <destination>
#<user>:<process name>    <controllers>        <destination>
#
#Where:
# <user> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for any user or group.
#        - The %, which is equivalent to "ditto". This is useful for
#          multiline rules where different cgroups need to be specified
#          for various hierarchies for a single user.
#
# <process name> is optional and it can be:
#     - a process name
#     - a full command path of a process
#
# <controller> can be:
#      - comma separated controller names (no spaces)
#      - * (for all mounted controllers)
#
# <destination> can be:
#      - path with-in the controller hierarchy (ex. pgrp1/gid1/uid1)
#
# Note:
# - It currently has rules based on uids, gids and process name.
#
# - Don't put overlapping rules. First rule which matches the criteria
#   will be executed.
#
# - Multiline rules can be specified for specifying different cgroups
#   for multiple hierarchies. In the example below, user "peter" has
#   specified 2 line rule. First line says put peter's task in test1/
#   dir for "cpu" controller and second line says put peter's tasks in
#   test2/ dir for memory controller. Make a note of "%" sign in second line.
#   This is an indication that it is continuation of previous rule.
#
#
#<user>      <controllers>      <destination>
#
#john          cpu        usergroup/faculty/john/
#john:cp       cpu        usergroup/faculty/john/cp
#@student      cpu,memory    usergroup/student/
#peter           cpu        test1/
#%           memory        test2/
#@root            *        admingroup/
#*        *        default/
# End of file
 postgres      memory           test1/
#
[postgres@cent6 Desktop]$

As root, enable the following two services at boot:

chkconfig cgconfig on

chkconfig cgred on

After rebooting, log in as the postgres user and verify:

[postgres@cent6 Desktop]$ free -m
             total       used       free     shared    buffers     cached
Mem:          2006        381       1625          0         25        134
-/+ buffers/cache:        221       1785

[postgres@cent6 Desktop]$ wget http://centos.arcticnetwork.ca/6.4/isos/x86_64/CentOS-6.4-x86_64-LiveCD.iso

After the download finishes, check memory again. The limit worked: used rose only from 381 MB to 393 MB.

[postgres@cent6 Desktop]$ free -m
             total       used       free     shared    buffers     cached
Mem:          2006        393       1613          0         28        141
-/+ buffers/cache:        224       1782
Swap:         4031         67       3964
[postgres@cent6 Desktop]$
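
Besides watching free, it can help to confirm that the login shell really landed in the test1 group. In cgroup v1, each line of /proc/<pid>/cgroup has the form hierarchy-id:controllers:path. The sketch below extracts the memory controller's path; memory_cgroup_of is a hypothetical helper, and the sample input is illustrative rather than captured from the test machine:

```shell
# Print the cgroup path of the memory controller, reading
# /proc/<pid>/cgroup content (cgroup v1 format) from stdin.
# memory_cgroup_of is a hypothetical helper name.
memory_cgroup_of() {
    awk -F: '{
        n = split($2, c, ",")            # controllers may be comma separated
        for (i = 1; i <= n; i++)
            if (c[i] == "memory") print $3
    }'
}

# Illustrative sample in the v1 format (not real output from cent6):
printf '4:memory:/test1\n3:cpu:/\n' | memory_cgroup_of

# On a live system:
#   memory_cgroup_of < /proc/self/cgroup
```

If the cgrules.conf rule for postgres took effect, running the commented command in a postgres session would be expected to print /test1.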

Now let's see how the limit behaves for SQL executed in PostgreSQL:

Step 1: Configure /etc/cgconfig.conf and /etc/cgrules.conf as described above.

Step 2: Check memory before running:

[postgres@cent6 Desktop]$ free -m
             total       used       free     shared    buffers     cached
Mem:          2006        384       1622          0         26        138
-/+ buffers/cache:        219       1787
Swap:         4031         87       3944
[postgres@cent6 Desktop]$

Step 3: Start processing a large amount of data (about 600 MB):

postgres=# select count(*) from test01;
 count 
-------
     0
(1 row)
 
postgres=# insert into test01 values(generate_series(1,614400),repeat( chr(int4(random()*26)+65),1024));
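
The "about 600 MB" figure can be checked from the statement itself. This is a quick worked calculation that counts only the raw text payload and assumes one byte per character, ignoring per-row overhead and any compression PostgreSQL may apply:

```shell
rows=614400          # upper bound of generate_series(1,614400)
bytes_per_row=1024   # repeat(<one character>, 1024), assuming 1 byte per character
total=$(( rows * bytes_per_row ))
echo "$total bytes = $(( total / 1024 / 1024 )) MiB"
```

This prints 629145600 bytes = 600 MiB.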

Almost immediately after it started, the following error appeared:

The connection to the server was lost. Attempting reset: Failed.
!>

This matches the crash I ran into earlier.

The PostgreSQL log itself shows:

[postgres@cent6 pgsql]$ LOG:  database system was shut down at 2013-09-09 16:20:29 CST
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
LOG:  server process (PID 2697) was terminated by signal 9: Killed
DETAIL:  Failed process was running: insert into test01 values(generate_series(1,614400),repeat( chr(int4(random()*26)+65),1024));
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
FATAL:  the database system is in recovery mode
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted; last known up at 2013-09-09 17:35:42 CST
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 1/9E807C90
LOG:  unexpected pageaddr 1/946BE000 in log file 1, segment 159, offset 7069696
LOG:  redo done at 1/9F6BDB50
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

dmesg shows that an out-of-memory error occurred. This time it is a cgroup out of memory:

[postgres@cent6 Desktop]$ dmesg | grep post
[ 2673]   500  2673    64453      200   0       0             0 postgres
[ 2675]   500  2675    64494       79   0       0             0 postgres
[ 2676]   500  2676    64453       75   0       0             0 postgres
[ 2677]   500  2677    64453       77   0       0             0 postgres
[ 2678]   500  2678    64667       80   0       0             0 postgres
[ 2679]   500  2679    28359       72   0       0             0 postgres
[ 2697]   500  2697    64764      100   0       0             0 postgres
[ 2673]   500  2673    64453      200   0       0             0 postgres
[ 2675]   500  2675    64494       79   0       0             0 postgres
[ 2676]   500  2676    64453       75   0       0             0 postgres
[ 2677]   500  2677    64453       77   0       0             0 postgres
[ 2678]   500  2678    64667       80   0       0             0 postgres
[ 2679]   500  2679    28359       72   0       0             0 postgres
[ 2697]   500  2697    64764      100   0       0             0 postgres
[ 2673]   500  2673    64453      208   0       0             0 postgres
[ 2675]   500  2675    64494       79   0       0             0 postgres
[ 2676]   500  2676    64453       98   0       0             0 postgres
[ 2677]   500  2677    64453      782   0       0             0 postgres
[ 2678]   500  2678    64667      133   0       0             0 postgres
[ 2679]   500  2679    28359       86   0       0             0 postgres
[ 2697]   500  2697    73075     3036   0       0             0 postgres
Memory cgroup out of memory: Kill process 2697 (postgres) score 1000 or sacrifice child
Killed process 2697, UID 500, (postgres) total-vm:292300kB, anon-rss:8432kB, file-rss:3712kB
[postgres@cent6 Desktop]$
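
The grep dropped the header line of the task dump, so the column meanings here are an assumption: the fourth and fifth columns look like total_vm and rss counted in pages. The numbers bear this out if the page size is 4 kB, as the following worked check suggests (pages_to_kb is a hypothetical helper):

```shell
page_kb=4   # assumed page size in kB (typical on x86_64)

# Convert a page count from the task dump into kB.
pages_to_kb() { echo $(( $1 * page_kb )); }

pages_to_kb 73075          # 4th column for PID 2697 in the last dump
pages_to_kb 3036           # 5th column for PID 2697 in the last dump
echo $(( 8432 + 3712 ))    # anon-rss + file-rss from the "Killed process" line
```

The three results are 292300, 12144 and 12144, which agree with total-vm:292300kB and with the sum of anon-rss and file-rss on the kill line. Note that about 12 MB of resident memory for the killed backend is well under the 30M limit: the limit applies to the whole group, including the other postgres processes and page cache charged to it, not to a single process.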

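I suspect the limit I set was too small and interfered with basic operation. PostgreSQL itself needs some resources (shared_buffers and wal_buffers both take memory).

So I changed the parameter to memory.limit_in_bytes=300M and ran it again:
The SQL statement described above, processing 1200 MB of data, completed successfully, and memory did not grow excessively.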

Original article: https://www.cnblogs.com/gaojian/p/3305551.html