• A similar process death: Process com.midea.mmp2 died.


    I found a good article online about this kind of error. It reads as follows:
    08:56:03,273 INFO – Executing Do func=[GetSeqNo] keyNam=[keynam];KeyVal=[PRYPAYBILSYSTRACKNO20130125];SeqNam=[keyval];tblName=[pryseqrec];len=[6];circleString=[1];colName=[null]
    08:56:03,296 ERROR – Failed to get database connection! : Cannot create PoolableConnectionFactory (ORA-01034: ORACLE not available
    ORA-27123: unable to attach to shared memory segment
    Linux Error: 22: Invalid argument
    Additional information: 7
    Additional information: 2162692
    )

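       OK, the error that stands out is ORA-27123: unable to attach to shared memory segment, so my first guess was that the problem was memory-related.

       The machine is a PC server running Oracle 11.2.0.1 on CentOS 5.5.

       First I checked the alert log and found that many processes had died recently, as shown below: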

    ...
    Fri Jan 25 00:00:46 2013
    Process m001 died, see its trace file
    Fri Jan 25 00:07:36 2013
    Process W000 died, see its trace file
    Fri Jan 25 01:00:48 2013
    Process m000 died, see its trace file
    Fri Jan 25 01:07:45 2013
    Process W000 died, see its trace file
    Fri Jan 25 01:55:40 2013
    Process m000 died, see its trace file
    Fri Jan 25 02:07:50 2013
    Process W000 died, see its trace file
    Fri Jan 25 02:35:28 2013
    Process m000 died, see its trace file
    Fri Jan 25 03:05:45 2013
    Process m000 died, see its trace file
    Fri Jan 25 03:37:56 2013
    Process W000 died, see its trace file
    Fri Jan 25 03:58:01 2013
    Process W000 died, see its trace file
    Fri Jan 25 04:05:34 2013
    Process m000 died, see its trace file
    Fri Jan 25 04:25:51 2013
    Process m000 died, see its trace file
    Fri Jan 25 05:00:07 2013
    Process m001 died, see its trace file
    Fri Jan 25 05:09:22 2013
    Process m000 died, see its trace file
    Fri Jan 25 05:45:57 2013
    Process m000 died, see its trace file
    Fri Jan 25 06:01:04 2013
    Thread 1 cannot allocate new log, sequence 1781
    Private strand flush not complete
      Current log# 1 seq# 1780 mem# 0: /opt/11g/oracle/oradata/orcl/redo01.log
    Thread 1 advanced to log sequence 1781 (LGWR switch)
      Current log# 2 seq# 1781 mem# 0: /opt/11g/oracle/oradata/orcl/redo02.log
    Fri Jan 25 06:20:45 2013
    Process m000 died, see its trace file
    Fri Jan 25 07:00:15 2013
    Process m001 died, see its trace file
    ...
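
To see at a glance which background processes die most often, the alert log can be tallied with a small script. The following is only a sketch: the helper name count_process_deaths and the short sample text are made up for illustration, and the pattern assumes the message format shown in the excerpt above.

```python
import re
from collections import Counter

# Hypothetical helper: tallies "Process <name> died" lines in alert-log text.
# The pattern assumes the exact message wording seen in the excerpt above.
DIED_RE = re.compile(r"^Process (\S+) died, see its trace file$")


def count_process_deaths(alert_log_text: str) -> Counter:
    """Return a Counter mapping process name -> number of 'died' messages."""
    counts = Counter()
    for line in alert_log_text.splitlines():
        match = DIED_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the alert log excerpt, used as sample input.
sample = """\
Fri Jan 25 00:00:46 2013
Process m001 died, see its trace file
Fri Jan 25 00:07:36 2013
Process W000 died, see its trace file
Fri Jan 25 01:00:48 2013
Process m000 died, see its trace file
Fri Jan 25 01:07:45 2013
Process W000 died, see its trace file
"""

deaths = count_process_deaths(sample)
for name, count in sorted(deaths.items()):
    print(f"{name}: {count}")
```

In practice the text would be read from the instance's alert log file instead of the inline sample.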

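        There is a lot of discussion online about messages such as Process m001 died, see its trace file and Process W000 died, see its trace file, and the usual explanation is that the number of processes has reached its limit.
        So I checked the database parameter settings and the current usage: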

    SQL> col RESOURCE_NAME for a20
    SQL> col LIMIT_VALUE for a20
    SQL> select resource_name,MAX_UTILIZATION,LIMIT_VALUE from v$resource_limit where resource_name in ('processes','sessions');

    RESOURCE_NAME        MAX_UTILIZATION LIMIT_VALUE
    -------------------- --------------- --------------------
    processes                        281        500
    sessions                         282        792

        --As you can see, peak usage is still far below the limits, so this is not the cause.
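
To put a number on "far below the limits", the peak usage can be expressed as a percentage of each limit. This is a minimal sketch using the two rows from the query output; the function name utilization_pct is made up for illustration.

```python
def utilization_pct(max_utilization: int, limit_value: int) -> float:
    """Peak usage as a percentage of the configured limit."""
    return 100.0 * max_utilization / limit_value


# (MAX_UTILIZATION, LIMIT_VALUE) pairs taken from the v$resource_limit output.
rows = {"processes": (281, 500), "sessions": (282, 792)}

for name, (peak, limit) in rows.items():
    print(f"{name}: peak {peak} of {limit} ({utilization_pct(peak, limit):.1f}%)")
```

Both resources peak at well under their limits (about 56% and 36%), which is consistent with ruling out the process limit.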

        --I also looked at top, and the server was mostly idle. Memory usage as reported by free showed no problem either.

    [root@orcl ~]# free -m
                 total       used       free     shared    buffers     cached
    Mem:         12172      10245       1926          0        363       8288
    -/+ buffers/cache:       1593      10579
    Swap:         5535         68       5467
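
The "-/+ buffers/cache" row is what matters here: most of the "used" memory is page cache that the kernel can give back. A short sketch of that arithmetic, using the numbers from the Mem row (the variable names are chosen for illustration):

```python
# Values in MB, copied from the "Mem:" row of the free -m output.
mem = {"total": 12172, "used": 10245, "free": 1926, "buffers": 363, "cached": 8288}

# Memory that is free or can be reclaimed from buffers and page cache.
effective_free = mem["free"] + mem["buffers"] + mem["cached"]

# Memory actually held by applications.
effective_used = mem["used"] - mem["buffers"] - mem["cached"]

print(f"effective free: {effective_free} MB")
print(f"effective used: {effective_used} MB")
```

The results (10577 MB and 1594 MB) differ by a megabyte or two from the 10579 and 1593 that free prints, presumably because each column is rounded to whole megabytes separately. Either way, roughly 10 GB was available.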

        --I manually switched the log file and forced a checkpoint, and Process died messages appeared in the alert log again.

    SQL> alter system switch logfile;

    System altered.

    SQL> alter system checkpoint;

    System altered.

    Process m001 died, see its trace file
    Fri Jan 25 10:09:34 2013
    Process m000 died, see its trace file
    Fri Jan 25 10:17:39 2013

        --I checked /dev/shm. 6G feels a little small on a server with 12G of physical memory, but given the current usage it should not cause much trouble.

    [root@orcl ~]# df -Th /dev/shm
    Filesystem    Type    Size  Used Avail Use% Mounted on
    tmpfs        tmpfs    6.0G  8.1M  6.0G   1% /dev/shm

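        --Next I checked the shmmax setting. This parameter defines the maximum size of a single shared memory segment, in bytes. If it is set incorrectly, ORA-27123 can occur.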

    [root@orcl ~]# more /etc/sysctl.conf | grep shmmax
    kernel.shmmax = 1073741824

    [root@orcl ~]# more /proc/sys/kernel/shmmax
    1073741824

        --The parameter was set to 1G. On a system that uses AMM (which is what I assumed before I had checked everything), I felt this value should be raised.
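
As a quick sanity check on the numbers, the shmmax value can be converted to GiB and compared against a shared memory area of a given size. The sketch below is plain arithmetic only: min_segments is a hypothetical helper that gives a lower bound on how many segments would be needed if no single segment may exceed shmmax, and it makes no claim about how Oracle actually lays out its segments.

```python
GIB = 1024 ** 3


def min_segments(area_bytes: int, shmmax_bytes: int) -> int:
    """Lower bound on the number of segments when each is capped at shmmax."""
    # Integer ceiling division, avoiding floating point.
    return -(-area_bytes // shmmax_bytes)


shmmax = 1073741824  # value read from /proc/sys/kernel/shmmax above

print(f"shmmax = {shmmax / GIB:.1f} GiB")
print("segments for a 1 GiB area:", min_segments(1 * GIB, shmmax))
print("segments for a 2 GiB area:", min_segments(2 * GIB, shmmax))
```

With shmmax at exactly 1 GiB, anything larger than 1 GiB cannot fit in a single segment.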

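    I did not change it right away and kept investigating.
        --Having gone through the OS side, I went back to the database and looked at the memory settings. What I found surprised me: the system I had assumed was using AMM was not using it at all: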

    SQL> show parameter memory

    NAME                                 TYPE                   VALUE
    ------------------------------------ ---------------------- ------------------------------
    hi_shared_memory_address             integer                0
    memory_max_target                    big integer            0
    memory_target                        big integer            0
    shared_memory_address                integer                0

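        --OK. Oracle 11g introduced AMM, and Oracle recommends using it. That does not mean the 10g-style ASMM is unusable, and some systems do run that way for specific reasons. Still, nearly all of the 11g databases I manage use AMM, and it works well for them.

        --Checking the SGA and PGA settings: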

    SQL> show parameter sga

    NAME                                 TYPE                   VALUE
    ------------------------------------ ---------------------- ------------------------------
    lock_sga                             boolean                FALSE
    pre_page_sga                         boolean                FALSE
    sga_max_size                         big integer            1G
    sga_target                           big integer            1G

    SQL> show parameter pga

    NAME                                 TYPE                   VALUE
    ------------------------------------ ---------------------- ------------------------------
    pga_aggregate_target                 big integer            3844M

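        --At this point the cause had largely surfaced: the SGA was too small to meet demand.
        --The AWR report for the period when the problem occurred also showed a very low library cache hit ratio, heavy hard parsing, and a low proportion of soft parses.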
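
To show how small the SGA was relative to the machine, here is a rough memory budget built from the values above (sga_target of 1G, pga_aggregate_target of 3844M, and 12172 MB of physical memory from free -m). It is a sketch only, and the variable names are chosen for illustration.

```python
# All values in MB, taken from the parameter listings and free -m output above.
physical_mb = 12172
sga_target_mb = 1024          # sga_target = 1G
pga_aggregate_target_mb = 3844

oracle_total_mb = sga_target_mb + pga_aggregate_target_mb
sga_share = 100.0 * sga_target_mb / physical_mb
oracle_share = 100.0 * oracle_total_mb / physical_mb

print(f"SGA:       {sga_target_mb} MB ({sga_share:.1f}% of physical memory)")
print(f"SGA + PGA: {oracle_total_mb} MB ({oracle_share:.1f}% of physical memory)")
```

The SGA was given only about 8% of physical memory, while the PGA target alone was nearly four times larger.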

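        With the cause identified, the fix was straightforward.

        I first gave my development colleagues a rough explanation of the cause, then requested a downtime window from the development lead. After notifying the other developers, I changed the parameters and restarted the database, which resolved the problem.

        --The steps were as follows:

        1. Raise the shmmax parameter to 2G: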

    [root@orcl ~]# vi /etc/sysctl.conf
    kernel.shmmax = 2147483648

    [root@orcl ~]# sysctl -p

    [root@orcl ~]# more /etc/sysctl.conf | grep shmmax
    kernel.shmmax = 2147483648
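
Note that grepping /etc/sysctl.conf only shows what is in the file. The value the kernel is actually using is the one in /proc/sys/kernel/shmmax, as checked earlier. A small sketch of comparing the two follows; the helper names and the sample strings are made up for illustration, and on a real system the two inputs would be read from the file and from /proc.

```python
def parse_sysctl_conf(text: str) -> dict:
    """Parse 'key = value' lines, skipping blanks and # or ; comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def shmmax_in_sync(conf_text: str, proc_value: str) -> bool:
    """True if the configured kernel.shmmax matches the running value."""
    configured = parse_sysctl_conf(conf_text).get("kernel.shmmax")
    return configured is not None and int(configured) == int(proc_value)


# Sample inputs standing in for /etc/sysctl.conf and /proc/sys/kernel/shmmax.
conf = "# sample comment line\nkernel.shmmax = 2147483648\n"

print(shmmax_in_sync(conf, "2147483648\n"))  # value loaded by sysctl -p
print(shmmax_in_sync(conf, "1073741824\n"))  # file edited but not yet loaded
```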

        2. Enable AMM and set memory_target to 6G:

    SQL> create pfile='/home/oracle/pfile_20130125.bk' from spfile;

    File created.

    SQL> alter system set memory_target=6G scope=spfile;

    System altered.

    SQL> shutdown immediate
    Database closed.
    Database dismounted.
    ORACLE instance shut down.

    SQL> startup
    ORACLE instance started.

    Total System Global Area                         2042241024 bytes
    Fixed Size                                          1337548 bytes
    Variable Size                                    1392510772 bytes
    Database Buffers                                  637534208 bytes
    Redo Buffers                                       10858496 bytes
    Database mounted.
    Database opened.

    -- Verify
    SQL> show parameter memory

    NAME_COL_PLUS_SHOW_PARAM       TYPE                   VALUE_COL_PLUS_SHOW_PARAM
    ------------------------------ ---------------------- ------------------------------
    hi_shared_memory_address       integer                0
    memory_max_target              big integer            6G
    memory_target                  big integer            6G
    shared_memory_address          integer                0

    SQL> select * from v$sgainfo;

    NAME                                                                  BYTES RESIZE
    ---------------------------------------------------------------- ---------- ------
    Fixed SGA Size                                                      1337548 No
    Redo Buffers                                                       10858496 No
    Buffer Cache Size                                                 520093696 Yes
    Shared Pool Size                                                  486539264 Yes
    Large Pool Size                                                    16777216 Yes
    Java Pool Size                                                     16777216 Yes
    Streams Pool Size                                                  16777216 Yes
    Shared IO Pool Size                                                       0 Yes
    Granule Size                                                       16777216 No
    Maximum SGA Size                                                 2042241024 No
    Startup overhead in Shared Pool                                   167772160 No
    Free SGA Memory Available                                         973078528

    12 rows selected.

    SQL> show parameter sga

    NAME_COL_PLUS_SHOW_PARAM       TYPE                   VALUE_COL_PLUS_SHOW_PARAM
    ------------------------------ ---------------------- ------------------------------
    lock_sga                       boolean                FALSE
    pre_page_sga                   boolean                FALSE
    sga_max_size                   big integer            1952M
    sga_target                     big integer            1G
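
As a quick cross-check of the startup banner, the four components it lists should add up to the reported Total System Global Area, which is also the Maximum SGA Size shown by v$sgainfo. A minimal sketch with the byte values copied from the output above:

```python
# Byte values copied from the startup banner above.
startup = {
    "Fixed Size": 1337548,
    "Variable Size": 1392510772,
    "Database Buffers": 637534208,
    "Redo Buffers": 10858496,
}
total_sga = 2042241024  # "Total System Global Area"

component_sum = sum(startup.values())
total_sga_mib = total_sga / 2**20

print("components add up to total:", component_sum == total_sga)
print(f"total SGA: {total_sga_mib:.1f} MiB")
```

The components match the total exactly, and the SGA now has close to 2 GB to work with instead of the previous 1 GB.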

  • Original article: https://www.cnblogs.com/lxjshuju/p/7199725.html