• A case study of ORA-00020: No more process state objects available


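    First thing this morning the company was notified that yesterday's database data had not been generated. The batch job log showed that a procedure that normally finishes in about half an hour had run for a full 7 hours (clearly something was wrong), so downstream jobs failed to read the data at their scheduled time. To find the cause, I checked the Oracle alert log and found that ORA-00020: No more process state objects available had been raised starting at around 1:30 a.m. The instance could not start process P062 because no process state object was available, and the job hung there until some connections were released and it could continue. The relevant alert log entries are as follows: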

      Current log# 5 seq# 9361 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo05.log
    Sat Mar 26 01:31:31 2016
    Thread 1 advanced to log sequence 9362 (LGWR switch)
      Current log# 6 seq# 9362 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo06.log
    Thread 1 advanced to log sequence 9363 (LGWR switch)
      Current log# 9 seq# 9363 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo09.log
    Thread 1 advanced to log sequence 9364 (LGWR switch)
      Current log# 10 seq# 9364 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo10.log
    Thread 1 advanced to log sequence 9365 (LGWR switch)
      Current log# 3 seq# 9365 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo03.log
    Sat Mar 26 01:37:13 2016
    ORA-00020: No more process state objects available
    ORA-20 errors will not be written to the alert log for
     the next minute. Please look at trace files to see all
     the ORA-20 errors.
    Process P062 submission failed with error = 20
    Sat Mar 26 01:40:46 2016
    ORA-00020: No more process state objects available
    ORA-20 errors will not be written to the alert log for
     the next minute. Please look at trace files to see all
     the ORA-20 errors.
    Process P062 submission failed with error = 20
    Sat Mar 26 01:46:17 2016
    ORA-00020: No more process state objects available
    ORA-20 errors will not be written to the alert log for
     the next minute. Please look at trace files to see all
     the ORA-20 errors.
    Process m000 submission failed with error = 20
    Sat Mar 26 01:48:45 2016
    Thread 1 advanced to log sequence 9366 (LGWR switch)
      Current log# 11 seq# 9366 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo11.log
    Sat Mar 26 01:55:33 2016
    Thread 1 advanced to log sequence 9367 (LGWR switch)
      Current log# 12 seq# 9367 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo12.log
    Thread 1 advanced to log sequence 9368 (LGWR switch)
      Current log# 4 seq# 9368 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo04.log
    Sat Mar 26 01:56:39 2016
    Thread 1 advanced to log sequence 9369 (LGWR switch)
      Current log# 1 seq# 9369 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo01.log
    Sat Mar 26 02:00:15 2016
    Thread 1 advanced to log sequence 9370 (LGWR switch)
      Current log# 2 seq# 9370 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo02.log
    Sat Mar 26 02:04:33 2016
    Thread 1 advanced to log sequence 9371 (LGWR switch)
      Current log# 7 seq# 9371 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo07.log
    Thread 1 advanced to log sequence 9372 (LGWR switch)
      Current log# 8 seq# 9372 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo08.log
    Thread 1 advanced to log sequence 9373 (LGWR switch)
      Current log# 5 seq# 9373 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo05.log
    Sat Mar 26 02:04:43 2016
    Thread 1 advanced to log sequence 9374 (LGWR switch)
      Current log# 6 seq# 9374 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo06.log
    Thread 1 advanced to log sequence 9375 (LGWR switch)
      Current log# 9 seq# 9375 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo09.log
    Sat Mar 26 02:26:22 2016
    Thread 1 advanced to log sequence 9376 (LGWR switch)
      Current log# 10 seq# 9376 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo10.log
    Sat Mar 26 04:00:04 2016
    DM00 started with pid=39, OS id=33332, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:00:05 2016
    DW00 started with pid=37, OS id=33334, wid=1, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:00:14 2016
    Thread 1 advanced to log sequence 9377 (LGWR switch)
      Current log# 3 seq# 9377 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo03.log
    Sat Mar 26 04:00:23 2016
    DW01 started with pid=41, OS id=33342, wid=2, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:00:23 2016
    DW02 started with pid=42, OS id=33344, wid=3, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:00:23 2016
    DW03 started with pid=43, OS id=33346, wid=4, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:00:23 2016
    DW04 started with pid=44, OS id=33348, wid=5, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:00:23 2016
    DW05 started with pid=45, OS id=33350, wid=6, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:00:23 2016
    DW06 started with pid=46, OS id=33352, wid=7, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:00:23 2016
    DW07 started with pid=47, OS id=33354, wid=8, job ETLUSER.NBIFULLDUMP
    Sat Mar 26 04:03:57 2016
    Thread 1 advanced to log sequence 9378 (LGWR switch)
      Current log# 11 seq# 9378 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo11.log
    Sat Mar 26 04:04:18 2016
    Thread 1 advanced to log sequence 9379 (LGWR switch)
      Current log# 12 seq# 9379 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo12.log
    Sat Mar 26 04:04:42 2016
    Thread 1 advanced to log sequence 9380 (LGWR switch)
      Current log# 4 seq# 9380 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo04.log
    Sat Mar 26 04:05:06 2016
    Thread 1 advanced to log sequence 9381 (LGWR switch)
      Current log# 1 seq# 9381 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo01.log
    Sat Mar 26 04:05:51 2016
    Thread 1 advanced to log sequence 9382 (LGWR switch)
      Current log# 2 seq# 9382 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo02.log
    Sat Mar 26 04:51:01 2016
    Thread 1 cannot allocate new log, sequence 9383
    Private strand flush not complete
      Current log# 2 seq# 9382 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo02.log
    Thread 1 advanced to log sequence 9383 (LGWR switch)
      Current log# 7 seq# 9383 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo07.log
    Sat Mar 26 04:52:37 2016
    Thread 1 cannot allocate new log, sequence 9384
    Private strand flush not complete
      Current log# 7 seq# 9383 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo07.log
    Thread 1 advanced to log sequence 9384 (LGWR switch)
      Current log# 8 seq# 9384 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo08.log
    Sat Mar 26 08:28:51 2016
    Thread 1 advanced to log sequence 9385 (LGWR switch)
      Current log# 5 seq# 9385 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo05.log
    Sat Mar 26 12:00:28 2016
    Thread 1 advanced to log sequence 9386 (LGWR switch)
      Current log# 6 seq# 9386 mem# 0: /u01/app/oracle/oradata/ORCL/ORCL/onlinelog/redo06.log
    

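    This error usually appears when an Oracle instance fails to start an auxiliary background process (for example an m00x slave of MMON, or a W00x slave). There are several possible causes, including insufficient Oracle instance resources and insufficient operating system resources. The most common one is that the number of processes in the instance has reached its upper limit. Querying the v$resource_limit view shows whether the process count has hit its ceiling at any point during the lifetime of the instance.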
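    The original post does not show the statement it ran, so the following is only a sketch of a query that returns the columns discussed here. It restricts the output to the processes and sessions rows; the filter is an assumption about what the author looked at.

```sql
-- Compare the high-water mark (MAX_UTILIZATION) with the configured
-- ceiling (LIMIT_VALUE) for processes and sessions.
SELECT resource_name,
       current_utilization,
       max_utilization,
       initial_allocation,
       limit_value
  FROM v$resource_limit
 WHERE resource_name IN ('processes', 'sessions');
```

    If MAX_UTILIZATION equals LIMIT_VALUE for processes, the instance has run out of process slots at least once since it was started.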

    The output showed that MAX_UTILIZATION for processes had at some point reached the LIMIT_VALUE of 100, and that the value for sessions was 126.

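    Judging from this v$resource_limit output, it is very likely that the total number of processes reached the limit and that this caused the creation of new auxiliary background processes to fail. This is easy to verify: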

    [oracle@db trace]$ sqlplus / as sysdba
    
    SQL*Plus: Release 11.2.0.1.0 Production on Sat Mar 26 14:31:33 2016
    
    Copyright (c) 1982, 2009, Oracle.  All rights reserved.
    
    
    Connected to:
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
    With the Partitioning, Oracle Label Security, OLAP, Data Mining,
    Oracle Database Vault and Real Application Testing options
    
    SQL> show parameter processes
    
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    aq_tm_processes                      integer     0
    db_writer_processes                  integer     8
    gcs_server_processes                 integer     0
    global_txn_processes                 integer     1
    job_queue_processes                  integer     0
    log_archive_max_processes            integer     4
    processes                            integer     100
    SQL> 

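    The processes parameter is set to 100, which matches the LIMIT_VALUE that was reached. The problem occurred because the total number of processes in the running database hit the ceiling set by this parameter.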
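    Before changing any setting, it can help to see what is actually holding the process slots. The queries below are a minimal sketch that is not part of the original investigation: the first gives an approximate count of processes attached to the instance, the second groups sessions by where they come from. The grouping columns are just one reasonable choice.

```sql
-- Approximate number of process slots currently in use.
SELECT COUNT(*) AS process_count
  FROM v$process;

-- Sessions grouped by origin, largest consumers first.
SELECT username,
       machine,
       program,
       status,
       COUNT(*) AS session_count
  FROM v$session
 GROUP BY username, machine, program, status
 ORDER BY session_count DESC;
```

    Process names such as P062 in the alert log belong to parallel execution servers, so a batch job running with a high degree of parallelism is one candidate worth checking.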

    The fix is to raise the processes initialization parameter to a reasonable value.
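    A minimal sketch of that change follows, assuming the instance uses an spfile. The value 300 is a hypothetical placeholder, not a figure from the original post; size it from the observed peak plus headroom and from the memory available on the host. Because processes is a static parameter, the new value only takes effect after a restart.

```sql
-- Hypothetical target value; choose one that fits the workload.
ALTER SYSTEM SET processes = 300 SCOPE = SPFILE;

-- SQL*Plus commands: restart the instance during a maintenance window
-- so that the static parameter is picked up.
SHUTDOWN IMMEDIATE
STARTUP

-- Confirm the new setting.
SHOW PARAMETER processes
```

    When sessions and transactions have not been set explicitly, Oracle derives them from processes, so they scale up with it after the restart.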

  • Original article: https://www.cnblogs.com/willsun8023/p/5322890.html