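In Oracle, each component (the listener, the database instance, the various configuration tools) generates log files and trace files during installation and at run time. Before Oracle 11g, this information was scattered across the individual component directories. In 11g, Oracle introduced the ADR (Automatic Diagnostic Repository), which brings all of it under unified management.
1. A first look at ADRCI
11g provides the ADR_HOME directory, which stores all types of log and trace information in one place.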
[oracle@bspdev app]$ ls -l
total 24
drwxrwxr-x. 3 oracle oinstall 4096 Mar 25 2011 11.2.0
drwxr-x---. 3 oracle oinstall 4096 Apr 1 2011 admin
drwxr-x---. 5 oracle oinstall 4096 Apr 1 2011 cfgtoollogs
drwxr-xr-x. 2 oracle oinstall 4096 Mar 31 2011 checkpoints
drwxrwxr-x. 6 oracle asmadmin 4096 Apr 1 2011 diag
drwxrwxr-x. 76 oracle oinstall 4096 Feb 23 09:34 oracle
[oracle@bspdev app]$ cd diag
[oracle@bspdev diag]$ ls
asm asmtool rdbms tnslsnr
[oracle@bspdev diag]$ cd rdbms
[oracle@bspdev rdbms]$ ls
ora11g
[oracle@bspdev rdbms]$ cd ora11g
[oracle@bspdev ora11g]$ ls
i_1.mif ora11g
[oracle@bspdev ora11g]$ cd ora11g
[oracle@bspdev ora11g]$ ls
alert cdump hm incident incpkg ir lck metadata stage sweep trace
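Under $ORACLE_BASE there is a diag folder that holds the log information of key components such as asm, asmtool, rdbms and the TNS listener. Within each component, the data is further organized by category: alert files, trace files, dumps and so on.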
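The layout in the listing above can be sketched as a small path helper. This is only an illustration: the function name `adr_home` and the constant `ADR_SUBDIRS` are hypothetical, and the base directory `/u01/app` is taken from the `show control` output later in this article.

```python
from pathlib import PurePosixPath


def adr_home(adr_base, product_type, product_id, instance_id):
    """Compose <adr_base>/diag/<product_type>/<product_id>/<instance_id>."""
    return PurePosixPath(adr_base) / "diag" / product_type / product_id / instance_id


# A few of the subdirectories shown in the listing above.
ADR_SUBDIRS = ("alert", "cdump", "incident", "trace")

home = adr_home("/u01/app", "rdbms", "ora11g", "ora11g")
print(home)
for sub in ADR_SUBDIRS:
    print(home / sub)
```

The same four-part pattern applies to the other components, for example `tnslsnr` for listeners.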
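The diag directory thus effectively forms a repository of log output: all diagnostic and log information is stored there by category. In addition, Oracle introduced the ADRCI tool, which offers a single command interface for inspecting logs and managing diagnostic data.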
ora11g:/home/ora11g>adrci
ADRCI: Release 11.2.0.1.0 - Production on Mon May 21 13:56:53 2012
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
ADR base = "/nbsdu01/app/oracle"
adrci>
With the adrci command line, diagnostic information can be managed from one unified prompt.
adrci> help
HELP [topic]
Available Topics:
CREATE REPORT
ECHO
EXIT
HELP
HOST
IPS
PURGE
RUN
SET BASE
SET BROWSER
SET CONTROL
SET ECHO
SET EDITOR
SET HOMES | HOME | HOMEPATH
SET TERMOUT
SHOW ALERT
SHOW BASE
SHOW CONTROL
SHOW HM_RUN
SHOW HOMES | HOME | HOMEPATH
SHOW INCDIR
SHOW INCIDENT
SHOW PROBLEM
SHOW REPORT
SHOW TRACEFILE
SPOOL
There are other commands intended to be used directly by Oracle, type
"HELP EXTENDED" to see the list
2. Viewing log information
When using adrci, pay attention to the current homepath. Every Oracle component has its own diagnostic directory (an ADR home).
ADR base = "/nbstu01/app/oracle"
adrci> show homepath
ADR Homes:
diag/rdbms/nbstest/NBSTEST
diag/tnslsnr/P550_05_LC/listener
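Note that to view the logs and diagnostic data of one particular component, you first have to set the homepath to that component's directory. The two ADR homes above belong to the Oracle database and the listener. To view the database log, configure it as follows.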
adrci> set homepath diag/rdbms/nbstest/NBSTEST
adrci> show alert -TAIL 10
2012-05-21 15:37:59.861000 +08:00
Thread 1 cannot allocate new log, sequence 2319
Private strand flush not complete
Current log# 2 seq# 2318 mem# 0: /nbstdata01/oradata/NBSTEST/redo02a.log
Current log# 2 seq# 2318 mem# 1: /nbstdata02/oradata/NBSTEST/redo02b.log
2012-05-21 15:38:02.931000 +08:00
Thread 1 advanced to log sequence 2319 (LGWR switch)
Current log# 1 seq# 2319 mem# 0: /nbstdata01/oradata/NBSTEST/redo01a.log
Current log# 1 seq# 2319 mem# 1: /nbstdata02/oradata/NBSTEST/redo01b.log
2012-05-21 15:49:34.382000 +08:00
Thread 1 cannot allocate new log, sequence 2320
Private strand flush not complete
Current log# 1 seq# 2319 mem# 0: /nbstdata01/oradata/NBSTEST/redo01a.log
Current log# 1 seq# 2319 mem# 1: /nbstdata02/oradata/NBSTEST/redo01b.log
2012-05-21 15:49:37.420000 +08:00
Thread 1 advanced to log sequence 2320 (LGWR switch)
Current log# 3 seq# 2320 mem# 0: /nbstdata01/oradata/NBSTEST/redo03a.log
Current log# 3 seq# 2320 mem# 1: /nbstdata02/oradata/NBSTEST/redo03b.log
2012-05-21 16:03:48.579000 +08:00
Thread 1 cannot allocate new log, sequence 2321
Private strand flush not complete
Current log# 3 seq# 2320 mem# 0: /nbstdata01/oradata/NBSTEST/redo03a.log
Current log# 3 seq# 2320 mem# 1: /nbstdata02/oradata/NBSTEST/redo03b.log
2012-05-21 16:03:51.656000 +08:00
Thread 1 advanced to log sequence 2321 (LGWR switch)
Current log# 2 seq# 2321 mem# 0: /nbstdata01/oradata/NBSTEST/redo02a.log
Current log# 2 seq# 2321 mem# 1: /nbstdata02/oradata/NBSTEST/redo02b.log
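The show alert command above displays the alert log of the database component. Note that the -tail n option resembles the Unix tail -n command, but here the number is the count of log entries, not the count of lines!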
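The same two steps (set the homepath, then show the alert log) can also be run without entering the interactive prompt, because adrci accepts a semicolon-separated command list through its `exec=` argument. The sketch below only builds the argument vector; the helper name `adrci_exec_argv` is hypothetical, and actually running the result assumes adrci is on the PATH of the Oracle software owner.

```python
import shlex


def adrci_exec_argv(homepath, *commands):
    """Build an argv list that runs adrci commands against one ADR home."""
    script = "; ".join(["set homepath " + homepath, *commands])
    return ["adrci", "exec=" + script]


argv = adrci_exec_argv("diag/rdbms/nbstest/NBSTEST", "show alert -tail 10")
# Print a shell-quoted form that could be pasted into a terminal.
print(shlex.join(argv))
```

The list can be handed to `subprocess.run(argv)` on a host where adrci is installed.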
3. Viewing incident and problem information
Very often, the error events raised by database components are recorded as diagnostic data. ADRCI makes this information easy to inspect.
adrci> show incident
ADR Home = /nbstu01/app/oracle/diag/rdbms/nbstest/NBSTEST:
*************************************************************************
INCIDENT_ID PROBLEM_KEY CREATE_TIME
-------------------- ----------------------------------------------------------- ----------------------------------------
41385 ORA 445 2011-07-03 00:47:52.612000 +08:00
73745 ORA 3137 [12333] 2011-09-08 09:23:38.004000 +08:00
74145 ORA 3137 [12333] 2011-09-08 10:45:20.543000 +08:00
74225 ORA 3137 [12333] 2011-09-08 10:52:21.273000 +08:00
74217 ORA 3137 [12333] 2011-09-08 10:58:45.016000 +08:00
73753 ORA 3137 [12333] 2011-09-08 11:09:32.727000 +08:00
74073 ORA 3137 [12333] 2011-09-08 12:52:20.201000 +08:00
74089 ORA 3137 [12333] 2011-09-08 12:55:07.228000 +08:00
74074 ORA 3137 [12333] 2011-09-08 12:59:46.138000 +08:00
74075 ORA 3137 [12333] 2011-09-08 13:00:38.048000 +08:00
74457 ORA 3137 [12333] 2011-09-08 13:02:44.184000 +08:00
73841 ORA 3137 [12333] 2011-09-08 14:39:46.547000 +08:00
153367 ORA 445 2012-03-02 19:01:31.854000 +08:00
153368 ORA 445 2012-03-02 23:14:56.008000 +08:00
14 rows fetched
adrci>
adrci> show problem
ADR Home = /nbstu01/app/oracle/diag/rdbms/nbstest/NBSTEST:
*************************************************************************
PROBLEM_ID PROBLEM_KEY LAST_INCIDENT LASTINC_TIME
-------------------- ----------------------------------------------------------- -------------------- ----------------------------------------
1 ORA 445 153368 2012-03-02 23:14:56.008000 +08:00
2 ORA 3137 [12333] 73841 2011-09-08 14:39:46.547000 +08:00
2 rows fetched
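The two listings are related: each problem groups the incidents that share a problem key, and LAST_INCIDENT is the most recent of them. A minimal sketch of that grouping follows, using a subset of the rows above as sample input. The names `ROW`, `group_incidents` and `last_incident` are hypothetical, and the regular expression assumes the three-column layout shown here.

```python
import re
from collections import defaultdict
from datetime import datetime

# incident id, problem key (may contain spaces), creation timestamp
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<key>.+?)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6} [+-]\d{2}:\d{2})\s*$"
)

SAMPLE = """\
41385 ORA 445 2011-07-03 00:47:52.612000 +08:00
73745 ORA 3137 [12333] 2011-09-08 09:23:38.004000 +08:00
74073 ORA 3137 [12333] 2011-09-08 12:52:20.201000 +08:00
73841 ORA 3137 [12333] 2011-09-08 14:39:46.547000 +08:00
153367 ORA 445 2012-03-02 19:01:31.854000 +08:00
153368 ORA 445 2012-03-02 23:14:56.008000 +08:00
"""


def group_incidents(text):
    """Map each problem key to a list of (create_time, incident_id)."""
    groups = defaultdict(list)
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # headers, separators, "N rows fetched"
        created = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f %z")
        groups[m["key"]].append((created, int(m["id"])))
    return dict(groups)


def last_incident(groups):
    """Return the id of the most recently created incident per problem key."""
    return {key: max(rows)[1] for key, rows in groups.items()}


groups = group_incidents(SAMPLE)
for key, incident_id in last_incident(groups).items():
    print(key, len(groups[key]), incident_id)
```

For the sample rows, the latest incident per key agrees with the LAST_INCIDENT column of show problem.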
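4. Generating a diagnostic package
Some incidents cannot be diagnosed in-house and have to be investigated together with Oracle Support. In that case, ADRCI can bundle the incidents into a package to send to Oracle Support staff.
Packaging consists of two major steps: building a logical package, then generating a physical package from it. In detail:
-- Create a logical package for incident 74073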
adrci> ips create package incident 74073
Created package 1 based on incident id 74073, correlation level typical
-- Also add incident 153368 to package 1
adrci> ips add incident 153368 package 1
Added incident 153368 to package 1
Finally, generate the physical package from the logical package.
adrci> host
$ pwd
/home/oracle
$ exit
adrci> ips generate package 1 in /home/oracle
Generated package 1 in file /home/oracle/ORA313712_20120521160458_COM_1.zip, mode complete
oracle:/home/oracle>ls -l | grep ORA
-rw-r--r-- 1 oracle dba 3072148 May 21 16:08 ORA313712_20120521160458_COM_1.zip
This zip file can then be sent out directly as diagnostic material.
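The packaging steps above can be sketched as a function that emits the same IPS commands in order. The function name `ips_package_commands` is hypothetical. Note one assumption in particular: the package number is assigned by ADRCI when the package is created (1 in the session above), so it has to be read from the output of the first command before the later ones are issued.

```python
def ips_package_commands(first_incident, package_id, extra_incidents, out_dir):
    """Return the adrci IPS commands for one logical-to-physical packaging run."""
    commands = ["ips create package incident {}".format(first_incident)]
    for incident in extra_incidents:
        commands.append("ips add incident {} package {}".format(incident, package_id))
    commands.append("ips generate package {} in {}".format(package_id, out_dir))
    return commands


# Values taken from the session above.
for command in ips_package_commands(74073, 1, [153368], "/home/oracle"):
    print(command)
```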
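5. The purge command
Diagnostic and trace data only ever accumulates. Once too much diagnostic and log data has piled up, it can start to affect the system negatively. The convenient approach is to check periodically and delete the diagnostic data that is no longer needed.
Without adrci, we would have to clean up each directory separately. With adrci's purge command and control settings, the cleanup becomes straightforward.
The purge command has three modes of operation. After switching to a specific ADR home, you can delete by specific incident, by age, and by type of diagnostic data. The syntax is as follows: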
purge [[-i {id | start_id end_id}] | [-age mins [-type {ALERT|INCIDENT|TRACE|CDUMP|HM}]]]
[-i {id | start_id end_id}]
Purges either a specific incident ID (id) or a range of incident IDs (start_id and end_id)
[-age mins]
Purges only data older than mins minutes.
[-type {ALERT|INCIDENT|TRACE|CDUMP|HM}]
Specifies the type of diagnostic data to purge (alert log messages, incident data, trace files (including dumps), core files, or Health Monitor run data and reports).
For example, to delete everything older than 20 minutes:
adrci> purge -age 20
adrci> show tracefile
diag/rdbms/ora11g/ora11g/trace/tautltest.txt
diag/rdbms/ora11g/ora11g/trace/alert_ora11g.log
diag/rdbms/ora11g/ora11g/trace/squtltest.txt
diag/rdbms/ora11g/ora11g/trace/tasqdirset.txt
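Because -age is expressed in minutes, retention periods quoted in days have to be converted first. A minimal sketch, assuming a hypothetical helper `purge_command` that only builds the command string according to the syntax above:

```python
VALID_TYPES = {"ALERT", "INCIDENT", "TRACE", "CDUMP", "HM"}


def purge_command(days, data_type=None):
    """Build a purge command that removes data older than `days` days."""
    minutes = int(days * 24 * 60)  # -age takes minutes
    command = "purge -age {}".format(minutes)
    if data_type is not None:
        data_type = data_type.upper()
        if data_type not in VALID_TYPES:
            raise ValueError("unknown purge type: " + data_type)
        command += " -type " + data_type
    return command


print(purge_command(30, "trace"))
```

The resulting string is meant to be typed at the adrci prompt after the homepath has been set.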
Besides manual deletion, Oracle ADR also provides a purge policy, which is configured through control.
adrci> show homepath
ADR Homes:
diag/asm/user_oracle/host_1437849207_76
diag/rdbms/ora11g/ora11g
diag/asmtool/user_root/host_1437849207_76
diag/asmtool/user_oracle/host_1437849207_76
diag/tnslsnr/bspdev/listener
diag/tnslsnr/bspdev/listener_ora11g
adrci> set homepath diag/rdbms/ora11g/ora11g
adrci> show home
ADR Homes:
diag/rdbms/ora11g/ora11g
adrci> show control
ADR Home = /u01/app/diag/rdbms/ora11g/ora11g:
*************************************************************************
ADRID SHORTP_POLICY LONGP_POLICY LAST_MOD_TIME LAST_AUTOPRG_TIME LAST_MANUPRG_TIME ADRDIR_VERSION ADRSCHM_VERSION ADRSCHMV_SUMMARY ADRALERT_VERSION CREATE_TIME
-------------------- -------------------- -------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- -------------------- -------------------- -------------------- -------------------- ----------------------------------------
799124850 720 8760 2011-04-01 10:13:25.436450 -04:00 2012-05-15 20:08:26.781034 -04:00 1 2 76 1 2011-04-01 10:13:25.436450 -04:00
1 rows fetched
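In the output above, SHORTP_POLICY is 720 and LONGP_POLICY is 8760. These are retention periods in hours, for short-lived and long-lived diagnostic data respectively, and they can be changed with the set control command. The sketch below converts between days and hours and builds such a command. The helper names are hypothetical, and it is worth confirming the attribute semantics against the ADRCI documentation for your release before changing a policy.

```python
POLICIES = ("SHORTP_POLICY", "LONGP_POLICY")


def hours_to_days(hours):
    """Convert a policy value in hours to days."""
    return hours / 24


def set_control_command(attribute, days):
    """Build a set control command for a retention of `days` whole days."""
    if attribute not in POLICIES:
        raise ValueError("unknown policy attribute: " + attribute)
    return "set control ({} = {})".format(attribute, days * 24)


print(hours_to_days(720), hours_to_days(8760))
print(set_control_command("SHORTP_POLICY", 15))
```

With the values shown by show control, the short policy works out to 30 days and the long policy to 365 days.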
6. Conclusion
In Oracle 11g, a number of small tools take tedious chores off our hands. ADRCI makes the diagnostic process considerably more convenient.