1. Health Monitor简介
Health Monitor是11g里新增加的特性,用于数据库的各层和各个组建的诊断检查。例如可以检查:文件损坏、物理逻辑块损坏、redo和undo故障、数据字典损坏等。HM可以根据检查的结果产生一个报表,并提供解决问题的建议。
1.1 运行方式:
1). Reactive
Fault diagnosability infrastructure能自动响应严重的错误(critical error)。
2). Manually
可以通过DBMS_HM系统包或 Enterprise Manager来手工运行HM。
1.2 运行模式
1). DB-ONLINE
2). DB-OFFLINE
所有HM都能在online模式下运行,只有Redo Integrity Check 和 DB Structure Integrity Check能在offline模式下运行。
2. HM的类型
也可以通过视图来查询:
SYS@PROD1>SELECT name ,DESCRIPTION FROM v$hm_check WHERE internal_check='N'; --internal_check='N' 可以手动检查的项目 NAME DESCRIPTION ---------------------------------------------------------------- -------------------------------------------------- DB Structure Integrity Check Checks integrity of all database files CF Block Integrity Check Checks integrity of a control file block Data Block Integrity Check Checks integrity of a data file block Redo Integrity Check Checks integrity of redo log content Transaction Integrity Check Checks a transaction for corruptions Undo Segment Integrity Check Checks integrity of an undo segment Dictionary Integrity Check Checks dictionary integrity ASM Allocation Check Diagnose allocation failure 8 rows selected.
3. 手工运行HM
3.1 通过DBMS_HM系统包
例如检查字典完整性检查:
BEGIN DBMS_HM.RUN_CHECK(CHECK_NAME=>'Dictionary Integrity Check',RUN_NAME=>'my_run5'); END; /
该程序允许有输入参数,可以通过如下方式查找:
set linesize 200 set pagesize 200 col check_name for a50 col parameter_name for a20 col type for a20 col default_value for a20 col description for a50 SELECT c.name check_name, p.name parameter_name, p.type,p.default_value, p.description FROM v$hm_check_param p, v$hm_check c WHERE p.check_id = c.id and c.internal_check = 'N' ORDER BY c.name; CHECK_NAME PARAMETER_NAME TYPE DEFAULT_VALUE DESCRIPTION -------------------------------------------------- -------------------- -------------------- -------------------- -------------------------------------------------- ASM Allocation Check ASM_DISK_GRP_NAME DBKH_PARAM_TEXT ASM group name CF Block Integrity Check CF_BL_NUM DBKH_PARAM_UB4 Control file block number Data Block Integrity Check BLC_DF_NUM DBKH_PARAM_UB4 File number Data Block Integrity Check BLC_BL_NUM DBKH_PARAM_UB4 Block number Dictionary Integrity Check CHECK_MASK DBKH_PARAM_TEXT ALL Check mask Dictionary Integrity Check TABLE_NAME DBKH_PARAM_TEXT ALL_CORE_TABLES Table name Redo Integrity Check SCN_TEXT DBKH_PARAM_TEXT 0 SCN of the latest good redo (if known) Transaction Integrity Check TXN_ID DBKH_PARAM_TEXT Transaction ID Undo Segment Integrity Check USN_NUMBER DBKH_PARAM_TEXT Undo segment number 9 rows selected.
带参数实例:
BEGIN DBMS_HM.RUN_CHECK (check_name => 'Transaction Integrity Check',run_name => 'my_run1', input_params => 'TXN_ID=7.33.2'); DBMS_HM.RUN_CHECK(CHECK_NAME=>'Data Block Integrity Check',RUN_NAME=>'my_run2',TIMEOUT=>null,INPUT_PARAMS=>'BLC_DF_NUM = 5; BLC_BL_NUM = 203'); END;
3.2 通过OEM允许
Advisor Central -》Checkers -》 Run
4. 查看HM报表
4.1 通过DBMS_HM系统包
实例:
SET LONG 100000 SET LONGCHUNKSIZE 1000 SET PAGESIZE 1000 SET LINESIZE 512 SELECT DBMS_HM.GET_RUN_REPORT('my_run3') FROM DUAL; DBMS_HM.GET_RUN_REPORT('MY_RUN3') ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Basic Run Information Run Name : my_run3 Run Id : 381 Check Name : Data Block Integrity Check Mode : MANUAL Status : COMPLETED Start Time : 2019-08-20 15:26:50.517521 +08:00 End Time : 2019-08-20 15:26:50.964425 +08:00 Error Encountered : 0 Source Incident Id : 0 Number of Incidents Created : 0 Input Paramters for the Run BLC_DF_NUM=1 BLC_BL_NUM=23456 Run Findings And Recommendations
4.2 通过ADRCI查看
adrci> show hm_run
adrci> show hm_run ADR Home = /u01/app/oracle/diag/clients/user_oracle/host_3987733264_80: ************************************************************************* 0 rows fetched <ADR_RELATION name=""> <ADR_HOME name="/u01/app/oracle/diag/clients/user_oracle/host_3987733264_80"> ADR Home = /u01/app/oracle/diag/clients/user_oracle/host_772309917_80: ************************************************************************* 0 rows fetched </ADR_HOME> <ADR_HOME name="/u01/app/oracle/diag/clients/user_oracle/host_772309917_80"> ADR Home = /u01/app/oracle/diag/rdbms/prod1/PROD1: ************************************************************************* ********************************************************** HM RUN RECORD 4 ********************************************************** RUN_ID 381 RUN_NAME my_run3 CHECK_NAME Data Block Integrity Check NAME_ID 3 MODE 0 START_TIME 2019-08-20 15:26:50.517521 +08:00 RESUME_TIME <NULL> END_TIME 2019-08-20 15:26:50.964425 +08:00 MODIFIED_TIME 2019-08-20 15:30:51.962982 +08:00 TIMEOUT 0 FLAGS 0 STATUS 5 SRC_INCIDENT_ID 0 NUM_INCIDENTS 0 ERR_NUMBER 0 REPORT_FILE /u01/app/oracle/diag/rdbms/prod1/PROD1/hm/HMREPORT_my_run3.hm
4.3 通过OEM
Advisor Central -》Checkers-》Runs -》View Report
5. HM视图
除了可以创建一个checker报表外,还可以在ADR里直接查看已经产生的报表。可查看的视图有:
V$HM_RUN
,V$HM_FINDING
, and V$HM_RECOMMENDATION
.
例如:
SYS@PROD1>SELECT run_id, name, check_name, run_mode, src_incident FROM v$hm_run; RUN_ID NAME CHECK_NAME RUN_MODE SRC_INCIDENT ---------- -------------------------------- -------------------------------------------------- -------- ------------ 301 my_run1 Data Block Integrity Check MANUAL 0 361 my_run2 Data Block Integrity Check MANUAL 0 381 my_run3 Data Block Integrity Check MANUAL 0 401 my_run5 Dictionary Integrity Check MANUAL 0 241 my_run Dictionary Integrity Check MANUAL 0
SYS@PROD1> SELECT type, description FROM v$hm_finding WHERE run_id =381; TYPE DESCRIPTION ------------- ----------------------------------------- FAILURE Block 64349 in datafile 1: '/u01/app/oracle/oradata/PROD1/system01.dbf ' is media corrupt FAILURE Block 64351 in datafile 1: '/u01/app/oracle/oradata/PROD1/system01.dbf ' is media corrupt