对于Oracle数据块物理损坏的情形,在我们有备份的情况下可以直接使用备份来恢复。对于通过备份恢复,Oracel为我们提供了很多种方式,冷备,基于用户管理方式,RMAN方式等等。对于这几种方式我们需要实现基于数据库以及文件级别的恢复。RMAN同时也提供了基于块介质方式的恢复。也就是说我们根本不需要还原数据文件,而是直接从备份文件基于块来提取以实现联机恢复。可参考基于RMAN实现坏块介质恢复(blockrecover) 。这是比较理想的情形。如果没有任何备份怎么办?我们可以使用Oracle自带的DBMS_REPAIR包来实现修复。注意,不要被文章题目有所误导。这里的修复是有损修复也就是说将受损的数据块标记为坏块,不对其进行访问而已。就好比我们磁盘有坏道,找个磁盘修复工具将坏道标出来不使用,同理。那受损的数据岂不是无力回天啦,呜呜......要记得随时备份阿。。
1、DBMS_REPAIR包所含的过程
Procedure_Name Description
----------------- ------------------------------------
ADMIN_TABLES Provides administrative functions (create, drop, purge) for repair or orphan key tables.
Note: These tables are always created in the SYS schema.
CHECK_OBJECT Detects and reports corruptions in a table or index
DUMP_ORPHAN_KEYS Reports on index entries that point to rows in corrupt data blocks
FIX_CORRUPT_BLOCKS Marks blocks as software corrupt that have been previously identified as corrupt by the CHECK_OBJECT procedure
REBUILD_FREELISTS Rebuilds the free lists of the object
SEGMENT_FIX_STATUS Provides the capability to fix the corrupted state of a bitmap entry when segment space management is AUTO
SKIP_CORRUPT_BLOCKS When used, ignores blocks marked corrupt during table and index scans.
If not used, you get error ORA-01578 when encountering blocks marked corrupt.
2、DBMS_REPAIR的一些局限性
Tables with LOB data types, nested tables, and varrays are supported, but the out-of-line columns are ignored.
Clusters are supported in the SKIP_CORRUPT_BLOCKS and REBUILD_FREELISTS procedures, but not in the CHECK_OBJECT procedure.
Index-organized tables and LOB indexes are not supported.
The DUMP_ORPHAN_KEYS procedure does not operate on bitmap indexes or function-based indexes.
The DUMP_ORPHAN_KEYS procedure processes keys that are no more than 3,950 bytes long.
3、创建演示环境
--当前环境 sys@USBO> select * from v$version where rownum<2; BANNER -------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production --创建表空间 sys@USBO> create tablespace tbs_tmp datafile '/u02/database/usbo/oradata/tbs_tmp.dbf' size 10m autoextend on; --创建表对象tb_obj及其索引 sys@USBO> create table tb_obj tablespace tbs_tmp as select * from dba_objects; sys@USBO> create index i_tb_obj on tb_obj(object_id); --表段上的相关信息 sys@USBO> select segment_name , header_file , header_block,blocks 2 from dba_segments where segment_name ='TB_OBJ'; SEGMENT_NAME HEADER_FILE HEADER_BLOCK BLOCKS ------------------------------ ----------- ------------ ---------- TB_OBJ 6 130 1152 --使用linux自带的dd命令来损坏数据块 [oracle@linux1 ~]$ dd of=/u02/database/usbo/oradata/tbs_tmp.dbf bs=8192 conv=notrunc seek=131 <<EOF > Corrupt me! > EOF 0+1 records in 0+1 records out 12 bytes (12 B) copied, 0.000209854 seconds, 57.2 kB/s [oracle@linux1 ~]$ dd of=/u02/database/usbo/oradata/tbs_tmp.dbf bs=8192 conv=notrunc seek=141 <<EOF > Corrupt me! > EOF 0+1 records in 0+1 records out 12 bytes (12 B) copied, 0.00019939 seconds, 60.2 kB/s [oracle@linux1 ~]$ dd of=/u02/database/usbo/oradata/tbs_tmp.dbf bs=8192 conv=notrunc seek=151 <<EOF > Corrupt me! > EOF 0+1 records in 0+1 records out 12 bytes (12 B) copied, 2.1672e-05 seconds, 554 kB/s sys@USBO> alter system flush buffer_cache; --下面的查询收到了错误提示 sys@USBO> select count(*) from tb_obj; select count(*) from tb_obj * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 6, block # 131) ORA-01110: data file 6: '/u02/database/usbo/oradata/tbs_tmp.dbf'
4、使用DBMS_REPAIR修复坏块
Step a 创建相应的表对象 --使用DBMS_REPAIR.ADMIN_TABLES过程创建一个表对象,用于记录需要被修复的表 sys@USBO> BEGIN 2 DBMS_REPAIR.ADMIN_TABLES ( 3 TABLE_NAME => 'REPAIR_TABLE', 4 TABLE_TYPE => dbms_repair.repair_table, 5 ACTION => dbms_repair.create_action, 6 TABLESPACE => 'USERS'); 7 END; 8 / PL/SQL procedure successfully completed. --使用DBMS_REPAIR.ADMIN_TABLES过程创建一个表对象,用于记录在表块损坏后那些孤立索引,也就是指向坏块的那些索引 sys@USBO> BEGIN 2 DBMS_REPAIR.ADMIN_TABLES 3 ( 4 TABLE_NAME => 'ORPHAN_KEY_TABLE', 5 TABLE_TYPE => DBMS_REPAIR.ORPHAN_TABLE, 6 ACTION => DBMS_REPAIR.CREATE_ACTION, 7 TABLESPACE => 'USERS' 8 ); 9 END; 10 / PL/SQL procedure successfully completed. Step b 校验受损的对象 --使用DBMS_REPAIR.CHECK_OBJECT来检测对象上受损的情形,并返回受损块数 sys@USBO> SET SERVEROUTPUT ON sys@USBO> DECLARE num_corrupt INT; 2 BEGIN 3 num_corrupt := 0; 4 DBMS_REPAIR.CHECK_OBJECT ( 5 SCHEMA_NAME => 'SYS', 6 OBJECT_NAME => 'TB_OBJ', 7 REPAIR_TABLE_NAME => 'REPAIR_TABLE', 8 CORRUPT_COUNT => num_corrupt); 9 DBMS_OUTPUT.PUT_LINE('number corrupt: ' || TO_CHAR (num_corrupt)); 10 END; 11 / number corrupt: 3 PL/SQL procedure successfully completed. --下面我们可以从repair_table查询到受损的块 --从下面的查询中可以看出列marked_corrupt全部为true,表明我们在CHECK_OBJECT已经标注了坏块 --有一点不太清楚的是为什么块131在查询中被列出3次? sys@USBO> COLUMN object_name FORMAT a10 sys@USBO> COLUMN repair_description FORMAT a28 sys@USBO> SET LINES 10000 sys@USBO> SELECT object_name, block_id, corrupt_type,marked_corrupt,repair_description FROM repair_table; OBJECT_NAM BLOCK_ID CORRUPT_TYPE MARKED_COR REPAIR_DESCRIPTION ---------- ---------- ------------ ---------- ---------------------------- TB_OBJ 131 6148 TRUE mark block software corrupt TB_OBJ 131 6148 TRUE mark block software corrupt TB_OBJ 131 6148 TRUE mark block software corrupt TB_OBJ 141 6148 TRUE mark block software corrupt TB_OBJ 151 6148 TRUE mark block software corrupt Step c 标记坏块 --过程FIX_CORRUPT_BLOCKS用于标记坏块,在这个演示中,我们在CHECK_OBJECT已经被标注了,如没有执行下面的过程 --由于上一步已经标注,所以下面的输出为0 sys@USBO> SET SERVEROUTPUT ON sys@USBO> DECLARE num_fix INT; 2 BEGIN 3 num_fix := 0; 4 DBMS_REPAIR.FIX_CORRUPT_BLOCKS ( 5 SCHEMA_NAME => 'SYS', 6 OBJECT_NAME=> 'TB_OBJ', 7 OBJECT_TYPE => dbms_repair.table_object, 8 REPAIR_TABLE_NAME => 'REPAIR_TABLE', 9 FIX_COUNT=> num_fix); 10 DBMS_OUTPUT.PUT_LINE('num fix: ' || TO_CHAR(num_fix)); 11 END; 12 / num fix: 0 -- Author : Robinson Cheng -- Blog : http://blog.csdn.net/robinson_0612 -- DB forum : http://bbs.dbsupport.cn PL/SQL procedure successfully completed. Step d DUMP孤立的索引键值 --使用DUMP_ORPHAN_KEYS过程将那些指向坏块的索引键值填充到ORPHAN_KEY_TABLE sys@USBO> SET SERVEROUTPUT ON sys@USBO> DECLARE num_orphans INT; 2 BEGIN 3 num_orphans := 0; 4 DBMS_REPAIR.DUMP_ORPHAN_KEYS ( 5 SCHEMA_NAME => 'SYS', 6 OBJECT_NAME => 'I_TB_OBJ', 7 OBJECT_TYPE => dbms_repair.index_object, 8 REPAIR_TABLE_NAME => 'REPAIR_TABLE', 9 ORPHAN_TABLE_NAME=> 'ORPHAN_KEY_TABLE', 10 KEY_COUNT => num_orphans); 11 DBMS_OUTPUT.PUT_LINE('orphan key count: ' || TO_CHAR(num_orphans)); 12 END; 13 / orphan key count: 242 PL/SQL procedure successfully completed. --下面的查询可以看到正好等于上面返回的数量也就是242条记录 sys@USBO> select count(*) from orphan_key_table; COUNT(*) ---------- 242 --验证对象是否可以查询,下面的结果显示依旧无法查询 sys@USBO> select count(*) from tb_obj; select count(*) from tb_obj * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 6, block # 131) ORA-01110: data file 6: '/u02/database/usbo/oradata/tbs_tmp.dbf' Step e 跳过坏块 --使用SKIP_CORRUPT_BLOCKS来告知Oracle哪些坏块需要被跳过 sys@USBO> BEGIN 2 DBMS_REPAIR.SKIP_CORRUPT_BLOCKS ( 3 SCHEMA_NAME => 'SYS', 4 OBJECT_NAME => 'TB_OBJ', 5 OBJECT_TYPE => dbms_repair.table_object, 6 FLAGS => dbms_repair.skip_flag); 7 END; 8 / PL/SQL procedure successfully completed. --由于索引键上存在孤立索引,因此我们重建索引 sys@USBO> alter index i_tb_obj rebuild; Index altered. --验证结果 sys@USBO> select count(*) from tb_obj; COUNT(*) ---------- 72211
5、后记
a、再次提醒,备份重于一切,因此无论何时应保留可用的备份。
b、DBMS_REPAIR包并不是真正意思上的坏块修复,而是标记坏块,损坏的这部分数据被丢失。
c、DBMS_REPAIR包的几个步骤,先创建相应的表用于存储修复表及孤立索引,其次CHECK_OBJECT,FIX_CORRUPT_BLOCKS,DUMP_ORPHAN_KEYS,SKIP_CORRUPT_BLOCKS。
d、完整DBMS_REPAIR上面提到的几个步骤后,建议重建索引。
e、注,如果受损表对象被其他对象参照,建议先disable这些约束,那些在子表上孤立的记录可根据情形决定后再enable约束。
专业数据库论坛:http://bbs.dbsupport.cn/
相关参考
中小型数据库 RMAN CATALOG 备份恢复方案(一)
基于RMAN的异机数据库克隆(rman duplicate)
基于RMAN从活动数据库异机克隆(rman duplicate from active DB)
RMAN duplicate from active 时遭遇 ORA-17627 ORA-12154
Oracle 基于备份控制文件的恢复(unsing backup controlfile)