• [原]Oracle Control File 意外情况研究


    差不多4年前吧,那时候才刚接触Oracle,只记得一位Oracle老师说过“在你的Oracle DBA生涯中不应该丢失Control File”,可见Control File是如何的重要。

    近日在泡ITPUB的时候发现一贴:能否在不停数据库情况下换掉损坏的控制文件 ,经过多位高现身说法,自己也做了点实验,与大家分享一下:

    首先,看一下我的机器情况:

    OS 情况:

    [root@mailserver ~]# uname -a ; cat /etc/redhat-release ; 
    Linux mailserver 2.6.9-67.ELsmp #1 SMP Fri Nov 16 12:49:06 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
    CentOS release 4.6 (Final)

    Oracle 信息:

    sys@mydb(10.168.0.202) SQL> select INSTANCE_NAME,version,database_status from v$instance;
    INSTANCE_NAME    VERSION           DATABASE_STATUS
    ---------------- ----------------- -----------------
    mydb             10.2.0.1.0        ACTIVE
    
    sys@mydb(10.168.0.202) SQL> select name,open_mode from v$database;
    NAME      OPEN_MODE
    --------- ----------
    MYDB      READ WRITE
    
    sys@mydb(10.168.0.202) SQL> select name,status from v$controlfile;       
    NAME                                               STATUS
    -------------------------------------------------- -------
    /u02/oradata/mydb/controlfile/control01.ctl
    /u02/oradata/mydb/controlfile/control02.ctl
    /u02/oradata/mydb/controlfile/control03.ctl
    /u02/oradata/mydb/controlfile/control04.ctl

    从查询中可以看到我的Oracle有 4 个control file。

    在OS中查一下control file的情况:

    [oracle@mailserver controlfile]$ pwd    
    /u02/oradata/mydb/controlfile
    [oracle@mailserver controlfile]$ ls -lth
    总用量 31M
    -rw-r-----  1 oracle oinstall 7.8M 12月  2 14:57 control01.ctl
    -rw-r-----  1 oracle oinstall 7.8M 12月  2 14:57 control02.ctl
    -rw-r-----  1 oracle oinstall 7.8M 12月  2 14:57 control03.ctl
    -rw-r-----  1 oracle oinstall 7.8M 12月  2 14:57 control04.ctl
    [oracle@mailserver controlfile]$ watch -n1 md5sum * 
    Every 1.0s: md5sum control01.ctl control02.ctl control03.ctl control04.ctl                                  Wed Dec  2 15:11:32 2009
    
    1cbca423c5c18fea3cb1fc13401c49d6  control01.ctl
    b6a02759b5896da4305d0f2d324073fb  control02.ctl
    b6a02759b5896da4305d0f2d324073fb  control03.ctl
    1cbca423c5c18fea3cb1fc13401c49d6  control04.ctl

    观察 control file 更新情况,这是个空闲库,我把listener也关掉了。我一直以为control file应该都是一样的内容,但是从md5sum的结果来看不是,第1和第4始终保持一样,第二和第三始终保持一样,很有趣的发现

    开始做实验: 实验一,模拟Control File 丢失:

    [oracle@mailserver controlfile]$ ls
    control01.ctl  control02.ctl  control03.ctl  control04.ctl
    [oracle@mailserver controlfile]$ rm -rf control04.ctl 
    [oracle@mailserver controlfile]$ ls
    control01.ctl  control02.ctl  control03.ctl
    [oracle@mailserver controlfile]$ watch -n2 md5sum * 
    Every 2.0s: md5sum control01.ctl control02.ctl control03.ctl                                                Wed Dec  2 15:21:12 2009
    
    2ccb97d9a9ab834d75a3a503960a0cbd  control01.ctl
    a1fc984f27925f0b1ab2b17928289963  control02.ctl
    a1fc984f27925f0b1ab2b17928289963  control03.ctl
    

    可以观察到 control file 还是在更新的,在session#0 中做一些操作:

    sys@mydb(10.168.0.202) SQL> alter system checkpoint ;
    
    System altered.
    
    sys@mydb(10.168.0.202) SQL> alter system checkpoint ;
    
    System altered.
    
    sys@mydb(10.168.0.202) SQL> alter system switch logfile ;
    
    System altered.
    

    在session#1 中做一下操作:

    zhyw@mydb(10.168.0.202) SQL> update ip_state set state=2 where state=1;   
    
    130 rows updated.
    
    zhyw@mydb(10.168.0.202) SQL> commit;
    
    Commit complete.
    
    zhyw@mydb(10.168.0.202) SQL> select count(*) from ip_state;                      
    
      COUNT(*)
    ----------
           189
    
    zhyw@mydb(10.168.0.202) SQL> create table ip_state_bak as select * from ip_state;
    create table ip_state_bak as select * from ip_state
                                               *
    ERROR at line 1:
    ORA-00210: cannot open the specified control file
    ORA-00202: control file: '/u02/oradata/mydb/controlfile/control04.ctl'
    ORA-27041: unable to open file
    Linux-x86_64 Error: 2: No such file or directory
    Additional information: 3
    

    终于看到丢失一个控制文件的影响了 ^_^

    尝试复制一个control file 出来,Oracle还在跑啊

    [oracle@mailserver controlfile]$ pwd ; ls;
    /u02/oradata/mydb/controlfile
    control01.ctl  control02.ctl  control03.ctl
    [oracle@mailserver controlfile]$ cp control01.ctl  control04.ctl 
    [oracle@mailserver controlfile]$ watch -n2 md5sum * 
    Every 2.0s: md5sum control01.ctl control02.ctl control03.ctl control04.ctl                                  Wed Dec  2 15:30:53 2009
    
    2998c5d3ffe493312284efc5f87c95e7  control01.ctl
    f77690d016d0f1cbd5256cf0bdda0d93  control02.ctl
    f77690d016d0f1cbd5256cf0bdda0d93  control03.ctl
    49a9c4f48732cded164f474c8464b999  control04.ctl
    
    [oracle@mailserver controlfile]$ watch -n2 md5sum * 
    Every 2.0s: md5sum control01.ctl control02.ctl control03.ctl control04.ctl                                  Wed Dec  2 15:31:13 2009
    
    ae6ae06f1395ac9c6783d783be566fb4  control01.ctl
    7e56cfb3412a0dc0ed15c55c7e128283  control02.ctl
    7e56cfb3412a0dc0ed15c55c7e128283  control03.ctl
    49a9c4f48732cded164f474c8464b999  control04.ctl
    

    可见“自作聪明”地Copy一个Control File 出来是“冻结”不动的。过了一阵,Oracle 挂掉了,alert log中出现:

    Wed Dec  2 15:32:35 2009
    ********************* ATTENTION: ******************** 
     The controlfile header block returned by the OS
     has a sequence number that is too old. 
     The controlfile might be corrupted.
     PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE 
     without following the steps below.
     RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE 
     TO THE DATABASE, if the controlfile is truly corrupted.
     In order to re-start the instance safely, 
     please do the following:
     (1) Save all copies of the controlfile for later 
         analysis and contact your OS vendor and Oracle support.
     (2) Mount the instance and issue: 
         ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
     (3) Unmount the instance. 
     (4) Use the script in the trace file to
         RE-CREATE THE CONTROLFILE and open the database. 
    *****************************************************
    Instance terminated by USER, pid = 2335
    

    看来丢失control file不可怕,最怕control file不一致啊。

    此时Oracle挂掉了,control file 当然也不会更新了,做以下操作:

    [oracle@mailserver controlfile]$ pwd ; ls ;
    /u02/oradata/mydb/controlfile
    control01.ctl  control02.ctl  control03.ctl  control04.ctl
    [oracle@mailserver controlfile]$ rm -rf control04.ctl 
    [oracle@mailserver controlfile]$ cp control01.ctl control04.ctl  
    [oracle@mailserver controlfile]$ md5sum *
    f513df299a0789b6e527a2ef6c571d25  control01.ctl
    28fd3396e22604ba55b6cbe9d2e8fd40  control02.ctl
    28fd3396e22604ba55b6cbe9d2e8fd40  control03.ctl
    f513df299a0789b6e527a2ef6c571d25  control04.ctl
    

    然后尝试启动Oracle

    还是可以起来的 :)

    sys@mydb(10.168.0.202) SQL> select name,open_mode from v$database;
    NAME      OPEN_MODE
    --------- ----------
    MYDB      READ WRITE
    
    sys@mydb(10.168.0.202) SQL> select name,status from v$controlfile;       
    NAME                                               STATUS
    -------------------------------------------------- -------
    /u02/oradata/mydb/controlfile/control01.ctl
    /u02/oradata/mydb/controlfile/control02.ctl
    /u02/oradata/mydb/controlfile/control03.ctl
    /u02/oradata/mydb/controlfile/control04.ctl
    

    再做一个实验,叫实验二,这次是破坏掉其中一个control file,不是“弄丢”噢 :-)

    [oracle@mailserver controlfile]$ date +%x%X ; date > control04.ctl 
    2009年12月02日15时57分32秒
    ###### 在 2009年12月02日15时57分32秒 “搞坏”一个Control File
    ###### alert file 中出现:
    Wed Dec  2 15:57:34 2009
    Hex dump of (file 0, block 1) in trace file /u01/app/admin/mydb/bdump/mydb_arc0_2935.trc
    Corrupt block relative dba: 0x00000001 (file 0, block 1)
    Completely zero block found during control file header read
    Wed Dec  2 15:57:34 2009
    Errors in file /u01/app/admin/mydb/bdump/mydb_arc0_2935.trc:
    ORA-00202: control file: '/u02/oradata/mydb/controlfile/control04.ctl'
    Wed Dec  2 15:57:34 2009
    Errors in file /u01/app/admin/mydb/bdump/mydb_arc0_2935.trc:
    ORA-00227: corrupt block detected in control file: (block 1, # blocks 1)
    ORA-00202: control file: '/u02/oradata/mydb/controlfile/control04.ctl'
    

    这证明,我“的确搞坏”了一个控制文件

    在一个普通用户中执行一些语句:

    zhyw@mydb(10.168.0.202) SQL> update ip_state set state=2 where state=1; 
    
    130 rows updated.
    
    zhyw@mydb(10.168.0.202) SQL> commit;
    
    Commit complete.
    

    可以更新,可以提交.

    建表

    zhyw@mydb(10.168.0.202) SQL> create table a01 as select * from ip_state;
    create table a01 as select * from ip_state
                                      *
    ERROR at line 1:
    ORA-00227: corrupt block detected in control file: (block 1, # blocks 1)
    ORA-00202: control file: '/u02/oradata/mydb/controlfile/control04.ctl'
    

    看来Control File出问题后,建表是有问题的,不知其他DDL语句有没有影响。

    做一个查询:

    zhyw@mydb(10.168.0.202) SQL> select sysdate,ip,state from ip_state where rownum<10;    
    
    SYSDATE             IP                   STATE
    ------------------- --------------- ----------
    2009-12-02 16:00:52 10.168.10.132            2
    2009-12-02 16:00:52 10.168.10.133            2
    2009-12-02 16:00:52 10.168.10.134            2
    2009-12-02 16:00:52 10.168.10.135            0
    2009-12-02 16:00:52 10.168.10.136            0
    2009-12-02 16:00:52 10.168.10.137            0
    2009-12-02 16:00:52 10.168.10.138            0
    2009-12-02 16:00:52 10.168.10.139            2
    2009-12-02 16:00:52 10.168.10.140            0
    

    大家可以看到,Oracle在坏掉一个控制文件的情况下坚持近3分钟,从“2009年12月02日15时57分32秒” 到“2009-12-02 16:00:52”

    等不及了,强制checkpoint 一下,让Oracle挂掉

    sys@mydb(10.168.0.202) SQL> select sysdate from dual;
    SYSDATE
    -------------------
    2009-12-02 16:00:56
    
    sys@mydb(10.168.0.202) SQL> alter system checkpoint; 
    ERROR:
    ORA-03114: not connected to ORACLE
    
    
    alter system checkpoint
    *
    ERROR at line 1:
    ORA-03113: end-of-file on communication channel
    

    此时,alert file 出现:

    Hex dump of (file 0, block 1) in trace file /u01/app/admin/mydb/bdump/mydb_ckpt_2909.trc
    Corrupt block relative dba: 0x00000001 (file 0, block 1)
    Completely zero block found during control file header read
    Wed Dec  2 16:00:57 2009
    Errors in file /u01/app/admin/mydb/bdump/mydb_ckpt_2909.trc:
    ORA-00202: control file: '/u02/oradata/mydb/controlfile/control04.ctl'
    Wed Dec  2 16:00:57 2009
    Errors in file /u01/app/admin/mydb/bdump/mydb_ckpt_2909.trc:
    ORA-00227: corrupt block detected in control file: (block 1, # blocks 1)
    ORA-00202: control file: '/u02/oradata/mydb/controlfile/control04.ctl'
    CKPT: terminating instance due to error 227
    Instance terminated by CKPT, pid = 2909
    
    总结一下:
    1。Control File丢失(当然不是全部丢失啦),系统还是可以checkpoing 和 switch logfile 的。
    2。Control File不一致,Oracle马上就挂掉了。
    3。Control File破坏掉后,Oracle还是能够坚持一阵子的,select update都没有问题,checkpoint的时候就挂掉了。
    
  • 相关阅读:
    XidianOJ 1099 A simple problem
    XidianOJ 1097 焊板子的xry111
    XidianOJ 1172 Hiking
    XidianOJ 1176 ship
    bzoj1912(树的直径)
    黑暗城堡(最短路径树)
    poj1639(k限制最小生成树)
    0-1分数规划
    tyvj1391(Kruskal)
    [Usaco2007 Nov]Cow Relays(矩阵乘法+floyd)
  • 原文地址:https://www.cnblogs.com/killkill/p/1619311.html
Copyright © 2020-2023  润新知