rman-08120
We need RMAN to automatically purge archivelogs from the FRA once they are applied to the standby database.
1) From bug 6216036:
In 10g, the archive log deletion policy configured using CONFIGURE command
applies only to automatic deletion of archived logs by Flash Recovery Area.
The other deletion mechanisms like DELETE ARCHIVELOG or DELETE INPUT does not
respect this configuration.
2)APPLIED ON STANDBY
- enables flash recovery area to delete archivelogs that are applied on mandatory standby.
See Oracle Data Guard Concepts and Administration for details.
http://docs.oracle.com/cd/B19306_01/backup.102/b14194/rcmsynta015.htm#sthref289
This restriction is lifted in 11g.
SOLUTION
In a dataguard configuration, the archivelogs from the FRA will be automatically purged when the following conditions are met by the database.
1) prior to 11g, if not using mandatory archivelog destinations, the database (primary and standby) must be restarted with the following parameter:
2) configure the following parameter in RMAN (standby):
10g:
RMAN> CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY;
11g:
Starting from 11g, we have enhanced the configure archivelog deletion policy to include TO APPLIED ON [ALL] STANDBY [BACKED UP n TIMES TO DEVICE TYPE ]. This will ensure that the archivelogs is applied as well as backed up on primary before it is being purged.
3) the archivelog must have been applied to the standby. Run the following query to list all archivelogs applied to the standby:
from v$archived_log a, v$database d
where a.activation# = d.activation#
and a.applied='YES'
/
4) there is space pressure in the FRA
When an archivelog is automatically deleted from the FRA, you will see this in the database's alert.log:
Deleted Oracle managed file /opt/app/oracle/FRA/ORA102/archivelog/2011_05_30/o1_mf_1_151_6y71q675_.arc
Deleted Oracle managed file /opt/app/oracle/FRA/ORA102/archivelog/2011_05_31/o1_mf_1_151_6y87pzg4_.arc
The automatic cleanup of reclaimable files is not done unless the archive destination is set to 'location=use_db_recovery_file_dest'. In other words, Oracle behaves differently if your archive destination is set to the flash recovery area itself as opposed to the literal string, use_db_recovery_file_dest. This is true even in the case where you are setting the archive destination to the exact same diskgroup/filesystem that the flashrecovery area is using. Furthermore, archive files generated without use_db_recovery_file_dest do not participate in the flashrecovery area features at all – i.e. they do not contribute towards quota, you will not see them registered in v$recovery_file_dest, etc.
It is also important to note that backups created using the FORMAT option will also fail to participate in the flash recovery area algorithms. If you want the flash recovery area to automatically manage your backups, do NOT specify the FORMAT option and Oracle will implicitly know to use the db_recovery_file_dest
Similarly, RMAN controlfile autobackups should not specify the flash recovery area diskgroup or filesystem. Examples of controlfile autobackups that will NOT be automatically managed by the flash recovery area algorithms are:
a) RMAN> configure controlfile autobackup format for device type disk to '+FRA'; b) RMAN> configure controlfile autobackup format for device type disk to '+FRA/%F' c) RMAN> configure controlfile autobackup format for device type disk to '/flash_recovery_area/boston/%F' You can verify which files are participating in the flash recovery area algorithms by checking the v$archived_log.is_recovery_dest_file column. Yes means the file is participating and No means it isn't. The same column exists in v$backup_piece and it has the same meaning.
Although not precisely related to this problem, it is also important to note that reclaimable space will not be available for reuse until oracle reaches the v$recovery_file_dest.space_limit value (db_recovery_file_dest_size parameter). To ensure that this is possible, the db_recovery_file_dest_size should initially be set to 10% less than the total usable space in your FRA disk group or filesystem.
解释:
原因说明,当FRA 区使用达到80%,会自动触发删除归档日志的操作,如果没有自动触发自动删除操作,
那么检查下哪些可能原因可能导致FRA区归档日志没有标志为不回收的。
目前已知的是
1.10g dg 环境的隐含参数
2.rman 的归档日志 show ARCHIVELOG DELETION POLICY;
3.某些异常原因,导致 v$archived_log.is_recovery_dest_file column显示没有存放到FRA区
Cause:
When the disk space usage reaches 80%, we try to reclaim space pro-actively.
But, when there is no disk space to reclaim, we should not refresh the aging rules which could cause a performance problem.
We print a message when a file is purged/deleted from FRA when reclaim space in alert log.
Look for
'Deleted Oracle Managed File ...'
SELECT * FROM V$RECOVERY_FILE_DEST;
进阶检查脚本
select applied,deleted,decode(rectype,11,'YES','NO') reclaimable ,count(*),min(sequence#),max(sequence#) from v$archived_log left outer join sys.x$kccagf using(recid) where is_recovery_dest_file='YES' and name is not null group by applied,deleted,decode(rectype,11,'YES','NO') order by 5
column deleted format a7
column reclaimable format a11
set linesize 320
select applied,deleted,backup_count ,decode(rectype,11,'YES','NO') reclaimable,count(*) ,to_char(min(completion_time),'dd-mon hh24:mi') first_time ,to_char(max(completion_time),'dd-mon hh24:mi') last_time ,min(sequence#) first_seq,max(sequence#) last_seq from v$archived_log left outer join sys.x$kccagf using(recid) where is_recovery_dest_file='YES' group by applied,deleted,backup_count,decode(rectype,11,'YES','NO') order by min(sequence#)
########进阶2 ISSUE 1: (sample 1)
###查看 NBU 备份信息,发现当天6/9的备份情况正常,所有归档日志备份片都正常,没有报错
NBU CATALOG CHECK :
select ' DB NAME->'||DB_NAME||'- ROW TYPE->'||ROW_TYPE||'- START TIME->'||to_char(start_time, 'Dy DD-Mon-YYYY HH24:MI:SS') ||'- END TIME->'||to_char(end_time, 'Dy DD-Mon-YYYY HH24:MI:SS')||'- MBYTES PROCESSED->'||MBYTES_PROCESSED||'- OBJECT TYPE->'||OBJECT_TYPE||'- STATUS->'||STATUS||'- OUTPUT DEVICE->'||OUTPUT_DEVICE_TYPE||'- INPUT MB->'||INPUT_BYTES/1048576||'- OUT MB'||OUTPUT_BYTES/1048576
FROM rman11G.rc_rman_status
WHERE /* start_time > SYSDATE - 1
AND ( STATUS like '%FAILED%'
OR STATUS like '%ERROR%') */
start_time > SYSDATE - 21
AND db_name='dba'
and output_device_type='SBT_TAPE'
and object_type='ARCHIVELOG'
ORDER BY END_TIME;
4 DB NAME->dba- ROW TYPE->COMMAND- START TIME->Tue 04-Jun-2019 09:13:25- END TIME->Tue 04-Jun-2019 09:13:57- MBYTES PROCESSED->85- OBJECT TYPE->ARCHIVELOG- STATUS->COMPLETED- OUTPUT DEVICE->SBT_TAPE- INPUT MB->85.453125- OUT MB86
5 DB NAME->dba- ROW TYPE->COMMAND- START TIME->Tue 04-Jun-2019 13:13:22- END TIME->Tue 04-Jun-2019 13:13:46- MBYTES PROCESSED->96- OBJECT TYPE->ARCHIVELOG- STATUS->COMPLETED- OUTPUT DEVICE->SBT_TAPE- INPUT MB->95.8896484375- OUT MB96
6 DB NAME->dba- ROW TYPE->COMMAND- START TIME->Tue 04-Jun-2019 17:13:23- END TIME->Tue 04-Jun-2019 17:14:12- MBYTES PROCESSED->2088- OBJECT TYPE->ARCHIVELOG- STATUS->COMPLETED- OUTPUT DEVICE->SBT_TAPE- INPUT MB->2087- OUT MB2088
7 DB NAME->dba- ROW TYPE->COMMAND- START TIME->Tue 04-Jun-2019 21:13:33- END TIME->Tue 04-Jun-2019 21:15:12- MBYTES PROCESSED->1499- OBJECT TYPE->ARCHIVELOG- STATUS->COMPLETED- OUTPUT DEVICE->SBT_TAPE- INPUT MB->1499.38623046875- OUT MB1500
8 DB NAME->dba- ROW TYPE->COMMAND- START TIME->Wed 05-Jun-2019 01:13:36- END TIME->Wed 05-Jun-2019 01:15:23- MBYTES PROCESSED->119- OBJECT TYPE->ARCHIVELOG- STATUS->COMPLETED- OUTPUT DEVICE->SBT_TAPE- INPUT MB->119.484375- OUT MB120
###但是检查FRA归档日志是否可以删除,确发现6/9日归档日志 RECLAIM字段是NO,(也就是不可以删除) ,但是当天的备份确实成功的。status is NO,even the archive log is backuped,
column deleted format a7 column reclaimable format a11 set linesize 120
select applied,deleted,backup_count ,decode(rectype,11,'YES','NO') reclaimable,count(*) ,to_char(min(completion_time),'dd-mon hh24:mi') first_time ,to_char(max(completion_time),'dd-mon hh24:mi') last_time ,min(sequence#) first_seq,max(sequence#) last_seq from v$archived_log left outer join sys.x$kccagf using(recid) where is_recovery_dest_file='YES' group by applied,deleted,backup_count,decode(rectype,11,'YES','NO') order by min(sequence#)
;
SQL> /
APPLIED DELETE BACKUP_COUNT RECLAI COUNT(*) FIRST_TIME LAST_TIMEFIRST_SEQ LAST_SEQ
------------------ ------ ------------ ------ ---------- ------------------------------------------ ------------------------------------------ ---------- ----------
NO YES 1 NO 1021 20-may 00:13 11-jun 17:00 10146 11520
NO NO 1 NO 354 04-jun 21:13 11-jun 00:13 11126 11483 《- 6/4 ~ 6/11 的归档日志都是不可以删除的
NO NO 1 YES 819 11-jun 17:02 24-jun 20:32 11521 12339
NO NO 0 NO 14 24-jun 21:02 24-jun 22:10 12340 12353
###rman config: rman 配置提示主要备份到磁带,就可以删除。
RMAN> show all;
RMAN configuration parameters for database with db_unique_name dba are:
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 90 DAYS;
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK BACKUP TYPE TO COMPRESSED BACKUPSET PARALLELISM 4;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 1 TIMES TO 'SBT_TAPE';
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/db/dba/app/product/database/11g/dbs/snapcf_dba.f'; # default
### 手工执行删除,报错RMAN-08138
RMAN> delete archivelog until time 'sysdate -19';
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1095 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=1254 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=1290 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=1330 device type=DISK
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_04/o1_mf_1_11126_ghdvmxlx_.arc thread=1 sequence=11126
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_04/o1_mf_1_11127_ghdxd7v1_.arc thread=1 sequence=11127
RMAN-08138: WARNING: archived log not deleted - must create more backups
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11186_ghh786vx_.arc thread=1 sequence=11186
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11187_ghh90ggy_.arc thread=1 sequence=11187
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11188_ghhbrp4l_.arc thread=1 sequence=11188
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11189_ghhdk0oc_.arc thread=1 sequence=11189
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11190_ghhg98cy_.arc thread=1 sequence=11190
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11191_ghhj0kvq_.arc thread=1 sequence=11191
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11192_ghhkrvds_.arc thread=1 sequence=11192
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11193_ghhmk32w_.arc thread=1 sequence=11193
RMAN-08138: WARNING: archived log not deleted - must create more backups
archived log file name=/db/dba/fra/dba/archivelog/2019_06_05/o1_mf_1_11194_ghho9bov_.arc thread=1 sequence=11194
delete force archivelog until time 'sysdate -18';
########### seem catalog db is not sync to local controlfile , nocatlog 模式连接数据库,发现可以检索到6/9日 备份归档片的信息,但是更加详细的 归档日志号 确查不到。
4.1 local controlfile
[opdba@pdbadb04 ~]$ rman target /
Recovery Manager: Release 11.2.0.4.0 - Production on Wed Jun 26 10:02:00 2019
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: dba (DBID=2708786940)
RMAN> list backup of archivelog from logseq 11381 until logseq 11417;
using target database control file instead of recovery catalog
specification does not match any backup in the repository
RMAN>
## nocatalog mode :In target database as SYS user: backupset is still exsits in local controfile; 只检索到6/9日备份集的信息,没有检索到备份归档日志(日志序列号)具体信息
SQL> select BS.RECID,BS.STAMP,BS.SET_STAMP,BS.BACKUP_TYPE,BS.CONTROLFILE_INCLUDED,BS.COMPLETION_TIME,BS.KEEP,BS.KEEP_UNTIL,BS.KEEP_OPTIONS from
v$backup_set BS, v$backup_piece BP where BS.SET_STAMP=BP.SET_STAMP and BS.SET_COUNT=BP.SET_COUNT and BP.TAG='TAG20190609T091402';
RECID STAMP BA CONTRO COMPLETION_T KEEP KEEP_UNTIL KEEP_OPTIONS
---------- ---------- -- ------ ------------ ------ ------------ ----------------------
7225 1010481250 L NO 09-JUN-19 NO
7226 1010481253 L NO 09-JUN-19 NO
### but after check V$BACKUP_REDOLOG, we find there is no backup log in local controfile and V$BACKUP_REDOLOG. 没有6/9检索到备份归档信息,没有检索到备份归档日志(日志序列号)具体信息
SELECT DISTINCT B.SET_STAMP,
B.THREAD#,
B.SEQUENCE#,
B.FIRST_TIME,
B.FIRST_CHANGE#,
B.NEXT_TIME,
B.NEXT_CHANGE#
FROM V$BACKUP_REDOLOG B, V$BACKUP_PIECE A
WHERE A.SET_STAMP = B.SET_STAMP
AND A.DELETED = 'NO'
AND B.SET_STAMP = &SET_STAMP
ORDER BY THREAD#, SEQUENCE#;
1010481250
no rows selected
4.2 catalog db 是可以检查到6/9 备份归档的备份片信息的和归档日志序列号信息。
rman target / catalog rman11g/rman11g@cat11g
[opdba@pdbadb04 ~]$ rman target / catalog rman11g/rman11g@cat11g
Recovery Manager: Release 11.2.0.4.0 - Production on Wed Jun 26 09:55:15 2019
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: dba (DBID=2708786940)
connected to recovery catalog database
RMAN> list backup of archivelog from logseq 11381 until logseq 11417;
List of Backup Sets
===================
BS Key Size Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
87896459 17.25M SBT_TAPE 00:00:07 09-JUN-19
BP Key: 87896462 Status: AVAILABLE Compressed: NO Tag: TAG20190609T091402
Handle: al_7239_1_1010481243 Media: /home/diskpool4/dba.db.crb_1560042845_C1_F1
List of Archived Logs in backup set 87896459
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- --------- ---------- ---------
1 11384 1382260141 09-JUN-19 1382263558 09-JUN-19
1 11385 1382263558 09-JUN-19 1382266925 09-JUN-19
1 11386 1382266925 09-JUN-19 1382270374 09-JUN-19
1 11387 1382270374 09-JUN-19 1382274035 09-JUN-19
BS Key Size Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
87896460 18.25M SBT_TAPE 00:00:10 09-JUN-19
BP Key: 87896463 Status: AVAILABLE Compressed: NO Tag: TAG20190609T091402
Handle: al_7238_1_1010481243 Media: /home/diskpool4/dba.db.crb_1560042846_C1_F1
List of Archived Logs in backup set 87896460
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- --------- ---------- ---------
1 11381 1382249677 09-JUN-19 1382253664 09-JUN-19
1 11382 1382253664 09-JUN-19 1382256816 09-JUN-19
1 11383 1382256816 09-JUN-19 1382260141 09-JUN-19
BS Key Size Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
87905602 20.50M SBT_TAPE 00:00:06 09-JUN-19
BP Key: 87905605 Status: AVAILABLE Compressed: NO Tag: TAG20190609T131348
Handle: al_7241_1_1010495628 Media: /home/diskpool4/dba.db.crb_1560057230_C1_F1
List of Archived Logs in backup set 87905602
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- --------- ---------- ---------
1 11388 1382274035 09-JUN-19 1382277576 09-JUN-19
1 11389 1382277576 09-JUN-19 1382281844 09-JUN-19
1 11390 1382281844 09-JUN-19 1382285645 09-JUN-19
1 11391 1382285645 09-JUN-19 1382289711 09-JUN-19
BS Key Size Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
87905603 18.50M SBT_TAPE 00:00:08 09-JUN-19
BP Key: 87905606 Status: AVAILABLE Compressed: NO Tag: TAG20190609T131348
Handle: al_7242_1_1010495628 Media: /home/diskpool4/dba.db.crb_1560057231_C1_F1
List of Archived Logs in backup set 87905603
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- --------- ---------- ---------
1 11392 1382289711 09-JUN-19 1382293557 09-JUN-19
1 11393 1382293557 09-JUN-19 1382297119 09-JUN-19
1 11394 1382297119 09-JUN-19 1382300911 09-JUN-19
1 11395 1382300911 09-JUN-19 1382304477 09-JUN-19
Thrd Seq Low SCN Low Time Next SCN Next Time
---- ------- ---------- --------- ---------- ---------
1 11413 1382364015 09-JUN-19 1382367593 09-JUN-19
1 11414 1382367593 09-JUN-19 1382371755 09-JUN-19
1 11415 1382371755 09-JUN-19 1382375414 09-JUN-19
1 11416 1382375414 09-JUN-19 1382378865 09-JUN-19
1 11417 1382378865 09-JUN-19 1382382458 09-JUN-19
RMAN>
### In catalog database as catalog schema user: see backupset is aleady in rman catalog db,tag is from list backup command:,查询到归档日志备份集的信息。
SQL> select BS.BS_KEY,BS.RECID,BS.STAMP,BS.SET_STAMP,BS.BACKUP_TYPE,BS.CONTROLFILE_INCLUDED,BS.COMPLETION_TIME,BS.KEEP,BS.KEEP_UNTIL,BS.KEEP_OPTIONS from
rman11G.rc_backup_set BS, rman11G.rc_backup_piece BP where BS.SET_STAMP=BP.SET_STAMP and BS.SET_COUNT=BP.SET_COUNT and BP.TAG='TAG20190609T091402';
BS_KEY RECID STAMP SET_STAMP BACKUP_TYPE CONTROLFILE_INCLUDED COMPLETION_TIME KEEP KEEP_UNTIL KEEP_OPTIONS
---------- ---------- ---------- ---------- ----------- -------------------- --------------- ---- ----------- ------------
87896459 7225 1010481250 1010481243 L NONE 2019/6/9 9:14:1 NO
87896460 7226 1010481253 1010481243 L NONE 2019/6/9 9:14:1 NO
### but after check V$BACKUP_REDOLOG, we find there is more rows backup log in catalog and rman11G.rc_BACKUP_REDOLOG B.
## 查询到归档日志序列号详细信息了。
SELECT DISTINCT B.SET_STAMP,
B.THREAD#,
B.SEQUENCE#,
B.FIRST_TIME,
B.FIRST_CHANGE#,
B.NEXT_TIME,
B.NEXT_CHANGE#
FROM rman11G.rc_BACKUP_REDOLOG B, rman11G.rc_BACKUP_PIECE A
WHERE A.SET_STAMP = B.SET_STAMP
AND B.SET_STAMP = &SET_STAMP
ORDER BY THREAD#, SEQUENCE#;
SET_STAMP THREAD# SEQUENCE# FIRST_TIME FIRST_CHANGE# NEXT_TIME NEXT_CHANGE#
---------- ---------- ---------- ----------- ------------- ----------- ------------
1010481243 1 11380 2019/6/9 5: 1382246484 2019/6/9 5: 1382249677
1010481243 1 11381 2019/6/9 5: 1382249677 2019/6/9 6: 1382253664
1010481243 1 11382 2019/6/9 6: 1382253664 2019/6/9 6: 1382256816
1010481243 1 11383 2019/6/9 6: 1382256816 2019/6/9 7: 1382260141
1010481243 1 11384 2019/6/9 7: 1382260141 2019/6/9 7: 1382263558
1010481243 1 11385 2019/6/9 7: 1382263558 2019/6/9 8: 1382266925
1010481243 1 11386 2019/6/9 8: 1382266925 2019/6/9 8: 1382270374
1010481243 1 11387 2019/6/9 8: 1382270374 2019/6/9 9: 1382274035
对比6/25 检查 和6/29 检查的结果,发现V$BACKUP_REDOLOG 一直处于维护状态,清理旧的归档日志片信息。
###6/25 检查
SELECT min(first_time) FROM V$BACKUP_REDOLOG B;
11-jun 17:02
###6/29 检查
SELECT min(first_time) FROM V$BACKUP_REDOLOG B;
2019/6/16 0:32:05
这说明V$BACKUP_REDOLOG 一直在将旧的的备份归档信息刷出(具体估算的时间有可能是Control_file_record_keep_time或者这个值的2倍左右),这样导致FRA 如果还没来记得删除旧的归档,旧的归档就会设置RECLAIABLE 从yes 到NO.这个时间FRA区就无法自动删除归档日志了。
查询可回收归档信息,也验证了我们的观点。2019/6/16 0:32:05 ( V$BACKUP_REDOLOG)之前的归档日志RECLAIABLE 状态已经是 no.也就是无法删除,但是物理上他们是存在在本地磁盘中,这样就会导致文件系统告警。
RECLAIM status is NO,even the archive log is backuped,
column deleted format a7 column reclaimable format a11 set linesize 120
select applied,deleted,backup_count ,decode(rectype,11,'YES','NO') reclaimable,count(*) ,to_char(min(completion_time),'dd-mon hh24:mi') first_time ,to_char(max(completion_time),'dd-mon hh24:mi') last_time ,min(sequence#) first_seq,max(sequence#) last_seq from v$archived_log left outer join sys.x$kccagf using(recid) where is_recovery_dest_file='YES' group by applied,deleted,backup_count,decode(rectype,11,'YES','NO') order by min(sequence#)
;
APPLIED DELETE BACKUP_COUNT RECLAI COUNT(*)
------------------ ------ ------------ ------ ----------
FIRST_TIME
------------------------------------------
LAST_TIME FIRST_SEQ LAST_SEQ
------------------------------------------ ---------- ----------
NO NO 1 NO 350
09-jun 18:13
16-jun 00:32 11405 11797 《=RECLAI 为NO
NO NO 1 YES 666
16-jun 01:02
29-jun 16:32 11798 12657
原因如下:
V$backup_redolog 这个视图比较特殊,oracle 可能会根据Control_file_record_keep_time的时间设置,定期刷出日期之前的备份归档信息;
但是v$backup_set and v$backup_piece 将不会被刷出。
Bug 7357779:RMAN REPORT/DELETE OBSOLETE RETURNS ARCHIVELOG BACKUPS INSIDE RETENTION POLICY
The backupset,backuppiece and contents (in this case, backup_redolog) details are all held in separate sections in the controlfile.
Where archivelog generation is high, the backup_redolog section will inevitably fill up at a much faster rate and
so if control_file_record_keep_time is LESS than the recovery window set, it is possible to have entries in v$backup_redolog
age out of the controlfile prematurely leaving behind the corresponding entries in v$backup_set and v$backup_piece.
When this happens, report obsolete in nocatalog mode, will report these backupsets as obsolete even though they are
inside the recovery window.
Control_file_record_keep_time is < configured retention policy
Records in V$backup_redolog have aged out of the controlfile leaving corresponding entries in v$backup_set and v$backup_piece
解决方法:
workaroud 1:
Better choice is to delete unnecessary info in FRA via rman scripts.用脚本删除归档日志,推荐使用这个方法。
WORKAROUND 2:
-----------
Use a catalog connection for housekeeping operations
and
make sure that
control_file_record_keep_time accurately reflects your recovery window plus 1.
可以暂时调大control_file_record_keep_time 看看是否能防止V$backup_redolog 将旧的归档日志备份信息 刷出
默认是7,调大到30
alter system set control_file_record_keep_time=30;
影响范围:
The controlfile either on both primary and standby or just one site has grown very large.导致DG 控制文件过大,并进而导致DG RFS 扫描归档日志HANG
#############samlpe 3;
如何使用sql 查询rman 配置的 归档日志删除策略 信息
SELECT NVL(CA.ARCHIVELOG, 'OFF') as "ARCHIVELOG"
FROM
(
SELECT
'Policy Settings' AS GROUP_NAME
FROM
DUAL
) G
NATURAL FULL OUTER JOIN
(
SELECT
TRIM(VALUE) AS ARCHIVELOG
FROM
gv$rman_configuration
WHERE
name = 'ARCHIVELOG DELETION POLICY'
) CA;
#####3
1g以前,主库配置了CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY
但是指向备库的归档路径不是mandatory的,那么主库不会自动删除已经在备库应用的归档
参考 Configure RMAN to purge archivelogs after applied on standby [ID 728053.1]
11g以后,就算远端归档路径不是mandatory的,主库也会自动删除已经在备库应用的归档
如果11g以前,想归档路径不是mandatory,那么需要i设置 _log_deletion_policy 为 ALL
###for "RMAN-08591
Thanks for your feedback. From the result, we could see when setting the deletion policy to 'applied on standby' , it reports "RMAN-08591: WARNING: invalid archivelog deletion policy". Which is matched with DOC ID 331924.1. Thus to resolve this issue, Please execute below command in primary and standby db , then restart the databases to take effect. Thanks.
1. SQL> alter system set "_log_deletion_policy"=ALL scope=spfile sid='*';
另外,可以参考 Usage of RMAN in Streams Environment [ID 421176.1]
中给出的DG环境,配置FRA管理归档的配置
Backups taken on
-- Primary
Primary Flash Recovery Area Configuration
-- CONFIGURE ARCHIVELOG DELETION POLICY TO NONE;
Standby Flash Recovery Area Configuration
-- CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY;
Backups taken on
-- Standby
Primary Flash Recovery Area Configuration
-- CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY;
Standby Flash Recovery Area Configuration
-- CONFIGURE ARCHIVELOG DELETION POLICY TO NONE;
注意,FRA最好别在Streams环境中使用,FRA的删除策略,不考虑Streams
#######4
PURPOSE
When using 'CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY' on primary database, allow RMAN to consider *both* mandatory and optional destinations when determining which logs on primary can be deleted.
SCOPE
RMAN backups on standby database in Data Guard environment
DETAILS
The 10g RMAN-Data Guardpaper states that 'CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY' only checks that a log has been applied on all mandatory standby databases.
However, Data Guard best practice for Max Performance and Max Availability modes is to *not* use mandatory standby (so that primary does not hang in the event of online logs filling up). Therefore, the following parameter must be set:
_log_deletion_policy='ALL'
This will allow RMAN to delete logs from the Flash Recovery Area on primary database host that have been applied to all standby databases (whether mandatory or optional). Database will require bounce after setting this parameter.
NOTE: To make the change in an spfile, double quotes must be used around parameter name:
NOTE: attempt to set this configuration in RMAN without this parameter setting may result in error:
new RMAN configuration parameters:
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON STANDBY;
new RMAN configuration parameters are successfully stored
starting full resync of recovery catalog
full resync complete
RMAN-08591: WARNING: invalid archivelog deletion policy
####5
http://blog.dbi-services.com/archivelog-deletion-policy-for-standby-database-in-oracle-data-guard/
Do you use ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY for your Oracle databases in Data Guard? Maybe you also use the Fast Recovery Area as archive log destination. That’s good practice! But did you ever check that it works as expected?
What I mean is this:
- The archived logs that you don’t need are reclaimable by the FRA when space is needed
- And the archived logs that are required for availability (standby or backup) are not deleted.
It’s not an easy thing to check because Oracle doesn’t show which archive log is reclaimable. Only the total reclaimable space is shown in v$recovery_area_usage. But that is not sufficient to validate which archivelog sequence is concerned. I’ll show you below a query that returns the reclaimable status from the archived logs. And you will see that until 12c the APPLIED ON ALL STANDBY does not work as expected. You’ve probably seen a FRA full at standby site and solved it by deleting archived logs. But this is not the right solution because the FRA is supposed to do that.
Let’s look at an example I encountered recently. The archivelog deletion policy is set correctly:
RMAN> show archivelog deletion policy; RMAN configuration parameters for database with db_unique_name DATABASE_SITE2 are: CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;
This configuration, an 11g feature, allows to delete an archive log as soon as it is applied to all standby destinations. Note that it works if I manually do a ‘delete archivelog all;’ but I expect that the archivelogs in the FRA becomes reclaimable automatically.
Unfortunately, this is not the case and the FRA is growing:
SQL> select * from v$recovery_area_usage where file_type='ARCHIVED LOG'; FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES -------------------- ------------------ ------------------------- --------------- ARCHIVED LOG 61.11 43.02 467
Let’s check everything. We are on the standby database:
SQL> select open_mode,database_role from v$database; OPEN_MODE DATABASE_ROLE -------------------- ---------------- MOUNTED PHYSICAL STANDBY
The archivelogs are going to the Fast Recovery Area:
SQL> show parameter log_archive_dest_1 NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ log_archive_dest_1 string location=USE_DB_RECOVERY_FILE_ DEST, valid_for=(ALL_LOGFILES, ALL_ROLES)
All archived logs are applied (we are in SYNC AFFIRM):
DGMGRL> show database 'DATABASE_SITE2'; Database - DATABASE_SITE2 Role: PHYSICAL STANDBY Intended State: APPLY-ON Transport Lag: 0 seconds Apply Lag: 0 seconds Real Time Query: OFF Instance(s): DATABASE Database Status: SUCCESS
Well, with that configuration, I expect that all archivelogs are reclaimable – except the current one.
Let’s investigate. V$RECOVERY_AREA_USAGE is an aggregate view. If we check its definition, we see that the reclaimable size comes from x$kccagf.rectype.
So I’ll use it in in conjunction with v$archived_log in order to give the detail about which archived logs are reclaimable:
SQL> select applied,deleted,decode(rectype,11,'YES','NO') reclaimable ,count(*),min(sequence#),max(sequence#) from v$archived_log left outer join sys.x$kccagf using(recid) where is_recovery_dest_file='YES' and name is not null group by applied,deleted,decode(rectype,11,'YES','NO') order by 5 / APPLIED DELETED RECLAIMABLE COUNT(*) MIN(SEQUENCE#) MAX(SEQUENCE#) --------- ------- ----------- ---------- -------------- -------------- YES NO YES 429 5938 6366 YES NO NO 37 6367 6403 IN-MEMORY NO NO 1 6404 6404
The problem is there: Because of a bug (Bug 14227959 : STANDBY DID NOT RELEASE SPACE IN FRA) the archivelogs are not marked as reclaimable when the database is in mount mode.
The workaround is to execute dbms_backup_restore.refreshagedfiles. This is what must be scheduled (maybe daily) on the standby. It can be a good idea to do it at the same time as a daily ‘delete obsolete’, so here is the way to call it from RMAN:RMAN> sql "begin dbms_backup_restore.refreshagedfiles; end;";
But I’ve found another workaround: just run the CONFIGURE ARCHIVELOG POLICY from RMAN as it refreshes the reclaimable flag – even when there is no change.
Then, you can run a daily job that does CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY on your database, whatever the role is.
It’s different for the database where you do the backup, because you want to be sure that the backup is done before an archivelog is deleted: CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK;
This is a good way to prevent anyone from changing the configuration and keep it close to the backup scripts. At dbi services, we advise to keep the same configuration on all Data Guard sites for the same database so that a switchover can be done without any problem. For this reason, having a script that depends on the place where the backups are done is a better alternative than a configuration that depends on the database role.
Finally, here is the state of our reclaimable archivelogs after any of these solutions:
APPLIED DELETED RECLAIMABLE COUNT(*) MIN(SEQUENCE#) MAX(SEQUENCE#)
--------- ------- ----------- ---------- -------------- --------------
YES NO YES 466 5938 6403
IN-MEMORY NO NO 1 6404 6404
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE NUMBER_OF_FILES
-------------------- ------------------ ------------------------- ---------------
ARCHIVED LOG 61.11 61.09 467
All applied archived logs are reclaimable and the FRA will never be full.
You can check the primary as well. Are you sure that Oracle will never delete an archived log that has not been backed up ? Check your deletion policy.
Here is the full query I use for that:
column deleted format a7
column reclaimable format a11
set linesize 120
select applied,deleted,backup_count
,decode(rectype,11,'YES','NO') reclaimable,count(*)
,to_char(min(completion_time),'dd-mon hh24:mi') first_time
,to_char(max(completion_time),'dd-mon hh24:mi') last_time
,min(sequence#) first_seq,max(sequence#) last_seq
from v$archived_log left outer join sys.x$kccagf using(recid)
where is_recovery_dest_file='YES'
group by applied,deleted,backup_count,decode(rectype,11,'YES','NO') order by min(sequence#)
/
This is the result on primary where the last archivelog backup has run around 21:00
APPLIED DELETED BACKUP_COUNT RECLAIMABLE COUNT(*) FIRST_TIME LAST_TIME FIRST_SEQ LAST_SEQ
--------- ------- ------------ ----------- -------- ------------ ------------ --------- --------
NO YES 1 NO 277 15-jan 17:56 19-jan 09:49 5936 6212
NO NO 1 YES 339 19-jan 10:09 22-jan 21:07 6213 6516
NO NO 0 NO 33 22-jan 21:27 23-jan 07:57 6517 6549
That is fine according to my policy APPLIED ON ALL STANDBY BACKED UP 1 TIMES TO DISK
And here is the result on standby where the workaround job has run around 06:00 and redo apply is in SYNC
APPLIED DELETED BACKUP_COUNT RECLAIMABLE COUNT(*) FIRST_TIME LAST_TIME FIRST_SEQ LAST_SEQ
--------- ------- ------------ ----------- -------- ------------ ------------ --------- --------
YES YES 0 NO 746 07-jan 13:27 17-jan 11:17 5320 6065
YES NO 0 YES 477 17-jan 11:37 23-jan 05:37 6066 6542
YES NO 0 NO 8 23-jan 05:57 23-jan 08:14 6543 6550
IN-MEMORY NO 0 NO 1 23-jan 08:15 23-jan 08:15 6551 6551
This is good for my policy APPLIED ON ALL STANDBY – except that because of the bug mentioned above, redo applied since 06:00 are not yet reclaimable.
Update SEP-17
When troubleshooting archivelog deletion policy issue, here is a better query which counts the number of backups for each sequence:
set linesize 200 pagesize 1000
column is_recovery_dest_file format a21
select
deleted,status,is_recovery_dest_file,thread#,min(sequence#),max(sequence#),min(first_time),max(next_time),count(distinct sequence#),archived,applied,backup_count,count("x$kccagf")
from (
select deleted,thread#,sequence#,status,name ,first_time, next_time,case x$kccagf.rectype when 11 then recid end "x$kccagf"
,count(case archived when 'YES' then 'YES' end)over(partition by thread#,sequence#) archived
,count(case applied when 'YES' then 'YES' end)over(partition by thread#,sequence#) applied
,sum(backup_count)over(partition by thread#,sequence#) backup_count
,listagg(is_recovery_dest_file||':'||dest_id,',')within group(order by dest_id)over(partition by thread#,sequence#) is_recovery_dest_file
from v$archived_log left outer join sys.x$kccagf using(recid)
) group by deleted,status,is_recovery_dest_file,thread#,archived,applied,backup_count
order by max(sequence#),min(sequence#),thread#,deleted desc,status;
With the following output:
DEL S IS_RECOVERY_DEST_FILE THREAD# MIN(SEQUENCE#) MAX(SEQUENCE#) MIN(FIRST MAX(NEXT_ COUNT(DISTINCTSEQUENCE#) ARCHIVED APPLIED BACKUP_COUNT COUNT("X$KCCAGF")
--- - --------------------- ---------- -------------- -------------- --------- --------- ------------------------ ---------- ---------- ------------ -----------------
NO A YES:1 1 3233 3233 23-JUN-17 23-JUN-17 1 1 0 1 1
NO A YES:1,NO:2 1 3234 5387 23-JUN-17 21-JUL-17 2154 2 1 1 2154
NO A YES:1,NO:2 1 5388 11596 21-JUL-17 10-OCT-17 6209 2 1 0 6208
NO A YES:1,NO:2 1 11597 11597 10-OCT-17 10-OCT-17 1 2 0 0 0
#############参考文档5
https://www.cnblogs.com/kerrycode/p/5684768.html
关于Oracle数据库的RMAN备份,除了邮件外,是否能通过其它方式检查RMAN备份的成功与失败呢?其实我们可以通过下面SQL脚本来检查某个时间段备份失败的记录:
SELECT * FROM V$RMAN_STATUS
WHERE START_TIME >= TO_DATE(&START_TIME,'YYYY-MM-DD HH24:MI:SS')
AND END_TIME <= TO_DATE(&END_TIME ,'YYYY-MM-DD HH24:MI:SS')
AND OPERATION ='BACKUP'
AND STATUS !='COMPLETED'
AND STATUS NOT LIKE 'RUNNING%'
查看备份成功的历史记录:
SELECT * FROM V$RMAN_STATUS
WHERE START_TIME >= TO_DATE(&START_TIME,'YYYY-MM-DD HH24:MI:SS')
AND END_TIME <= TO_DATE(&END_TIME ,'YYYY-MM-DD HH24:MI:SS')
AND OPERATION ='BACKUP'
AND STATUS ='COMPLETED'
其中STATUS主要有RUNNING、RUNNING WITH WARNINGS、RUNNING WITH ERRORS、COMPLETED、COMPLETED WITH WARNINGS、COMPLETED WITH ERRORS、FAILED等几种状态。另外,如果你在命令窗口输入
[oracle@MyDB ~]$ date
Tue Jul 19 10:52:02 CST 2016
[oracle@MyDB ~]$ rman target /
Recovery Manager: Release 10.2.0.5.0 - Production on Tue Jul 19 10:52:13 2016
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: SCM2 (DBID=3990839260)
RMAN>
在V$RMAN_STATUS里面,你会看到插入了一条记录STATUS为RUNNING状态
SQL> COL ROW_TYPE FOR A10;
SQL> COL OPERATION FOR A10;
SQL> COL COMMAND_ID FOR A20;
SQL> COL STATUS FOR A30;
SQL> COL OBJECT_TYPE FOR A16;
SQL> SELECT ROW_TYPE, COMMAND_ID, OPERATION, STATUS,OBJECT_TYPE
2 FROM V$RMAN_STATUS
3 WHERE START_TIME >= TO_DATE('2016-07-19 10:52:00', 'YYYY-MM-DD HH24:MI:SS');
ROW_TYPE COMMAND_ID OPERATION STATUS OBJECT_TYPE
---------- -------------------- ---------- ----------------- ----------------
SESSION 2016-07-19T10:52:13 RMAN RUNNING
此时如果在RMAN中随意执行一个错误命令,如下所示
[oracle@MyDB ~]$ rman target /
Recovery Manager: Release 10.2.0.5.0 - Production on Tue Jul 19 10:52:13 2016
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: SCM2 (DBID=3990839260)
RMAN> /
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-00558: error encountered while parsing input commands
RMAN-01006: error signalled during parse
RMAN-02001: unrecognized punctuation symbol "/"
SQL> COL ROW_TYPE FOR A10;
SQL> COL OPERATION FOR A10;
SQL> COL COMMAND_ID FOR A20;
SQL> COL STATUS FOR A30;
SQL> COL OBJECT_TYPE FOR A16;
SQL> SELECT ROW_TYPE, COMMAND_ID, OPERATION, STATUS,OBJECT_TYPE
2 FROM V$RMAN_STATUS
3 WHERE START_TIME >= TO_DATE('2016-07-19 10:52:00', 'YYYY-MM-DD HH24:MI:SS');
ROW_TYPE COMMAND_ID OPERATION STATUS OBJECT_TYPE
---------- -------------------- ---------- --------------------- ----------------
SESSION 2016-07-19T10:52:13 RMAN RUNNING WITH ERRORS
在RMAN中退出,此时你会看到STAUS记录从"RUNNING WITH ERRORS"变成了"COMPLETED WITH ERRORS"
也就是说,你可以在这个视图里面查看在RMAN里面执行的一些操作,例如删除归档日志等,另外,如果要查看RMAN的输出,可以查看V$RMAN_OUTPUT,V$RMAN_OUTPUT 视图记录了RMAN生成的信息,这是在内存中额视图,不会记录到控制文件上。最多有37278条记录.
当然也可以从V$RMAN_BACKUP_JOB_DETAILS中查看RMAN备份更详细的信息。
SELECT START_TIME,
END_TIME,
OUTPUT_DEVICE_TYPE,
STATUS,
ELAPSED_SECONDS,
COMPRESSION_RATIO,
INPUT_BYTES_DISPLAY,
OUTPUT_BYTES_DISPLAY
FROM V$RMAN_BACKUP_JOB_DETAILS
ORDER BY START_TIME DESC ;
另外,如果我们需要查看RMAN备份的一些详细记录,在惜分飞的通过sql查询rman备份信息博客里面分享了下面一些经典的SQL语句。收录在此。
查看所有备份集详细信息:
SELECT A.RECID "BACKUP SET",
A.SET_STAMP,
DECODE (B.INCREMENTAL_LEVEL,
'', DECODE (BACKUP_TYPE, 'L', 'Archivelog', 'Full'),
1, 'Incr-1级',
0, 'Incr-0级',
B.INCREMENTAL_LEVEL)
"Type LV",
B.CONTROLFILE_INCLUDED "包含CTL",
DECODE (A.STATUS,
'A', 'AVAILABLE',
'D', 'DELETED',
'X', 'EXPIRED',
'ERROR')
"STATUS",
A.DEVICE_TYPE "Device Type",
A.START_TIME "Start Time",
A.COMPLETION_TIME "Completion Time",
A.ELAPSED_SECONDS "Elapsed Seconds",
A.BYTES/1024/1024/1024 "Size(G)",
A.COMPRESSED,
A.TAG "Tag",
A.HANDLE "Path"
FROM GV$BACKUP_PIECE A, GV$BACKUP_SET B
WHERE A.SET_STAMP = B.SET_STAMP AND A.DELETED = 'NO'
RDER BY A.COMPLETION_TIME DESC;
查找某个备份集中包含数据文件
SELECT DISTINCT c.file#,A.SET_STAMP, D.NAME, C.CHECKPOINT_CHANGE#, C.CHECKPOINT_TIME
FROM V$BACKUP_DATAFILE C, V$BACKUP_PIECE A, V$DATAFILE D
WHERE A.SET_STAMP = C.SET_STAMP
AND D.FILE# = C.FILE#
AND A.DELETED='NO'
AND c.set_stamp=&set_stamp
ORDER BY C.FILE#;
查询某个备份集中控制文件
SELECT DISTINCT A.SET_STAMP,
D.NAME,
C.CHECKPOINT_CHANGE#,
C.CHECKPOINT_TIME
FROM V$BACKUP_DATAFILE C, V$BACKUP_PIECE A, V$CONTROLFILE D
WHERE A.SET_STAMP = C.SET_STAMP
AND C.FILE# = 0
AND A.DELETED = 'NO'
AND C.SET_STAMP = &SET_STAMP;
查看某个备份集中归档日志:
SELECT DISTINCT B.SET_STAMP,
B.THREAD#,
B.SEQUENCE#,
B.FIRST_TIME,
B.FIRST_CHANGE#,
B.NEXT_TIME,
B.NEXT_CHANGE#
FROM V$BACKUP_REDOLOG B, V$BACKUP_PIECE A
WHERE A.SET_STAMP = B.SET_STAMP
AND A.DELETED = 'NO'
AND B.SET_STAMP = &SET_STAMP
ORDER BY THREAD#, SEQUENCE#;
查看某个备份集SPFILE
SELECT DISTINCT A.SET_STAMP, B.COMPLETION_TIME, HANDLE
FROM V$BACKUP_SPFILE B, V$BACKUP_PIECE A
WHERE A.SET_STAMP = B.SET_STAMP
AND A.DELETED = 'NO'
AND B.SET_STAMP = &SET_STAMP;
查看RMAN的配置信息
SELECT NAME,VALUE FROM V$RMAN_CONFIGURATION;
###参考文档6
https://www.cnblogs.com/DataArt/p/10018940.html
也就是RMAN调用了一个内部的包 sys.dbms_backup_restore.get_connect_identifier来获得在备库连接主库时需要用到的串。这时我们需要知道这个串是在哪里设置的,为何为空。
接下来,针对RMAN进行10046 trace:
[oracle@test1 ~]$ rman target / debug trace=/tmp/rman_debug
Recovery Manager: Release 11.2.0.4.0 - Production on Mon Mar 17 09:00:00 2014
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
RMAN-06568: connected to target database: R11204 (DBID=2001766638, not open)
RMAN> sql "alter session set tracefile_identifier=''rman_10046''";
RMAN-06009: using target database control file instead of recovery catalog
RMAN-06162: sql statement: alter session set tracefile_identifier=''rman_10046''
RMAN> sql "alter session set events ''10046 trace name context forever,level 12''";
RMAN-06162: sql statement: alter session set events ''10046 trace name context forever,level 12''
RMAN> backup archivelog all;
RMAN-03090: Starting backup at 25-MAR-14
RMAN-06820: WARNING: failed to archive current log at primary database
RMAN-06613: Connect identifier for DB_UNIQUE_NAME R11204 not configured
...
查看生成的trace file,这个文件在udump下:
$cd /u01/app/diag/rdbms/sdy/SDY/trace
$ls -ltr
-rw-r----- 1 oracle oinstall 1037463 Mar 25 14:11 SDY_ora_3792_rman_10046.trc
PARSING IN CURSOR #140366085001120 len=119 dep=0 uid=0 oct=47 lid=0 tim=1395736859520777 hv=3388798669 ad='7ec65738' sqlid='7pwt2c34
ztxqd'
begin :lprimary_db_cs := sys.dbms_backup_restore.get_connect_identifier (dbuname=> :primary_dbuname); end;
END OF STMT
PARSE #140366085001120:c=0,e=285,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=1395736859520776
BINDS #140366085001120:
Bind#0
oacdty=01 mxl=2000(1536) mxlc=00 mal=00 scl=00 pre=00
oacflg=01 fl2=1000000 frm=01 csi=873 siz=2128 off=0
kxsbbbfp=7fa986a27f08 bln=2000 avl=00 flg=05
Bind#1
oacdty=01 mxl=128(90) mxlc=00 mal=00 scl=00 pre=00
oacflg=01 fl2=1000000 frm=01 csi=873 siz=0 off=2000
kxsbbbfp=7fa986a286d8 bln=128 avl=06 flg=01
value="R11204"
*** ACTION NAME:(0000018 STARTED189) 2014-03-25 14:10:59.521
WAIT #140366085001120: nam='control file sequential read' ela= 10 file#=0 block#=1 blocks=1 obj#=-1 tim=1395736859521532
WAIT #140366085001120: nam='control file sequential read' ela= 4 file#=0 block#=16 blocks=1 obj#=-1 tim=1395736859521566
WAIT #140366085001120: nam='control file sequential read' ela= 4 file#=0 block#=18 blocks=1 obj#=-1 tim=1395736859521580
WAIT #140366085001120: nam='control file sequential read' ela= 4 file#=0 block#=281 blocks=1 obj#=-1 tim=1395736859521594
WAIT #140366085001120: nam='control file sequential read' ela= 4 file#=0 block#=1 blocks=1 obj#=-1 tim=1395736859521614
WAIT #140366085001120: nam='control file sequential read' ela= 3 file#=0 block#=16 blocks=1 obj#=-1 tim=1395736859521627
WAIT #140366085001120: nam='control file sequential read' ela= 2 file#=0 block#=18 blocks=1 obj#=-1 tim=1395736859521638
WAIT #140366085001120: nam='control file sequential read' ela= 3 file#=0 block#=281 blocks=1 obj#=-1 tim=1395736859521650
krsd_get_primary_connect_string: found pcs '' by FAL_SERVER lookup <====================用FAL_SERVER找到了连接串''
所以这个10046 trace,很清楚地告诉我们它是从参数FAL_SERVER上获得了连接串''。
这时,连接到备库,查看参数FAL_SERVER,它的值的确为空:
SQL> show parameter fal
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fal_client string
fal_server string
到此,我们通过RMAN debug和10046 trace,获得了我们想要的信息。
总结一下:
如果在执行RMAN命令后,遇到了性能问题或者需要深入跟踪一个错误,那么可以考虑使用rman debug:
$ rman target <connection> catalog <connection> debug trace=/tmp/rmanDebug.trc log=/tmp/rmanLog.txt
run {
...Run your backup commands here
}
如果还需要跟进一步的跟踪可以再使用10046 trace:
$ rman target <connection> catalog <connection> debug trace=/tmp/rmanDebug.trc log=/tmp/rmanLog.txt
RMAN> sql "alter session set tracefile_identifier=''rman_10046''";
RMAN> sql "alter session set events ''10046 trace name context forever,level 12''";
RMAN> run-your-commands;
RMAN> exit;
需要注意的是,上面的这些方法可能会生成大量文件,需要考虑对磁盘空间的压力以及对RMAN的性能的影响。
可以参考MOS文档:RMAN: Quick Debugging Guide (Doc ID 1198753.1)