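A customer reported that at a fixed time every day, OEM shows a pronounced transport lag, whereas at other times there is almost no transport lag at all.
The primary database's alert log for that time window gives some clues.
For example, at around 05:45 on 4/5, before the problem appears, the primary switches to log sequence# 21595 and starts writing the redo that will later be archived under that sequence: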
Web Apr 05 05:45:15 2021
Beginning log switch checkpoint up to RBA [0x86bf.2.10], SCN: 215805932158
Thread 1 advanced to log sequence 21595 (LGWR switch)
*** Current log# 4 seq# 21595 mem# 0: +DATA1/orclp/onlinelog/group_3.3725.845417343
......
So when does the next sequence start?
Web Apr 05 10:02:07 2021
Beginning log switch checkpoint up to RBA [0x76c1.2.10], SCN: 215811522963
Thread 1 advanced to log sequence 21596 (LGWR switch)
*** Current log# 1 seq# 21596 mem# 0: +DATA1/orclp/onlinelog/group_4.3703.845417337
......
Web Apr 05 10:02:30 2021
LNS: Completed archiving log 4 thread 1 sequence 21595 (orclp_1)
LNS: Closing remote archive destination LOG_ARCHIVE_DEST_2: 'DG_STY' (orclp_1)
LNS: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'DG_STY' (thread 1 sequence 21596) (orclp_1)
*** LNS: Standby redo logfile selected for thread 1 sequence 21596 for destination LOG_ARCHIVE_DEST_2
LNS: Standby redo logfile selected for thread 1 sequence 21596 for destination LOG_ARCHIVE_DEST_2
LNS: Beginning to archive log 1 thread 1 sequence 21596 (orclp_1)
......
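That is more than four hours later (10:02), which is exactly when the customer reports the transport lag starting to appear. From this point on, log switches to new sequence#s become much more frequent.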
Redo for sequence# 21597:
Web Apr 05 10:05:22 2021
Beginning log switch checkpoint up to RBA [0x76c3.2.10], SCN: 215811692247
Thread 1 advanced to log sequence 21597 (LGWR switch)
*** Current log# 2 seq# 21597 mem# 0: +DATA1/orclp/onlinelog/group_1.3709.845417339
......
Web Apr 05 10:10:40 2021
LNS: Completed archiving log 1 thread 1 sequence 21596 (orclp_1)
LNS: Closing remote archive destination LOG_ARCHIVE_DEST_2: 'DG_STY' (orclp_1)
*** LNS: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'DG_STY' (thread 1 sequence 21597) (orclp_1)
*** LNS: Standby redo logfile selected for thread 1 sequence 21597 for destination LOG_ARCHIVE_DEST_2
LNS: Standby redo logfile selected for thread 1 sequence 21597 for destination LOG_ARCHIVE_DEST_2
LNS: Beginning to archive log 2 thread 1 sequence 21597 (orclp_1)
Redo for sequence# 21598:
Web Apr 05 10:12:34 2021
Beginning log switch checkpoint up to RBA [0x87d3.2.10], SCN: 215811845211
Thread 1 advanced to log sequence 21598 (LGWR switch)
*** Current log# 3 seq# 21598 mem# 0: +DATA1/orclp/onlinelog/group_3.3713.845417341
......
Web Apr 05 10:16:17 2021
Incremental checkpoint up to RBA [0x76c3.2940d5.0], current log tail at RBA [0x87d3.2654a5.0]
Web Apr 05 10:16:20 2021
LNS: Completed archiving log 2 thread 1 sequence 21597 (orclp_1)
LNS: Closing remote archive destination LOG_ARCHIVE_DEST_2: 'DG_STY' (orclp_1)
LNS: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'DG_STY' (thread 1 sequence 21598) (orclp_1)
*** LNS: Standby redo logfile selected for thread 1 sequence 21598 for destination LOG_ARCHIVE_DEST_2
LNS: Standby redo logfile selected for thread 1 sequence 21598 for destination LOG_ARCHIVE_DEST_2
LNS: Beginning to archive log 3 thread 1 sequence 21598 (orclp_1)
Redo for sequence# 21599:
Web Apr 05 10:17:46 2021
Beginning log switch checkpoint up to RBA [0x95d2.2.10], SCN: 215812104538
Thread 1 advanced to log sequence 21599 (LGWR switch)
*** Current log# 4 seq# 21599 mem# 0: +DATA1/orclp/onlinelog/group_3.3725.845417343
......
Web Apr 05 10:22:44 2021
Completed checkpoint up to RBA [0x95d2.2.10], SCN: 215812104538
Web Apr 05 10:22:53 2021
LNS: Completed archiving log 3 thread 1 sequence 21598 (orclp_1)
LNS: Closing remote archive destination LOG_ARCHIVE_DEST_2: 'DG_STY' (orclp_1)
LNS: Creating remote archive destination LOG_ARCHIVE_DEST_2: 'DG_STY' (thread 1 sequence 21599) (orclp_1)
*** LNS: Standby redo logfile selected for thread 1 sequence 21599 for destination LOG_ARCHIVE_DEST_2
LNS: Standby redo logfile selected for thread 1 sequence 21599 for destination LOG_ARCHIVE_DEST_2
LNS: Beginning to archive log 4 thread 1 sequence 21599 (orclp_1)
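To make the change in switch frequency concrete, the log switch times from the excerpts above can be turned into "how long each log stayed current". The following is only a minimal sketch: the timestamps are copied by hand from the "Thread 1 advanced to log sequence ..." entries, the weekday token is left out, and the names `SWITCHES` and `fill_times` are made up for this illustration.

```python
from datetime import datetime

# Log switch times copied by hand from the alert log excerpts:
# (sequence#, time at which that sequence became the current log).
SWITCHES = [
    (21595, "Apr 05 05:45:15 2021"),
    (21596, "Apr 05 10:02:07 2021"),
    (21597, "Apr 05 10:05:22 2021"),
    (21598, "Apr 05 10:12:34 2021"),
    (21599, "Apr 05 10:17:46 2021"),
]

def parse(ts):
    """Parse an alert log timestamp with the weekday token removed."""
    return datetime.strptime(ts, "%b %d %H:%M:%S %Y")

def fill_times(switches):
    """Return (sequence#, seconds the log stayed current) for every log but the last."""
    result = []
    for (seq, start), (_, next_start) in zip(switches, switches[1:]):
        seconds = int((parse(next_start) - parse(start)).total_seconds())
        result.append((seq, seconds))
    return result

for seq, seconds in fill_times(SWITCHES):
    print(f"seq {seq}: current for {seconds} s")
```

Sequence 21595 stays current for 15412 s (over four hours), while 21596, 21597 and 21598 each last only 195 s, 432 s and 312 s. The same amount of online log space is being filled dozens of times faster once the daily burst begins.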
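When the primary generates a large volume of redo in a short time, the standby has to start RFS processes and receive that redo through its standby redo logs. If, for a while, the rate at which redo is received cannot keep up with the rate at which it is generated, transport lag appears.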
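One rough way to see the receiver falling behind in the excerpts above is to compare, for each sequence, the time the primary switched away from that log with the time "LNS: Completed archiving ..." was logged for it. This is only a sketch under assumptions: the gap is a proxy read off the alert log, not the transport lag figure that OEM itself reports, and the names `SHIPPED` and `shipping_gaps` are made up for this illustration.

```python
from datetime import datetime

FMT = "%b %d %H:%M:%S %Y"  # alert log timestamp with the weekday token removed

# Copied by hand from the alert log excerpts:
# (sequence#, time the primary switched away from this log,
#  time "LNS: Completed archiving" was logged for this sequence).
SHIPPED = [
    (21595, "Apr 05 10:02:07 2021", "Apr 05 10:02:30 2021"),
    (21596, "Apr 05 10:05:22 2021", "Apr 05 10:10:40 2021"),
    (21597, "Apr 05 10:12:34 2021", "Apr 05 10:16:20 2021"),
    (21598, "Apr 05 10:17:46 2021", "Apr 05 10:22:53 2021"),
]

def shipping_gaps(rows):
    """Return (sequence#, seconds between the log switch and LNS completion)."""
    gaps = []
    for seq, switched_out, completed in rows:
        delta = datetime.strptime(completed, FMT) - datetime.strptime(switched_out, FMT)
        gaps.append((seq, int(delta.total_seconds())))
    return gaps

for seq, seconds in shipping_gaps(SHIPPED):
    print(f"seq {seq}: LNS finished {seconds} s after the log switch")
```

For sequence 21595, which was filled during the quiet period, LNS finishes 23 s after the switch. For 21596, 21597 and 21598 the gap grows to 318 s, 226 s and 307 s, so shipping of each log is still in progress several minutes after the primary has already moved on to the next one.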