一、问题现象
客户反馈,通过监控软件发现DB 在今天凌晨2-3点期间,DB 负载比较高,并且有一个显著异常 一个执行表空间使用率SQL执行了1个多小时!
二、问题分析
根据Oracle ash定位问题 sql or event
select to_char(SAMPLE_TIME,'yyyy-mm-dd hh24') as "date", event,sql_id,count(*) from gv$active_session_history where SAMPLE_TIME between to_date('2021-01-07 00','yyyy-mm-dd hh24') and to_date('2021-01-07 04','yyyy-mm-dd hh24') group by to_char(SAMPLE_TIME,'yyyy-mm-dd hh24') ,event,sql_id order by 4,1,2,3; date EVENT SQL_ID COUNT(*) ------------- ---------------------------------------------------------------- ------------- ---------- 2021-01-07 03 db file sequential read 1tu4z2y5hywu4 39 2021-01-07 03 enq: TX - contention gjm43un5cy843 39 2021-01-07 03 3nkd3g3ju5ph1 63 2021-01-07 03 enq: TX - contention 12bxyygja63sp 3120 33 rows selected.
可以发现造成本次DB TIME 上涨的原因是行锁
进一步分析行锁的情况
select count(*),to_char(SAMPLE_TIME,'yyyy-mm-dd hh24') as "date",inst_id,SESSION_ID,SESSION_SERIAL#, user_id,SQL_ID,BLOCKING_SESSION,BLOCKING_SESSION_SERIAL#,BLOCKING_INST_ID from gv$active_session_history where SAMPLE_TIME between to_date('2021-01-07 00','yyyy-mm-dd hh24') and to_date('2021-01-07 04','yyyy-mm-dd hh24') and event='enq: TX - contention' group by to_char(SAMPLE_TIME,'yyyy-mm-dd hh24'),inst_id,SESSION_ID,SESSION_SERIAL#, user_id,SQL_ID,BLOCKING_SESSION,BLOCKING_SESSION_SERIAL#,BLOCKING_INST_ID order by 1; COUNT(*) date INST_ID SESSION_ID SESSION_SERIAL# USER_ID SQL_ID BLOCKING_SESSION BLOCKING_SESSION_SERIAL# BLOCKING_INST_ID ---------- ------------- ---------- ---------- --------------- ---------- ------------- ---------------- ------------------------ ---------------- 24 2021-01-07 03 1 75 19617 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 257 25783 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 823 33555 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 892 36567 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 1017 47891 144 12bxyygja63sp 698 42689 ····· 24 2021-01-07 03 1 2968 4809 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 512 39289 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 759 28393 0 gjm43un5cy843 698 42689 1 24 2021-01-07 03 1 830 52047 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 1263 48807 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 1390 4301 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 1579 58223 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 1708 22485 144 12bxyygja63sp 698 42689 1 24 2021-01-07 03 1 2090 46765 144 12bxyygja63sp 698 42689 1 81 rows selected.
行锁阻塞源头都是同一个session 定位该阻塞源头 select count(*),to_char(SAMPLE_TIME,'yyyy-mm-dd hh24') as "date",inst_id,SESSION_ID,SESSION_SERIAL#, user_id,SQL_ID,BLOCKING_SESSION,BLOCKING_SESSION_SERIAL#,BLOCKING_INST_ID from gv$active_session_history where SAMPLE_TIME between to_date('2021-01-07 00','yyyy-mm-dd hh24') and to_date('2021-01-07 04','yyyy-mm-dd hh24') and inst_id=1 and SESSION_ID=698 and SESSION_SERIAL#=42689 group by to_char(SAMPLE_TIME,'yyyy-mm-dd hh24'),inst_id,SESSION_ID,SESSION_SERIAL#, user_id,SQL_ID,BLOCKING_SESSION,BLOCKING_SESSION_SERIAL#,BLOCKING_INST_ID order by 1; COUNT(*) date INST_ID SESSION_ID SESSION_SERIAL# USER_ID SQL_ID BLOCKING_SESSION BLOCKING_SESSION_SERIAL# BLOCKING_INST_ID ---------- ------------- ---------- ---------- --------------- ---------- ------------- ---------------- ------------------------ ---------------- 14 2021-01-07 03 1 698 42689 0 1t68u12gc1mzh
检查相关SQL信息! select sql_id,sql_fulltext from v$sql where sql_id in('12bxyygja63sp','gjm43un5cy843','1t68u12gc1mzh'); SQL_ID SQL_FULLTEXT ------------- -------------------------------------------------------------------------------- gjm43un5cy843 SELECT SUM(USED), SUM(TOTAL) FROM (SELECT /*+ ORDERED */ SUM(D.BYTES)/(1024*1024 12bxyygja63sp select a.tablespace_name,round((1-a.use_byte/b.max_byte)*100,2) freepercent, 1t68u12gc1mzh BEGIN SYS.KUPW$WORKER.MAIN('SYS_IMPORT_TABLE_01', 'SYS', 0); END;
三、问题总结
经过与客户沟通确认SQL文本,可以得到如下信息:
阻塞源头, sys用户执行impdp导入某个表!
被阻塞的会话,执行dba_segments,dba_data_files视图!现象是行锁!
基本上可以理解为,数据泵在执行某个对象表的导入过程中, SQL查询dba_segments 涉及了这个表段的统计,由于导入过程中段大小是存在变换,因此存在行锁现象!
正常情况下,select 与insert并不冲突,因此selct dba_segments时,有内置操作,这个我们不太清楚,mos并未搜到这种情况,无法认为Bug.
建议查询与导入错开,或者手工kill select 查询语句!