与数据库打过交道的人,大多都知道:长时间未提交的事务,会导致更多的锁冲突,而且也会导致主从复制的延时高。
针对这个问题,我曾经想过,能否通过分析slow query log来定位?
首先,我从官方文档中看到如下描述:
Property | Value |
---|---|
Command-Line Format | --long-query-time=# |
System Variable | long_query_time |
Scope | Global, Session |
Dynamic | Yes |
Type | Numeric |
Default Value | 10 |
Minimum Value | 0 |
If a query takes longer than this many seconds, the server increments the Slow_queries
status variable. If the slow query log is enabled, the query is logged to the slow query log file. This value is measured in real time, not CPU time, so a query that is under the threshold on a lightly loaded system might be above the threshold on a heavily loaded one. The minimum and default values of long_query_time
are 0 and 10, respectively. The value can be specified to a resolution of microseconds. See Section 5.4.5, “The Slow Query Log”.
上述被红线标示出来的内容,只是说long_query_time参数的值,指的是真实花费的时间,不是CPU时间,并没有说明是否包含等待锁的时间,以及持有锁的时间;
另外,请注意"If a query takes ..."这句话,说明slow query log记录的是以query为单位,而不是以transaction为单位。
然后,我通过简单的实验进行论证:
实验前提: MySQL版本:5.7.12 on Windows 10 long_query_time=0.1 slow_query_log=ON 表信息: CREATE TABLE `t1` ( `id` INT(11) NOT NULL AUTO_INCREMENT, `c2` VARCHAR(100) NULL DEFAULT NULL, PRIMARY KEY (`id`) ) COLLATE='utf8mb4_general_ci' ENGINE=InnoDB AUTO_INCREMENT=196593 ; 执行insert into t1 (c2) select c2 from t1;往表里插入了约10万条数据。
1.长时间未提交的事务,会不会被记录?
--只开一个会话,执行如下SQL: set autocommit=0; delete from t1 where id=3; 等待数秒钟,然后去查看slow_query_log_file参数对应的文件中是否有记录, 结论是,没有
2.等待锁的时间花费的时间超过了long_query_time的阈值,但是DML语句本身没有超过阈值,会不会被记录?
会话1: | 会话2: |
set autocommit=0; |
set autocommit=0; |
delete from t1 where id=3; select sleep(2); |
|
delete from t1 where id=3; | |
rollback; | |
commit; |
结论是:slow query log中,只记录了会话1执行的select sleep(2),两个会话分别执行过的delete from t1 where id=3则没有被记录。
3.DML语句超过了阈值,但是回滚了,是否会被记录?
--只需要开启一个会话: set autocommit=0; insert into t1 (c2) select c2 from t1; rollback; /* Affected rows: 98,305 已找到记录: 0 警告: 0 持续时间 3 queries: 00:03:38 */ 结论:被回滚的DML语句,如果超过了阈值,会被记录到slow query log中
4.同一个事务中的多条语句,部分语句超过了阈值,那么是记录所有的语句,还是超过阈值的那部分语句?
--只需要开启一个会话: set autocommit=0; insert into t1 (c2) select c2 from t1; insert into t1(c2) values('aaa'); commit; 结论:只有超过了阈值的语句会被记录,未超过的则不会被记录
最后,去源码中追根溯源:
去源码中搜索long_query_time关键字,首先可以看到sql/sys_vars.cc中有如下相关的定义:
1 static Sys_var_double Sys_long_query_time( 2 "long_query_time", 3 "Log all queries that have taken more than long_query_time seconds " 4 "to execute to file. The argument will be treated as a decimal value " 5 "with microsecond precision", 6 SESSION_VAR(long_query_time_double), 7 CMD_LINE(REQUIRED_ARG), VALID_RANGE(0, LONG_TIMEOUT), DEFAULT(10), 8 NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(0), 9 ON_UPDATE(update_cached_long_query_time)); 10 11 static bool update_cached_long_query_time(sys_var *self, THD *thd, 12 enum_var_type type) 13 { 14 if (type == OPT_SESSION) 15 thd->variables.long_query_time= 16 double2ulonglong(thd->variables.long_query_time_double * 1e6); 17 else 18 global_system_variables.long_query_time= 19 double2ulonglong(global_system_variables.long_query_time_double * 1e6); 20 return false; 21 } 22 23 24 可以看到,MySQL使用long_query_time参数的值,来更新thd->variables.long_query_time(会话级) 或 global_system_variables.long_query_time(全局级)变量的值
接着,使用variables.long_query_time关键字查找,可以看到在sql/sql_class.h中有如下定义:
1 /** 2 Update server status after execution of a top level statement. 3 4 Currently only checks if a query was slow, and assigns 5 the status accordingly. 6 Evaluate the current time, and if it exceeds the long-query-time 7 setting, mark the query as slow. 8 */ 9 void update_server_status() 10 { 11 ulonglong end_utime_of_query= current_utime(); 12 if (end_utime_of_query > utime_after_lock + variables.long_query_time) 13 server_status|= SERVER_QUERY_WAS_SLOW; 14 } 15 16 17 可以看到,判断逻辑是if (end_utime_of_query > utime_after_lock + variables.long_query_time),也就是说,如果query结束的时间 > 锁等待的时间 + long_query_time变量设置的阈值,就修改server_status为SERVER_QUERY_WAS_SLOW
那么,update_server_status()函数在哪里被调用呢?通过搜索,发现在sql/sql_parse.cc中的dispatch_command()函数会调用update_server_status():
1 bool dispatch_command(THD *thd, const COM_DATA *com_data, 2 enum enum_server_command command) 3 { 4 ...省略若干行 5 /* Finalize server status flags after executing a statement. */ 6 thd->update_server_status(); 7 ...省略若干行 8 log_slow_statement(thd); 9 ...省略若干行 10 } 11 12 可以看到,在这里调用了update_server_status(),然后据thd->server_status 是否包含 SERVER_QUERY_WAS_SLOW 标志,决定是否写入慢查询日志。 13 14 sql/log.cc中log_slow_statement()函数很简短: 15 16 void log_slow_statement(THD *thd) 17 { 18 if (log_slow_applicable(thd)) 19 log_slow_do(thd); 20 } 21 22 其中,在log_slow_applicable()函数中,根据log_slow_admin_statements参数和log_queries_not_using_indexes参数的设置,判断是否记录未使用索引的语句和管理语句。 23 然后调用log_slow_do(): 24 void log_slow_do(THD *thd) 25 { 26 THD_STAGE_INFO(thd, stage_logging_slow_query); 27 //修改MySQL Slow_queries状态值 28 thd->status_var.long_query_count++; 29 30 //根据rewritten_query变量的值,判断是否需要改写语句,比如一些涉及到password的管理语句,则需要把密码替换成hash值
//调用query_logger.slow_log_write,将slow query log写入表,或者日志文件 31 if (thd->rewritten_query.length()) 32 query_logger.slow_log_write(thd, 33 thd->rewritten_query.c_ptr_safe(), 34 thd->rewritten_query.length()); 35 else 36 query_logger.slow_log_write(thd, thd->query().str, thd->query().length); 37 }
最后,如果log_output设置的是FILE,则会调用sql/log.cc的如下函数:(*current_handler++)->log_slow()函数 -> Log_to_file_event_handler::log_slow() -> mysql_slow_log.write_slow()函数,最终调用my_b_write宏,来将slow query log持久化到磁盘文件。
题外话:如何定位或解决长时间未提交的事务呢?
方法有几种:
1、如果使用的是Percona分支,则可以使用innodb_kill_idle_transaction或kill_idle_transaction参数来自动杀掉长时间未提交的事务。 参考:https://www.percona.com/doc/percona-server/5.6/management/innodb_kill_idle_trx.html https://www.percona.com/doc/percona-server/5.7/management/innodb_kill_idle_trx.html#kill_idle_transaction
2、如果未使用Percona分支或MariaDB分支, 而使用的是官方原版的话,可以使用如下语句找出来,然后手动kill:
SELECT a.trx_id, trx_state, trx_started, b.id AS thread_id, b.info, b.user, b.host, b.db, b.command, b.state FROM information_schema.`INNODB_TRX` a, information_schema.`PROCESSLIST` b WHERE a.trx_mysql_thread_id = b.id and trx_state='RUNNING' and b.COMMAND='sleep' ORDER BY a.trx_started;
3、上述SQL只能找到是哪个会话导致的问题,却不知道执行的是什么SQL,如果是5.7以上的版本,并且开启了performance_schema的话,可以使用sys视图,参考:https://www.cnblogs.com/gaogao67/p/10790520.html 例如:
## 查看未提交的事务(3秒内未操作的事务) SELECT p.ID AS conn_id, P.USER AS login_user, P.HOST AS login_host, p.DB AS database_name, P.TIME AS trx_sleep_seconds, TIME_TO_SEC(TIMEDIFF(NOW(),T.trx_started)) AS trx_open_seconds, T.trx_started, T.trx_isolation_level, T.trx_tables_locked, T.trx_rows_locked, t.trx_state, p.COMMAND AS process_state, ( SELECT GROUP_CONCAT(REPLACE(REPLACE(REPLACE(T1.`SQL_TEXT`,' ',' '),' ',' '),' ',' ') SEPARATOR '; ') FROM performance_schema.events_statements_history AS T1 INNER JOIN performance_schema.threads AS T2 ON T1.`THREAD_ID`=T2.`THREAD_ID` WHERE T2.`PROCESSLIST_ID`=P.id ) AS trx_sql_text FROM `information_schema`.`INNODB_TRX` t INNER JOIN `information_schema`.`PROCESSLIST` p ON t.trx_mysql_thread_id=p.id WHERE t.trx_state='RUNNING' AND p.COMMAND='Sleep' AND P.TIME>3 ORDER BY T.trx_started ASC G
4、使用工具分析binlog来定位,比如这个大佬写的infobin工具(https://github.com/gaopengcarl/infobin),就可以查出哪些是大事务,哪些是长时间未提交的事务。
具体使用方法,请参考他的文档。
本文参考链接: