在之前InooDB 加锁总结的文章中,讨论了大量在 RR 情况下 MySQL 如何加锁的案例。而这篇相较于前一篇,更偏重于实践,主要是在遇到锁等待或者死锁时,如何分析和解决问题。这篇讲解的案例全都基于 RC 隔离级别,MySQL 版本为 5.7.x.
我们知道,RC 相较于 RR 在很大程度上调高了并发性,降低了死锁发生的概率,因而作为大多数高并发场景的首选。
但是降低并不代表消除,如果设计的索引或者语句的写法不当,依旧会产生死锁等问题。在这篇文章中,将会围绕着一个实际案例进行讨论。
假设在数据库中有这样一张表结构:其中 ID 为主键索引,其余字段都没有索引。表中共有 6 条数据,对 id 有个印象,在后续分析时会用到。
mysql> desc device_management_service_mapping;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| dst_device_id | int(11) | YES | | NULL | |
| dst_ip | varchar(255) | YES | | NULL | |
| ipp_type | varchar(255) | YES | | NULL | |
| operation_id | int(11) | NO | | NULL | |
| packets | int(11) | YES | | NULL | |
| src_device_id | int(11) | NO | | NULL | |
| src_ip | varchar(255) | YES | | NULL | |
| type | varchar(255) | YES | | NULL | |
| created_at | datetime(6) | YES | | NULL | |
| updated_at | datetime(6) | YES | | NULL | |
| description | varchar(256) | YES | | NULL | |
+---------------+--------------+------+-----+---------+----------------+
mysql> SELECT id, src_device_id, operation_id FROM device_management_service_mapping;
+----+---------------+--------------+
| id | src_device_id | operation_id |
+----+---------------+--------------+
| 85 | 13 | 10001 |
| 86 | 13 | 10002 |
| 87 | 1 | 10001 |
| 88 | 1 | 10002 |
| 89 | 3 | 10001 |
| 90 | 3 | 10002 |
+----+---------------+--------------+
需要关注的仅是 id
, src_device_id
, operation_id
这三个字段,下面的案例将围绕这三个字段展开,分别讨论:
- 在没有索引的情况下,RC 的加锁过程。
- 在有二级索引的情况下,RC 的加锁过程。
- 以及 RC 如何通过
semi-consistant
提高并发。
准备步骤:
在分析案例前,需要收集一些日志信息,便于我们排错:
- 打开 InooDB 锁日志:
show variables like 'innodb_status_output';
SET GLOBAL innodb_status_output=ON;
show variables like 'innodb_status_output';
SET GLOBAL innodb_status_output_locks=ON;
show variables like '%tx_isolation%';
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
show engine innodb statusG;
- 打开 SQL 记录,用于分析 transcation.
SHOW VARIABLES LIKE "general_log%";
SET GLOBAL general_log = 'ON';
案例一:无索引加锁情况
Session A | Session B |
---|---|
begin; | |
SELECT * FROM device_management_service_mapping where src_device_id=1 AND operation_id=10001 FOR UPDATE; | begin; |
Query ok. | SELECT * FROM device_management_service_mapping where src_device_id=13 AND operation_id=10001 FOR UPDATE; |
block. | |
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction |
由于这里的 src_device_id 和 operation_id 均没有索引,所以我们推测在 Session A 和 Session B 执行时,加锁的过程采用的是全表扫描的方式。
mysql> explain SELECT * FROM device_management_service_mapping where src_device_id=13 AND operation_id=10001 FOR UPDATE;
+----+-------------+-----------------------------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------------------------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | device_management_service_mapping | ALL | NULL | NULL | NULL | NULL | 6 | Using where |
+----+-------------+-----------------------------------+------+---------------+------+---------+------+------+-------------+
接着来分析下加锁过程:
Session A 执行 SELECT 语句成功后,对应加锁范围是:
---TRANSACTION 873007, ACTIVE 3 sec
2 lock struct(s), heap size 360, 1 row lock(s)
MySQL thread id 27209, OS thread handle 0x7fecd45e8700, query id 5258428 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 873007 lock mode IX
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873007 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000057; asc W;;
1: len 6; hex 0000000d503d; asc P=;;
2: len 7; hex 3d000001ae0694; asc = ;;
3: len 4; hex 80000003; asc ;;
4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a812da1a000000; asc ;;
12: SQL NULL;
13: len 1; hex 31; asc 1;;
可以看到,虽然是全表扫描,但在语句执行后,并没有锁住所有行。这是因为在 RC 级别下,在搜索过程中会对所有行加锁,之后在找到对应的记录后,会释放不符合条件的行。所以仅仅锁住了 id=87 的行。
接着 Session B 执行了 SELECT 语句,然后被阻塞,对应加锁范围是:
---TRANSACTION 873008, ACTIVE 14 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s)
MySQL thread id 27208, OS thread handle 0x7fecd4522700, query id 5258431 10.124.206.88 root Sending data
SELECT * FROM device_management_service_mapping where src_device_id=13 AND operation_id=10001 FOR UPDATE
------- TRX HAS BEEN WAITING 14 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000057; asc W;;
1: len 6; hex 0000000d503d; asc P=;;
2: len 7; hex 3d000001ae0694; asc = ;;
3: len 4; hex 80000003; asc ;;
4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a812da1a000000; asc ;;
12: SQL NULL;
13: len 1; hex 31; asc 1;;
------------------
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 873008 lock mode IX
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000055; asc U;;
1: len 6; hex 0000000d5032; asc P2;;
2: len 7; hex 37000003351908; asc 7 5 ;;
3: len 4; hex 80000001; asc ;;
4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 8000000d; asc ;;
9: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a812da14000000; asc ;;
12: SQL NULL;
13: len 2; hex 3131; asc 11;;
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000057; asc W;;
1: len 6; hex 0000000d503d; asc P=;;
2: len 7; hex 3d000001ae0694; asc = ;;
3: len 4; hex 80000003; asc ;;
4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a812da1a000000; asc ;;
12: SQL NULL;
13: len 1; hex 31; asc 1;;
---TRANSACTION 873007, ACTIVE 41 sec
2 lock struct(s), heap size 360, 1 row lock(s)
MySQL thread id 27209, OS thread handle 0x7fecd45e8700, query id 5258428 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 873007 lock mode IX
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873007 lock_mode X locks rec but not gap
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000057; asc W;;
1: len 6; hex 0000000d503d; asc P=;;
2: len 7; hex 3d000001ae0694; asc = ;;
3: len 4; hex 80000003; asc ;;
4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a812da1a000000; asc ;;
12: SQL NULL;
13: len 1; hex 31; asc 1;;
原来 Session A 对应的是 thread id 27209
,加锁范围没有任何变化。
着重分析 Session B mySQL thread id 27208
.
Session B 由于没有索引,执行全表扫描。从 id=85 开始,这里由于 Session A 仅对 id=87 的行加上了写锁。所以 Session B 是可以获取 id=85 的 X 锁的,id=86 同理,由于不符合过滤条件,加锁后又被释放。接着遍历到 id=87, 出现了锁等待,被阻塞。
日志中这两点都可以证明,
------- **TRX HAS BEEN WAITING 14 SEC FOR THIS LOCK TO BE GRANTED:**
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000057; asc W;;
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 873008 **lock_mode X locks rec but not gap waiting**
Record lock, heap no 4 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000057; asc W;;
需要注意的是,在加锁的过程中,无论找没找到符和的行,都需要从第一行开始,一直到最后一行,因为没有索引,需要进行全表扫描。在搜索结束后,会将不符合条件的行进行释放。
案例二:无索引加锁,造成死锁
Session A | Session B |
---|---|
begin: | |
SELECT * FROM device_management_service_mapping where src_device_id=3 AND operation_id=10001 FOR UPDATE; | begin; |
Query ok. | |
UPDATE device_management_service_mapping SET description='test' WHERE id=89; | |
SELECT * FROM device_management_service_mapping where src_device_id=13 AND operation_id=10001 FOR UPDATE; | |
block; | |
SELECT * FROM device_management_service_mapping where src_device_id=3 AND operation_id=10002 FOR UPDATE; | |
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction. | |
Query ok. |
分析下过程:
- 首先 Session A 执行 SELECT * FROM table FOR UPDATE. 由于这里 src_device_id 和 operation_id 没有索引,会走全表扫描。会把主键索引所有的行加上 X 锁,在查询结束后,仅持有id=89的行锁。
- 接着对 id=89 的数据也就是(src_device_id=3 AND operation_id=10001)进行更新。
- 然后 Session B 需要对 src_device_id=13 AND operation_id=10001 进行查找,同样需要走全表扫描,期望为所有主键索引加上 X 写锁。但这时 id=89 已经被 Session A 持有,所以被阻塞。可能会问,明明第一行的数据已经找到了,为什么不停止搜索,这是因为没有索引,需要进行全表扫描。此时 Session B 持有的锁是 id=85 的写锁及其期待索引 id=89 的写锁,进而被阻塞。
- 紧接着,Session A 又执行了一次 For Update 语句,需要重新全表扫描。从 id=85 开始申请 X 锁,但由于已经被 Session B 锁持有,形成阻塞状态,等待 Session A 执行完成。
- 这样就形成了死锁,Session B 等待 Session A 释放 X 锁(id=89),Session A 又等待 Session B 释放 id=85 的 X 锁。进而抛出死锁异常。
接着,详细分析下死锁的日志:
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-12-09 13:30:34 7fecdd3eb700
*** (1) TRANSACTION:
TRANSACTION 872502, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s)
MySQL thread id 27234, OS thread handle 0x7fec0feb5700, query id 5255123 10.124.207.150 root Sending data
SELECT `device_management_service_mapping`.`id`, `device_management_service_mapping`.`dst_device_id`, `device_management_se_management_service_mapping`.`operation_id`, `device_management_service_mapping`.`packets`, `device_management_service_management_service_mapping`.`type`, `device_management_service_mapping`.`created_at`, `device_management_service_mapping`.`uement_service_mapping` WHERE (`device_management_service_mapping`.`operation_id` = 10001 AND `device_management_service_ma
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id
Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000059; asc Y;;
1: len 6; hex 0000000d5035; asc P5;;
2: len 7; hex 38000002731b5e; asc 8 s ^;;
3: len 4; hex 80000001; asc ;;
4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000003; asc ;;
9: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a812da19000000; asc ;;
12: SQL NULL;
13: SQL NULL;
*** (2) TRANSACTION:
TRANSACTION 872501, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
MySQL thread id 27233, OS thread handle 0x7fecdd3eb700, query id 5255127 10.124.207.150 root Sending data
SELECT `device_management_service_mapping`.`id`, `device_management_service_mapping`.`dst_device_id`, `device_management_se_management_service_mapping`.`operation_id`, `device_management_service_mapping`.`packets`, `device_management_service_management_service_mapping`.`type`, `device_management_service_mapping`.`created_at`, `device_management_service_mapping`.`uement_service_mapping` WHERE (`device_management_service_mapping`.`operation_id` = 10002 AND `device_management_service_ma
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id
Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000059; asc Y;;
1: len 6; hex 0000000d5035; asc P5;;
2: len 7; hex 38000002731b5e; asc 8 s ^;;
3: len 4; hex 80000001; asc ;;
4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000003; asc ;;
9: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a812da19000000; asc ;;
12: SQL NULL;
13: SQL NULL;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 164 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id
Record lock, heap no 2 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
0: len 4; hex 80000055; asc U;;
1: len 6; hex 0000000d5032; asc P2;;
2: len 7; hex 37000003351908; asc 7 5 ;;
3: len 4; hex 80000001; asc ;;
4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 8000000d; asc ;;
9: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a812da14000000; asc ;;
12: SQL NULL;
13: len 2; hex 3131; asc 11;;
*** WE ROLL BACK TRANSACTION (1)
可以看到显示有两个事务,(1) TRANSACTION 和 (2) TRANSACTION,对应使用的 Thread id 是 27234 和 27233.
先看 (1) TRANSACTION:
(1) WAITING FOR THIS LOCK TO BE GRANTED:
表示处于阻塞状态,等待加锁。
想要加上的锁类型为:Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
对应就是 X 锁。
加锁的对象 id 为 0: len 4; hex 80000055; asc U;;
id=85 的对象。
再看 (2) TRANSACTION:
*** (2) HOLDS THE LOCK(S):
表示目前持有的锁。
Record lock, heap no 6 PHYSICAL RECORD: n_fields 14; compact format; info bits 0
加锁的范围是 X 锁。
持有锁的对象是:0: len 4; hex 80000059; asc Y;;
为 id=89 的对象。
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
表示想要加锁,目前被阻塞。同样想加 X 锁。
想加锁的对象为 0: len 4; hex 80000055; asc U;;
id=85.
进而产生死锁,MySQL 采取的方案是 rollback TRANSACTION (1)
, 让事务2 继续执行。
对应到上面的例子,事务2 就是 Session A,事务 1 是 Session B. 最后的结果就是 Session A 执行成功,Session B 被回滚。
那么有一个问题,为什么 TRANSACTION (1)
想要锁的对象是 id=85 的行呢?TRANSACTION (2)
的第二步为什么也想要锁住 id=85 的行呢?
原因就在于 operation_id 和 src_device_id 都是没有索引的,如果想要加锁的话,都需要从第一行 id=85 的行开始,进行全表扫描。
可见在 RC 情况下虽然已经减少了锁的类型和范围,但如果没对合适的字段设置索引,依然很容易出现死锁的情况。
案例三:半一致性读,提高 RC 并发
先看下官网给的定义,在 RC 级别下:
-
对于
UPDATE
或者DELETE
操作来说,InnoDB
仅仅会锁住更新或者删除的行。在 MySQL 根据 Where 条件,搜索后,不满足条件的行会被释放。这样做可以很好地降低死锁发生的概率,但仍然可以发生(比如案例二的例子)。 -
对于
UPDATE
操作来说,在 RC 级别下,如果一个行被锁上后,InooDB 会执行半一致性读的操作,通过返回最近的 commit 版本,来判断当前锁定的行是否符合 WHERE 条件。如果不匹配,不会对该记录加锁,如果匹配,会再次读取该行进行加锁或者阻塞来等待锁定该行。
来看一个具体的例子:
# 初始化一张表 t,
CREATE TABLE t (a INT NOT NULL, b INT) ENGINE = InnoDB;
INSERT INTO t VALUES (1,2),(2,3),(3,2),(4,3),(5,2);
COMMIT;
注意 a 和 b 都没有索引,在搜索时,会使用隐藏的聚簇索引(主键索引)进行搜索。
假设有这样两个 Session
Session A | Session B |
---|---|
START TRANSACTION; | |
UPDATE t SET b = 5 WHERE b = 3; | START TRANSACTION; |
UPDATE t SET b = 4 WHERE b = 2; |
对于 Session A 来说:会对全表的每一行进行加锁,然后在找到匹配的行后,释放其他不匹配的行的锁。
x-lock(1,2); unlock(1,2) # 释放锁
x-lock(2,3); update(2,3) to (2,5); retain x-lock # 持有锁
x-lock(3,2); unlock(3,2) # 释放锁
x-lock(4,3); update(4,3) to (4,5); retain x-lock # 持有锁
x-lock(5,2); unlock(5,2) # 释放锁
对于 Session B 来说:InooDB 会进行 semi-consistent
读(半一致性),首先回当前每一行的最近提交版本。然后通过 WHERE 条件判断需要更新的行是否能被锁上。发现 (1,2), (3,2), (5,2) 都可以获取到锁进行更新。
而对于(2,3) 和 (4,3) 这两条记录,由于不符合 WHERE 条件,进而对其不加锁,意味着和 Session A 持有的锁并不冲突,进而可以正常更新。
是 (2,3) 和 (4,3) 是因为 Session A 并没提交。
x-lock(1,2); update(1,2) to (1,4); retain x-lock
x-lock(2,3); unlock(2,3) # 释放锁
x-lock(3,2); update(3,2) to (3,4); retain x-lock
x-lock(4,3); unlock(4,3) # 释放锁
x-lock(5,2); update(5,2) to (5,4); retain x-lock
还记着案例一中,两条 FOR UPDATE
出现时,后面的被阻塞的例子吗。这里没有被阻塞,就是利用半一致性读对 UPDATE 操作做的优化,从而提高并发性。
这里再看另外一种情况:
CREATE TABLE t (a INT NOT NULL, b INT, c INT, INDEX (b)) ENGINE = InnoDB;
INSERT INTO t VALUES (1,2,3),(2,2,4);
COMMIT;
# Session A
START TRANSACTION;
UPDATE t SET b = 3 WHERE b = 2 AND c = 3;
# Session B
UPDATE t SET b = 4 WHERE b = 2 AND c = 4;
这里对 b 加上了一条二级索引后,结果就不一样了,半一致性读的效果就不能再生效。Session B 操作会被阻塞。
首先,InooDB 会根据 WHERE
条件找到 b 的索引树,然对 b=2 这行记录加锁。
然后 Session B 也会根据 b 的索引树,对 b=2 的每一行记录加锁,但在加锁过程中发现,由于 Session A 已经持有b=2的记录锁,所以被阻塞。
也就是说半一致性读在这里没有生效,仅会对聚簇索引(主键索引)生效。
半一致性读仅发生在 RC 或者开启 innodb_locks_unsafe_for_binlog 的情况下。
案例四:非唯一索引加锁情况
Session A | Session B |
---|---|
begin; | |
SELECT * FROM device_management_service_mapping where src_device_id=1 FOR UPDATE; | begin; |
Query ok. | SELECT * FROM device_management_service_mapping where src_device_id=13 FOR UPDATE; |
Query ok. | |
现在对 src_device_id 设置了二级索引,现在重新来看下加锁情况。
执行 Session A 后, 如下图,对 id=87 和 id=99 的主键索引加了 X 锁,对二级索引 src_device_id=1 的两条记录加了写锁。
---TRANSACTION 912995, ACTIVE 8 sec
3 lock struct(s), heap size 360, 4 row lock(s)
MySQL thread id 33924, OS thread handle 0x7fec0fe31700, query id 5483145 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 912995 lock mode IX
RECORD LOCKS space id 166 page no 6 n bits 80 index `device_management_service_mapping_src_device_id_84c09d1d` of table `ipsla`.`device_management_service_mapping` trx id 912995 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 80000001; asc ;;
1: len 4; hex 80000058; asc X;;
Record lock, heap no 8 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 80000001; asc ;;
1: len 4; hex 80000057; asc W;;
RECORD LOCKS space id 166 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 912995 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
0: len 4; hex 80000058; asc X;;
1: len 6; hex 0000000decd0; asc ;;
2: len 7; hex ab0000026d0110; asc m ;;
3: len 4; hex 8000000d; asc ;;
4: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002712; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a814b22d000000; asc - ;;
12: SQL NULL;
13: SQL NULL;
14: len 7; hex 312c3130303032; asc 1,10002;;
Record lock, heap no 8 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
0: len 4; hex 80000057; asc W;;
1: len 6; hex 0000000deccd; asc ;;
2: len 7; hex a90000015f0110; asc _ ;;
3: len 4; hex 80000003; asc ;;
4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a814b22d000000; asc - ;;
12: SQL NULL;
13: SQL NULL;
14: len 7; hex 312c3130303031; asc 1,10001;;
执行 Session B,关注比较靠前事务,发现把 id 为 85 和 86 的主键索引加上了写锁,对二级索引 13 的两条记录加上了 X 锁。
---TRANSACTION 913004, ACTIVE 3 sec
3 lock struct(s), heap size 360, 4 row lock(s)
MySQL thread id 33925, OS thread handle 0x7fec0feb5700, query id 5483176 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 913004 lock mode IX
RECORD LOCKS space id 166 page no 6 n bits 80 index `device_management_service_mapping_src_device_id_84c09d1d` of table `ipsla`.`device_management_service_mapping` trx id 913004 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 8000000d; asc ;;
1: len 4; hex 80000055; asc U;;
Record lock, heap no 6 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 8000000d; asc ;;
1: len 4; hex 80000056; asc V;;
RECORD LOCKS space id 166 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 913004 lock_mode X locks rec but not gap
Record lock, heap no 6 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
0: len 4; hex 80000056; asc V;;
1: len 6; hex 0000000decdb; asc ;;
2: len 7; hex b1000001930110; asc ;;
3: len 4; hex 80000003; asc ;;
4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002712; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 8000000d; asc ;;
9: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a814b227000000; asc ' ;;
12: SQL NULL;
13: SQL NULL;
14: len 8; hex 31332c3130303032; asc 13,10002;;
Record lock, heap no 10 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
0: len 4; hex 80000055; asc U;;
1: len 6; hex 0000000decd8; asc ;;
2: len 7; hex af000001650110; asc e ;;
3: len 4; hex 80000001; asc ;;
4: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 8000000d; asc ;;
9: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a814b227000000; asc ' ;;
12: SQL NULL;
13: SQL NULL;
14: len 8; hex 31332c3130303031; asc 13,10001;;
---TRANSACTION 912995, ACTIVE 99 sec
3 lock struct(s), heap size 360, 4 row lock(s)
MySQL thread id 33924, OS thread handle 0x7fec0fe31700, query id 5483145 10.124.206.88 root
TABLE LOCK table `ipsla`.`device_management_service_mapping` trx id 912995 lock mode IX
RECORD LOCKS space id 166 page no 6 n bits 80 index `device_management_service_mapping_src_device_id_84c09d1d` of table `ipsla`.`device_management_service_mapping` trx id 912995 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 80000001; asc ;;
1: len 4; hex 80000058; asc X;;
Record lock, heap no 8 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 4; hex 80000001; asc ;;
1: len 4; hex 80000057; asc W;;
RECORD LOCKS space id 166 page no 3 n bits 80 index `PRIMARY` of table `ipsla`.`device_management_service_mapping` trx id 912995 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
0: len 4; hex 80000058; asc X;;
1: len 6; hex 0000000decd0; asc ;;
2: len 7; hex ab0000026d0110; asc m ;;
3: len 4; hex 8000000d; asc ;;
4: len 12; hex 31302e3132342e302e313539; asc 10.124.0.159;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002712; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a814b22d000000; asc - ;;
12: SQL NULL;
13: SQL NULL;
14: len 7; hex 312c3130303032; asc 1,10002;;
Record lock, heap no 8 PHYSICAL RECORD: n_fields 15; compact format; info bits 0
0: len 4; hex 80000057; asc W;;
1: len 6; hex 0000000deccd; asc ;;
2: len 7; hex a90000015f0110; asc _ ;;
3: len 4; hex 80000003; asc ;;
4: len 12; hex 31302e3132342e302e313538; asc 10.124.0.158;;
5: len 4; hex 49505030; asc IPP0;;
6: len 4; hex 80002711; asc ' ;;
7: len 4; hex 800000c8; asc ;;
8: len 4; hex 80000001; asc ;;
9: len 12; hex 31302e3132342e302e313537; asc 10.124.0.157;;
10: len 8; hex 696e7465726e6574; asc internet;;
11: len 8; hex 99a814b22d000000; asc - ;;
12: SQL NULL;
13: SQL NULL;
14: len 7; hex 312c3130303031; asc 1,10001;;
这里比较特殊的是,Session B 在加锁时并没有被阻塞,原因在于 Session 先通过二级索引,进行树搜索找到 src_device_id=13 的记录。然后在此记录上开始进行遍历操作,也就是会加锁。
首先,第一个加锁的对象是 src_device_id=13, id = 85 的记录,由于并不是唯一索引,所以会继续遍历,给 src_device_id=13, id=86 的记录加锁。然后接着遍历,找到 id=87, src_device_id=1 的记录。发现不满足条件,就此结束。
其实在刚开始写这个例子时,第一想法是 Session B 会阻塞,原因在于在遍历到 id=87, src_device_id=1 时,Session A 已经写了 X 写锁,此时 Session B 应该无法读取,估计是 MySQL 做了优化,允许读取,并发现该值不匹配到 Where 条件的值,接着释放了。
死锁分析流程
下面来简单总结在,在死锁等情况出现时,如果排查故障:
- 抓出 SQL 日志,结合 Thread id 分析日志执行情况,简单写了个分析脚本,会把相同 Thread 的执行过程打印出来。
raw_str = """
201209 13:30:22 27225 Connect root@10.124.207.150 on ipsla
27225 Query SET autocommit=0
27225 Query SET autocommit=1
27225 Query SET SESSION TRANSACTION ISOLATI
...............................
"""
lines = raw_str.split('
')
number_dict = {}
for line in lines:
number = re.search(r's(ddddd)s', line)
if number:
# print(number.group())
number_dict[number.group()] = []
# print(number_dict)
for line in lines:
number = re.search(r's(ddddd)s', line)
if number:
if number.group() in number_dict:
number_dict[number.group()].append(line)
for key in number_dict:
for line in number_dict[key]:
print(line)
print('---------------- new Thread -------------------------------')
- 根据
show engine innodb statusG;
查到的死锁 thread id 和锁信息,对应到分析后的文件中,得出执行过程。 - 进行复现, 得出结论,做出优化。