如何解决MySQL的RR模式下死锁一列

119次阅读

共计 5128 个字符，预计需要花费 13 分钟才能阅读完成。

如何解决 MySQL 的 RR 模式下死锁一列，针对这个问题，这篇文章详细介绍了相对应的分析和解答，希望可以帮助更多想解决这个问题的小伙伴找到更简单易行的方法。

环境：版本 5.7.29 RR 隔离级别一、案例模拟

CREATE TABLE `t8` ( `id` bigint(20) NOT NULL AUTO_INCREMENT,
 `d_id` varchar(40) NOT NULL DEFAULT  ,
 `b_id` varchar(40) NOT NULL DEFAULT  ,
 `is_dropped` tinyint(1) NOT NULL DEFAULT  0 ,
 `u_c` varchar(10) NOT NULL DEFAULT  ,
 PRIMARY KEY (`id`),
 UNIQUE KEY `DealerAndBrokerAndDropped` (`d_id`,`b_id`,`is_dropped`)
) ENGINE=InnoDB ;
insert into t8 values(1,1,1,0, a 
insert into t8 values(2,2,2,0, a 
insert into t8 values(3,3,3,0, a 
insert into t8 values(4,4,4,0, a 
insert into t8 values(5,5,5,0, a 
insert into t8 values(6,6,6,0, a 
insert into t8 values(7,7,7,0, a 
insert into t8 values(8,8,8,0, a 
insert into t8 values(9,9,9,0, a 
insert into t8 values(10,10,10,0, a 
insert into t8 values(11,11,11,0, a

执行语句如下：
|S1|S2|
|-|-|
|begin||
|select u_c from t8 where d_id=’1’and b_id=’1’and is_dropped=0 for update;||
||select u_c from t8 where d_id=’1’and b_id=’1’and is_dropped=0 for update; 处于堵塞状态 |
|update t8 set u_c=’b’where d_id=’1’and b_id=’1’; —此时触发死锁 S2 回滚 ||

发生死锁记录如下：

二、死锁分析

仔细分析我们会发现 trx id 5679 最后被堵塞需要获取的锁为（lock_mode X waiting），堵塞发生在索引 DealerAndBrokerAndDropped 上，也就是这是一个 next key lock 且需要获取的模式为 LOCK_X，处于等待状态。
而我们来看 trx id 5679 前面获取的锁是什么呢？显然可以看到为（lock_mode X locks rec but not gap），获取发生在索引 DealerAndBrokerAndDropped 上，也就是这是一个 key lock 且获取模式为 LOCK_X。
但是我们需要知道 DealerAndBrokerAndDropped 明明是一个唯一索引，获取 key lock 我们很容易理解，但是为什么也会出现获取 next key lock 呢？这个问题我们先放一下，先来分析一下整个死锁的产生的过程

S1（select 操作）
通过唯一性索引定位索引数据获取了唯一索引 DealerAndBrokerAndDropped 上的
LOCK_REC_NOT_GAP|LOCK_X，获取成功记录就是 d_id=’1’ b_id=’1’ is_dropped= 0 这条数据。
S1（select 操作）
回表获取全部数据，这个时候需要主键上的相应的行锁。LOCK_REC_NOT_GAP|LOCK_X 获取成功

S2（select 操作）
通过唯一性索引定位索引数据试图获取了唯一索引 DealerAndBrokerAndDropped 上的
LOCK_REC_NOT_GAP|LOCK_X，获取失败记录就是 d_id=’1’ b_id=’1’ is_dropped= 0 这条数据，处于等待状态。

S1（update 操作）
通过索引 DealerAndBrokerAndDropped 查找数据（注意这里已经不是唯一性定位操作了，下面会做分析），这个时候首先需要通过查询条件获取出需要更新的第一条数据，实际上这个时候也是 d_id=’1’ b_id=’1’ is_dropped= 0 这条数据，需要获取的锁为 LOCK_ORDINARY[next_key_lock]|LOCK_X，这个时候我发现虽然 S1 之前获取了这条数据的锁，但是锁模式变化了（一致不会重新获取，下面会分析这种行为），因此这里需要重新获取，但是这显然是不行的，因为 S2 都还处于等待中，因此这里也发生了等待。

因此通过这个过程就出现死锁，S2 等 S1 S1 等 S2。

三、关于锁模式的变化

关于这里我们参考函数 lock_rec_lock_fast，这里会不进行行锁冲突验证而进行快速加锁，如果锁模式没有变化则也会再这里进行快速加锁（也就是直接跳过），当然如果块中一个 row lock 都没有也会在这里进行加锁，这是每个加行锁的操作都必须经历的判断，如果不能快速加锁则进入 slow 加锁方式，这里看一下下面的这段代码：

 if (lock_rec_get_next_on_page(lock)
 || lock- trx != trx
 || lock- type_mode != (mode | LOCK_REC)
 || lock_rec_get_n_bits(lock)  = heap_no) { 
 status = LOCK_REC_FAIL;
 }

这里的 lock- trx != trx 会判断本次加锁事务和上次加锁事务是否是同一个事务，lock- type_mode != (mode | LOCK_REC)会判断锁模式是否相同。如果不能满足条件则判定为 LOCK_REC_FAIL，进入 slow 加锁方式。

而我们这里 S1 加锁第一次是 LOCK_REC_NOT_GAP|LOCK_X，而第二次是 LOCK_ORDINARY[next_key_lock]|LOCK_X，显然变化了，因此进入 slow 加锁阶段，进行冲突验证，结果嘛也就冲突了。这是本死锁的一个原因。

四、关于 LOCK_ORDINARY[next_key_lock]来历

这是本死锁的一个最重要原因，知道了这个原因这个案例就理解了。首先我们先看这个 update 语句：

update t8 set u_c= b  where d_id= 1  and b_id= 1

我们发现这个时候唯一索引还少一个条件也就是 is_dropped 字段，这个时候本次定位查询不会判定为唯一性查询，而是普通的二级索引定位方式，这个时候 RR 模式出现 LOCK_ORDINARY[next_key_lock]就显得很自然了，下面是这个判断过程，代码位于 row_search_mvcc 中。

(match_mode == ROW_SEL_EXACT
   dict_index_is_unique(index)
   dtuple_get_n_fields(search_tuple)
 == dict_index_get_n_unique(index)
   (dict_index_is_clust(index)
 || !dtuple_contains_null(search_tuple)))

稍微解释一下，唯一性查找条件至少包含如下 3 点：

索引具有唯一性

查询的字段数量和索引唯一性字段数量相同

是主键或者查询条件中不包含 NULL 值

注意第 3 点源码说明如下：

 /* Note above that a UNIQUE secondary index can contain many
 rows with the same key value if one of the columns is the SQL
 null. A clustered index under MySQL can never contain null
 columns because we demand that all the columns in primary key
 are non-null. */

满足上面 4 点条件才能确认为唯一查找，本查询由于第 3 条不满足因此，因此判定失败。
不仅如此如果本条数据加锁成功，那么你会看到如下的结果：

---TRANSACTION 25830, ACTIVE 2 sec
4 lock struct(s), heap size 1160, 3 row lock(s), undo log entries 1
MySQL thread id 5, OS thread handle 140737101231872, query id 4115 localhost root starting
show engine innodb status
TABLE LOCK table `test`.`t8` trx id 25830 lock mode IX
RECORD LOCKS space id 1050 page no 4 n bits 80 index DealerAndBrokerAndDropped of table `test`.`t8` trx id 25830 lock_mode X
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 1; hex 31; asc 1;;
 1: len 1; hex 31; asc 1;;
 2: len 1; hex 80; asc ;;
 3: len 8; hex 8000000000000001; asc ;;
RECORD LOCKS space id 1050 page no 3 n bits 80 index PRIMARY of table `test`.`t8` trx id 25830 lock_mode X locks rec but not gap
Record lock, heap no 2 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 8; hex 8000000000000001; asc ;;
 1: len 6; hex 0000000064e6; asc d ;;
 2: len 7; hex 5f000000430110; asc _ C ;;
 3: len 1; hex 31; asc 1;;
 4: len 1; hex 31; asc 1;;
 5: len 1; hex 80; asc ;;
 6: len 1; hex 62; asc b;;
RECORD LOCKS space id 1050 page no 4 n bits 80 index DealerAndBrokerAndDropped of table `test`.`t8` trx id 25830 lock_mode X locks gap before rec
Record lock, heap no 11 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
 0: len 2; hex 3130; asc 10;;
 1: len 2; hex 3130; asc 10;;
 2: len 1; hex 80; asc ;;
 3: len 8; hex 800000000000000a; asc ;;

我们发现 DealerAndBrokerAndDropped 唯一索引的吓一跳记录也加了 gap lock，这完全是 RR 模式非唯一索引的加锁行为。

最后

如果我们将语句

update t8 set u_c= b  where d_id= 1  and b_id= 1

修改为

update t8 set u_c= b  where d_id= 1  and b_id= 1 and is_dropped=0;

那么死锁将不会触发了。原因就是第三部分我们说的，这里锁模式完全一致，不会导致加锁操作了。

关于如何解决 MySQL 的 RR 模式下死锁一列问题的解答就分享到这里了，希望以上内容可以对大家有一定的帮助，如果你还有很多疑惑没有解开，可以关注丸趣 TV 行业资讯频道了解更多相关知识。

正文完

id insert into t values

发表至：数据库

2023-07-28

转载说明：除特殊说明外本站除技术相关以外文章皆由网络搜集发布，转载请注明出处。

redis内部数据结构之SDS简单动态字符串的示例分析

纯SQL查询语句如何实现神经网络

数据库SQL语句优化总结

怎么部署SharePlex环境

如何理解ERROR中的Unknown storage engine 'MyISAM'