How MySQL 5.6 and 5.7 Handle Skewed Data Distribution


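To handle skewed data, Oracle relies on an additional kind of statistic, the histogram. In MySQL 5.6 and 5.7
the only statistic of this kind is the number of distinct values in an index. To see how MySQL copes, we build the following data: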
mysql> select * from test.testf;
+------+----------+
| id   | name     |
+------+----------+
|    1 | gaopeng  |
|    2 | gaopeng1 |
|    3 | gaopeng1 |
|    4 | gaopeng1 |
|    5 | gaopeng1 |
|    6 | gaopeng1 |
|    7 | gaopeng1 |
|    8 | gaopeng1 |
|    9 | gaopeng1 |
|   10 | gaopeng1 |
+------+----------+
10 rows in set (0.00 sec)
There is an ordinary secondary index on name.
mysql> analyze table test.testf;
+------------+---------+----------+----------+
| Table      | Op      | Msg_type | Msg_text |
+------------+---------+----------+----------+
| test.testf | analyze | status   | OK       |
+------------+---------+----------+----------+
1 row in set (0.21 sec)

The two queries produce the following execution plans:
mysql> explain select * from test.testf where name='gaopeng';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | testf | NULL       | ref  | name          | name | 63      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test.testf where name='gaopeng1';
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | testf | NULL       | ALL  | name          | NULL | NULL    | NULL |   10 |    90.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

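Both plans are correct: name='gaopeng' matches only one row and uses the index, while name='gaopeng1' matches 9 rows and falls back to a full table scan.
If the optimizer knew only the number of distinct values, both predicates would get the same selectivity of 1/2, and the plan for name='gaopeng1' would be wrong.
Yet MySQL 5.6 and 5.7 both make the right choice. Why?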
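
To make the 1/2 figure concrete, here is a minimal Python sketch of a row estimate that uses only the distinct-value count of the index. It is a simplified model and not MySQL's code; the function name estimate_rows_from_cardinality is hypothetical.

```python
def estimate_rows_from_cardinality(table_rows: int, distinct_values: int) -> float:
    """Rows expected per key value when only the distinct-value count is known."""
    return table_rows / distinct_values

# The test table: 10 rows, 2 distinct values in the index on name.
names = ["gaopeng"] + ["gaopeng1"] * 9
estimate = estimate_rows_from_cardinality(len(names), len(set(names)))

# The estimate is the same for every key value, whatever the real count is.
for value in ("gaopeng", "gaopeng1"):
    actual = names.count(value)
    print(f"name='{value}': estimated {estimate:g} rows, actual {actual}")
```

Under this model both predicates look identical to the optimizer, so it cannot tell the 1-row case from the 9-row case.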
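The reason is the eq_range_index_dive_limit parameter. Let us look at the trace, shown side by side (T@2 is the gaopeng1 query, T@3 is the gaopeng query):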
T@2: | | | | | | | | | | | opt: (null): gaopeng1 = name = | T@3: | | | | | | | | | | | opt: (null): gaopeng = name = g
T@2: | | | | | | | | | | | opt: ranges: ending struct | T@3: | | | | | | | | | | | opt: ranges: ending struct
T@2: | | | | | | | | | | | opt: index_dives_for_eq_ranges: 1 | T@3: | | | | | | | | | | | opt: index_dives_for_eq_ranges: 1
T@2: | | | | | | | | | | | opt: rowid_ordered: 1 | T@3: | | | | | | | | | | | opt: rowid_ordered: 1
T@2: | | | | | | | | | | | opt: using_mrr: 0 | T@3: | | | | | | | | | | | opt: using_mrr: 0
T@2: | | | | | | | | | | | opt: index_only: 0 | T@3: | | | | | | | | | | | opt: index_only: 0
T@2: | | | | | | | | | | | opt: rows: 9 | T@3: | | | | | | | | | | | opt: rows: 1
T@2: | | | | | | | | | | | opt: cost: 11.81 | T@3: | | | | | | | | | | | opt: cost: 2.21

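In both traces index_dives_for_eq_ranges is 1, and the estimates rows: 9 and rows: 1 are both correct, so index dives are what make the plans right.
Whether dives are used is decided by the eq_range_index_dive_limit parameter (see "Equality Range Optimization of Many-Valued Comparisons" in the manual). On this server its value is 200, the MySQL 5.7 default (MySQL 5.6 defaults to 10):
mysql> show variables like '%eq%';
+--------------------------------------+-------+
| Variable_name                        | Value |
+--------------------------------------+-------+
| eq_range_index_dive_limit            | 200   |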
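
An index dive can be pictured as locating both ends of the key range in the sorted index and counting the entries between them. The sketch below mimics that with Python's bisect module on a sorted list. It is an illustration under that assumption, not InnoDB's actual B-tree code, and dive_row_count is a hypothetical name.

```python
from bisect import bisect_left, bisect_right

def dive_row_count(sorted_index: list, key: str) -> int:
    """Count index entries equal to key by finding both ends of its range."""
    return bisect_right(sorted_index, key) - bisect_left(sorted_index, key)

# Sorted key values of the secondary index on name.
name_index = sorted(["gaopeng"] + ["gaopeng1"] * 9)

print(dive_row_count(name_index, "gaopeng"))   # corresponds to rows: 1 in the trace
print(dive_row_count(name_index, "gaopeng1"))  # corresponds to rows: 9 in the trace
```

Because the count comes from the index itself, it reflects the real distribution of each value, at the price of one lookup per equality range.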

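According to the official documentation, this value relates to the number of equality ranges in the comparison.
For example:
id=1 or id=2 or id=3 contains 3 equality ranges, so the limit must be at least 3+1=4 for index dives to be used.
Index dives give accurate row estimates, but they cost extra time. The settings work as follows:
eq_range_index_dive_limit set to 1: index dives are disabled
eq_range_index_dive_limit set to 0: index dives are always used
eq_range_index_dive_limit set to N: index dives are used for up to N - 1 equality ranges.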
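
The three settings above reduce to a single rule. Here is a minimal Python sketch of it, assuming the rule is exactly as described in the text; uses_index_dives is a hypothetical helper, not a MySQL function.

```python
def uses_index_dives(eq_range_count: int, dive_limit: int) -> bool:
    """Return True if index dives would be used under the rule described above."""
    if dive_limit == 0:
        return True  # 0: always use index dives
    # Otherwise dives are used for up to dive_limit - 1 equality ranges,
    # so a limit of 1 never allows them.
    return eq_range_count < dive_limit

print(uses_index_dives(3, 4))    # id=1 or id=2 or id=3 with limit 4 -> True
print(uses_index_dives(3, 3))    # same predicate with limit 3 -> False
print(uses_index_dives(1, 1))    # single equality with limit 1 -> False
print(uses_index_dives(1, 200))  # single equality with limit 200 -> True
```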
Now set eq_range_index_dive_limit=1 and look at the plans again:
mysql> explain select * from test.testf where name='gaopeng1';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | testf | NULL       | ref  | name          | name | 63      | const |    5 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test.testf where name='gaopeng';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | testf | NULL       | ref  | name          | name | 63      | const |    5 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

The plan is now wrong: name='gaopeng1' clearly should not use the index. Let us look at the trace again:
T@3: | | | | | | | | | | | opt: ranges: ending struct
T@3: | | | | | | | | | | | opt: index_dives_for_eq_ranges: 0
T@3: | | | | | | | | | | | opt: rowid_ordered: 1
T@3: | | | | | | | | | | | opt: using_mrr: 0
T@3: | | | | | | | | | | | opt: index_only: 0
T@3: | | | | | | | | | | | opt: rows: 5
T@3: | | | | | | | | | | | opt: cost: 7.01
Here index_dives_for_eq_ranges: 0 shows that index dives are disabled, and rows: 5 is simply 10*1/2: the table's 10 rows divided by the 2 distinct values in the index.


Copyright notice: original article of this site, published by 丸趣 on 2023-07-19.
