How to Solve InnoDB Persistent Statistics Problems


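Background:
The MySQL optimizer chooses an execution plan based on statistics that InnoDB collects. Because certain operations cause these statistics to be recalculated, the plan for the same query can change repeatedly, which makes it imprecise and unstable.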

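Operations that trigger recalculation include:
1. Server restart
2. Accessing a table
3. Changes to table data (DML touching more than 1/16 of the rows)
4. SHOW TABLE STATUS and SHOW INDEX FROM table
5. ANALYZE TABLE
6. and so on
To solve this problem, MySQL 5.6 introduced persistent optimizer statistics. The statistics survive a restart and are no longer recalculated on every one of these operations; they are stored in the system tables mysql.innodb_table_stats and mysql.innodb_index_stats, as mentioned in the previous post.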
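
The two system tables can be queried like any other table. The following is a minimal sketch, assuming the schema `test` and the table `t2` that appear later in this article; substitute your own names.

```sql
-- Table-level statistics: estimated row count and index sizes (in pages).
SELECT database_name, table_name, last_update, n_rows,
       clustered_index_size, sum_of_other_index_sizes
FROM mysql.innodb_table_stats
WHERE database_name = 'test';

-- Index-level statistics: one row per index and per statistic.
SELECT index_name, stat_name, stat_value, sample_size, stat_description
FROM mysql.innodb_index_stats
WHERE database_name = 'test'
  AND table_name = 't2';
```

The last_update column shows when a given row of statistics was last recalculated, which is a quick way to see whether a recalculation has happened.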

How to configure persistent optimizer statistics:
mysql> show variables like '%innodb_stats%';
+--------------------------------------+-------------+
| Variable_name                        | Value       |
+--------------------------------------+-------------+
| innodb_stats_auto_recalc             | ON          |
| innodb_stats_method                  | nulls_equal |
| innodb_stats_on_metadata             | OFF         |
| innodb_stats_persistent              | ON          |
| innodb_stats_persistent_sample_pages | 20          |
| innodb_stats_sample_pages            | 8           |
| innodb_stats_transient_sample_pages  | 8           |
+--------------------------------------+-------------+

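1. For all InnoDB tables, set the global parameters
Global parameters:
innodb_stats_persistent: whether persistent statistics are enabled
innodb_stats_auto_recalc: whether statistics are recalculated automatically
innodb_stats_persistent_sample_pages: number of index pages sampled at random
innodb_stats_on_metadata: controls whether statistics are refreshed on metadata access, such as querying certain information_schema tables or running SHOW TABLE STATUS. When enabled, these statements also make InnoDB sample random pages, which can easily cause large swings in query performance. From 5.6 onward this parameter is of little use: it applies only to non-persistent statistics, and leaving it off does not affect the accuracy of the statistics.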
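
As a rough sketch, the global parameters above can be changed at runtime as follows. The values shown are simply the defaults from the listing earlier, not a recommendation, and the statements need a privileged account.

```sql
-- All four variables are dynamic, so no restart is required.
SET GLOBAL innodb_stats_persistent = ON;
SET GLOBAL innodb_stats_auto_recalc = ON;
SET GLOBAL innodb_stats_persistent_sample_pages = 20;
SET GLOBAL innodb_stats_on_metadata = OFF;

-- Confirm the values now in effect.
SHOW GLOBAL VARIABLES LIKE 'innodb_stats%';
```

Values set with SET GLOBAL are lost on restart; to keep them, put the same settings in the server configuration file as well.
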
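2. Per table
(1) STATS_PERSISTENT: whether persistent statistics are enabled for the InnoDB table
ALTER TABLE table_name STATS_PERSISTENT=1;
The default is determined by the innodb_stats_persistent option.
(2) STATS_AUTO_RECALC: whether persistent statistics are recalculated automatically for the InnoDB table
The default is determined by the innodb_stats_auto_recalc option. When it is 1, statistics are recalculated once 10% of the data has changed; in my tests it took somewhat more than 10%.
(3) STATS_SAMPLE_PAGES: the number of index pages to sample at random
Example: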

CREATE TABLE `t1` (
`id` int NOT NULL AUTO_INCREMENT,
`data` varchar(255) DEFAULT NULL,
`date` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_date` (`date`)
) ENGINE=InnoDB CHARSET=utf8 STATS_PERSISTENT=1 STATS_AUTO_RECALC=1 STATS_SAMPLE_PAGES=25;
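
The same three options can also be changed on an existing table. The sketch below assumes the table `t1` defined above and, purely for illustration, turns automatic recalculation off so that statistics change only when ANALYZE TABLE is run.

```sql
-- Keep persistent statistics, but stop automatic recalculation for this table.
ALTER TABLE t1 STATS_PERSISTENT=1, STATS_AUTO_RECALC=0;

-- The table options appear at the end of the table definition.
SHOW CREATE TABLE t1;

-- Hand control back to the global innodb_stats_auto_recalc setting.
ALTER TABLE t1 STATS_AUTO_RECALC=DEFAULT;
```

Disabling automatic recalculation trades freshness for plan stability, so it only makes sense if ANALYZE TABLE is run after large data changes.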

InnoDB statistics example:

mysql> select * from t2;
+----+------+------+------+------+
| a  | b    | c    | d    | e    |
+----+------+------+------+------+
|  1 |    1 |    1 |    1 |    1 |
|  2 |    1 |    1 |    2 |    2 |
|  3 |    1 |    1 |    3 |    3 |
|  4 |    1 |    1 |    4 |    4 |
|  5 |    1 |    1 |    5 |    5 |
|  6 |    1 |    1 |    6 |    6 |
|  7 |    1 |    1 |    7 |    7 |
|  8 |    1 |    1 |    8 |    8 |
|  9 |    1 |    1 |    9 |    9 |
| 10 |    1 |    1 |   10 |   10 |
+----+------+------+------+------+
10 rows in set (0.01 sec)

mysql> select * from mysql.innodb_table_stats \G
*************************** 1. row ***************************
database_name: test
table_name: t2
last_update: 2016-02-24 18:58:22
n_rows: 8
clustered_index_size: 1
sum_of_other_index_sizes: 2
1 row in set (0.00 sec)

Use ANALYZE TABLE to refresh the statistics immediately:
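
The statement itself is not shown in the listing. It was presumably something like the following, assuming the table is `test.t2` as the statistics row above indicates.

```sql
-- Recalculate and persist the statistics for t2 right away.
ANALYZE TABLE test.t2;
```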

mysql> select * from mysql.innodb_table_stats \G
*************************** 1. row ***************************
database_name: test
table_name: t2
last_update: 2016-02-24 19:00:23
n_rows: 10
clustered_index_size: 1
sum_of_other_index_sizes: 2
1 row in set (0.01 sec)
The statistics have changed: last_update is newer and n_rows now matches the 10 rows in the table.

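Effect of the number of sampled pages
The MySQL query optimizer uses key distribution statistics (cardinality) to choose indexes for an execution plan, based on the relative selectivity of each index. Running ANALYZE TABLE makes InnoDB sample random pages from every index on the table to estimate that index's cardinality.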

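To control the accuracy and stability of the statistics, you can change the following parameter:
innodb_stats_persistent_sample_pages (default 20)
Consider raising it when the statistics are not accurate enough and the optimizer chooses suboptimal plans, as shown by EXPLAIN.
Accuracy can be checked by comparing the actual cardinality of an index (for example, SELECT DISTINCT on the indexed columns) with the estimate stored in the index statistics table.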
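
The comparison described above might look like the following sketch. It assumes the table `t1` from the earlier example lives in a schema named `test`, and it checks the single-column index idx_date; both names are assumptions to adjust for your own case.

```sql
-- Actual number of distinct values in the indexed column.
-- This scans the index, and COUNT(DISTINCT ...) does not count NULLs.
SELECT COUNT(DISTINCT `date`) AS actual_cardinality
FROM test.t1;

-- InnoDB's persisted estimate for the first column of idx_date.
SELECT index_name, stat_name,
       stat_value AS estimated_cardinality, sample_size
FROM mysql.innodb_index_stats
WHERE database_name = 'test'
  AND table_name = 't1'
  AND index_name = 'idx_date'
  AND stat_name = 'n_diff_pfx01';
```

If the two numbers differ widely, the sample is probably too small for this table.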

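Of course, if automatic recalculation is enabled, the statistics are also refreshed within a few seconds once row changes reach the 10% threshold.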

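innodb_stats_persistent_sample_pages
Increasing this value makes the statistics more accurate, but it may also require more disk reads, which can slow down opening a table or running SHOW TABLE STATUS. ANALYZE TABLE becomes slower too, because its cost scales with this parameter: roughly innodb_stats_persistent_sample_pages * number of indexed columns * number of partitions. The value should not be too small either; a setting such as 1 or 2 produces inaccurate statistics.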
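
To see the effect on one table without touching the global setting, the per-table option can be raised and the statistics rebuilt. This sketch assumes the table `test.t1` from the earlier example; the value 64 is arbitrary and chosen only for illustration.

```sql
-- Rough cost by the approximation above, for t1 as created earlier:
-- 25 sample pages * 2 indexed columns (id, date) * 1 partition = about 50 page reads.

-- Sample more pages for this table only, then rebuild its statistics.
ALTER TABLE test.t1 STATS_SAMPLE_PAGES=64;
ANALYZE TABLE test.t1;

-- sample_size reports how many pages were used for each estimate.
SELECT index_name, stat_name, stat_value, sample_size
FROM mysql.innodb_index_stats
WHERE database_name = 'test'
  AND table_name = 't1';
```

On a small table the reported sample_size can be lower than the setting, since an index cannot supply more pages than it has.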
