This article explains how to deploy MySQL Group Replication (MGR): a three-node setup from installation through verification, followed by a summary of the errors encountered along the way and how to fix them.
I. Environment Preparation

Name               Version / Value     Notes
Operating system   RHEL6.5_X86_64
Database           5.7.18-15           Percona binary distribution
Replication nodes  10.26.7.129         node1
                   10.26.7.142         node2
                   10.26.7.166         node3
II. Detailed MGR Deployment Steps
1. Install MySQL 5.7
This walkthrough uses the Percona-Server-5.7.18-15-Linux.x86_64.ssl101 binary distribution; the installation steps themselves are omitted.
2. my.cnf parameters on node 1 (the primary write node)
#replicate
server-id=1001
skip-slave-start = false
read-only = false
expire_logs_days = 2
max_binlog_size = 1G
max_binlog_cache_size = 2G
log-bin = /home/mysql/mysql-bin
log-bin-index = /home/mysql/bin-index
binlog_format = row
log-slave-updates = 1
sync_binlog = 1
log-slow-slave-statements = 1
max-relay-log-size = 1G
relay-log = /home/mysql/mysql-relay
relay-log-index = /home/mysql/relay-index
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
#group replication
transaction_write_set_extraction =XXHASH64
loose-group_replication_group_name = 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec #must be a valid UUID
loose-group_replication_start_on_boot =off
loose-group_replication_local_address = 10.26.7.129:24001 #each node uses its own IP and a port distinct from the MySQL port 3306
loose-group_replication_group_seeds = 10.26.7.129:24001,10.26.7.142:24001,10.26.7.166:24001
loose-group_replication_bootstrap_group =off
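Only server-id and loose-group_replication_local_address differ between the three nodes, so the per-node fragment can be generated mechanically. A minimal sketch in Python; the node list and the 24001 group-communication port come from this walkthrough, while the server-id values for node2/node3 and the helper name are my own assumptions:

```python
# Generate the node-specific group replication settings for each member.
# IPs and port 24001 are from this walkthrough; server-ids 1002/1003 are assumed.
NODES = {
    "node1": ("10.26.7.129", 1001),
    "node2": ("10.26.7.142", 1002),
    "node3": ("10.26.7.166", 1003),
}
GR_PORT = 24001
SEEDS = ",".join(f"{ip}:{GR_PORT}" for ip, _ in NODES.values())

def gr_fragment(node: str) -> str:
    """Return the my.cnf lines that must differ per node."""
    ip, server_id = NODES[node]
    return (
        f"server-id={server_id}\n"
        f"loose-group_replication_local_address = {ip}:{GR_PORT}\n"
        f"loose-group_replication_group_seeds = {SEEDS}\n"
    )

print(gr_fragment("node2"))
```

The group_seeds line is identical everywhere; only the local address and server-id change.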
3. Create the replication account (primary node)
set sql_log_bin=0;
create user rpl_user@'%';
grant replication slave on *.* to rpl_user@'%' identified by 'rpl_pass';
flush privileges;
set sql_log_bin=1;
change master to master_user='rpl_user', master_password='rpl_pass' for channel 'group_replication_recovery';
4. Install the group replication plugin and start group replication (primary node)
Install the plugin:
install plugin group_replication soname 'group_replication.so';
Verify that it is installed:
show plugins;
+-------------------+--------+-------------------+----------------------+---------+
| Name              | Status | Type              | Library              | License |
+-------------------+--------+-------------------+----------------------+---------+
| group_replication | ACTIVE | GROUP REPLICATION | group_replication.so | GPL     |
+-------------------+--------+-------------------+----------------------+---------+
Start group replication:
set global group_replication_bootstrap_group=ON;
start group_replication;
set global group_replication_bootstrap_group=OFF;
Check the group members and their state:
select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec | node1       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
5. Add member instances node2 and node3
*** Before adding nodes, make sure hostnames resolve. If no DNS server is configured, add entries to /etc/hosts on every node: ***
10.26.7.166 node3
10.26.7.142 node2
10.26.7.129 node1
*** In each node's my.cnf, server-id and loose-group_replication_local_address (e.g. node2:24001) must be set to node-specific values ***
On node2:
set sql_log_bin=0;
create user rpl_user@'%';
grant replication slave on *.* to rpl_user@'%' identified by 'rpl_pass';
flush privileges;
set sql_log_bin=1;
change master to master_user='rpl_user', master_password='rpl_pass' for channel 'group_replication_recovery';
install plugin group_replication soname 'group_replication.so';
show plugins;
set global group_replication_allow_local_disjoint_gtids_join=ON;
start group_replication;
Run the same commands on node3.
Then check the group membership again:
select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec | node1       | 3306        | ONLINE       |
| group_replication_applier | 35e38786-66bb-11e7-bcc3-b8ca3a6a61a4 | node2       | 3306        | ONLINE       |
| group_replication_applier | 3bbedb1e-66bb-11e7-8fc0-b8ca3a6a7c48 | node3       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
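A health check like the query above can be automated: fetch the rows from performance_schema.replication_group_members and verify that every expected member is ONLINE. A sketch that operates on (host, state) tuples; fetching the rows through a MySQL client library is assumed and not shown:

```python
def group_is_healthy(members, expected_count):
    """True when the expected number of members is present and all are ONLINE."""
    if len(members) != expected_count:
        return False
    return all(state == "ONLINE" for _host, state in members)

# Rows as they would come back from replication_group_members.
rows = [("node1", "ONLINE"), ("node2", "ONLINE"), ("node3", "ONLINE")]
print(group_is_healthy(rows, 3))  # True: full, healthy group
```

A member in RECOVERING or UNREACHABLE state, or a missing member, makes the check fail.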
Test that the group replicates correctly:
(root:localhost:Sat Jul 15 13:26:33 2017)[(none)] create database dbtest;
Query OK, 1 row affected (0.01 sec)
(root:localhost:Sat Jul 15 13:26:40 2017)[(none)] use dbtest;
Database changed
(root:localhost:Sat Jul 15 13:26:45 2017)[dbtest] create table t1(id int primary key);
Query OK, 0 rows affected (0.01 sec)
(root:localhost:Sat Jul 15 13:26:54 2017)[dbtest] insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)
Query the data on node2 and node3:
(root:localhost:Sat Jul 15 12:57:32 2017)[db01] use dbtest;
Database changed
(root:localhost:Sat Jul 15 13:27:26 2017)[dbtest] select * from t1;
+—-+
| id |
+—-+
| 1 |
+—-+
1 row in set (0.00 sec)
III. Errors and Troubleshooting
1. Error case 01
Error message: 2017-07-15T01:36:06.929941Z 4 [ERROR] Plugin group_replication reported: The group name group-replication-test is not a valid UUID
Cause: loose-group_replication_group_name was not given in UUID format, so the value was rejected.
Solution: set it to a valid UUID, e.g. loose-group_replication_group_name = 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec
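The UUID requirement can be checked up front with Python's uuid module before the value ever reaches my.cnf; the helper name here is my own:

```python
import uuid

def is_valid_group_name(name: str) -> bool:
    """group_replication_group_name must parse as a UUID."""
    try:
        uuid.UUID(name)
        return True
    except ValueError:
        return False

print(is_valid_group_name("2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec"))  # True
print(is_valid_group_name("group-replication-test"))                # False
```

On a running server, `select uuid();` is a convenient way to generate a fresh value.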
2. Error case 02
Error message:
2017-07-15T01:29:27.271909Z 0 [Warning] unknown variable loose-group_replication_group_name=group-replication-test
2017-07-15T01:29:27.271926Z 0 [Warning] unknown variable loose-group_replication_start_on_boot=off
2017-07-15T01:29:27.271930Z 0 [Warning] unknown variable loose-group_replication_local_address=10.26.7.129:3306
2017-07-15T01:29:27.271935Z 0 [Warning] unknown variable loose-group_replication_group_seeds=10.26.7.129:3306,10.26.7.142:3306,10.26.7.166:3306
2017-07-15T01:29:27.271939Z 0 [Warning] unknown variable loose-group_replication_bootstrap_group=off
Cause: these loose- prefixed parameters were set before the group_replication plugin was installed, so the server did not recognize them (the loose- prefix downgrades the error to a warning).
Solution: install the plugin: install plugin group_replication soname 'group_replication.so'; (uninstall plugin group_replication; removes it; show plugins; lists what is installed).
3. Error case 03
Error message:
2017-07-15T01:54:54.447829Z 0 [Note] Plugin group_replication reported: Unable to bind to 0.0.0.0:3306 (socket=60, errno=98)!
2017-07-15T01:54:54.447948Z 0 [ERROR] Plugin group_replication reported: Unable to announce tcp port 3306. Port already in use?
2017-07-15T01:54:54.448101Z 0 [ERROR] Plugin group_replication reported: [GCS] Error joining the group while waiting for the network layer to become ready.
Cause: the configured group replication communication port conflicts with the MySQL instance port.
Solution: give group communication its own port on every node:
loose-group_replication_local_address = 10.26.7.129:24001 #each node uses its own IP and a port distinct from the MySQL port 3306
loose-group_replication_group_seeds = 10.26.7.129:24001,10.26.7.142:24001,10.26.7.166:24001
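This misconfiguration can be caught before restarting by comparing the port in group_replication_local_address with the instance port. A small sketch using pure string checks; the helper is my own:

```python
def gr_port_conflicts(local_address: str, mysqld_port: int = 3306) -> bool:
    """True if the group communication port collides with the MySQL port."""
    _host, _, port = local_address.rpartition(":")
    return int(port) == mysqld_port

print(gr_port_conflicts("10.26.7.129:3306"))   # True: would trigger the errno 98 error above
print(gr_port_conflicts("10.26.7.129:24001"))  # False: dedicated group communication port
```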
4. Error case 04
Error message:
2017-07-15T04:20:01.249529Z 21 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'rpl_user@node2:3306' - retry-time: 60 retries: 1, Error_code: 2005
Cause: no DNS or /etc/hosts resolution is configured, so the node cannot connect to the other members by hostname.
Solution: add the following entries to /etc/hosts on every node:
10.26.7.166 node3
10.26.7.142 node2
10.26.7.129 node1
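Whether every member hostname is covered can be checked mechanically by parsing the hosts file. A sketch over /etc/hosts-style text; the parser is my own and ignores comments:

```python
def parse_hosts(text: str) -> dict:
    """Map hostname -> IP from /etc/hosts-style content, ignoring comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name] = ip
    return mapping

HOSTS = """\
10.26.7.166 node3
10.26.7.142 node2
10.26.7.129 node1
"""
missing = {"node1", "node2", "node3"} - set(parse_hosts(HOSTS))
print(missing)  # set() when every member resolves
```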
5. Error case 05
Error message:
2017-07-15T03:42:45.395407Z 288 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Can't create database 'db01'; database exists' on query. Default database: 'db01'. Query: 'create database db01', Error_code: 1007
2017-07-15T03:42:45.395472Z 288 [Warning] Slave: Can't create database 'db01'; database exists Error_code: 1007
2017-07-15T03:42:45.395503Z 288 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000005' position 434
Cause: the node left the group at some point and is now rejoining, but data created while it was a member still exists locally and conflicts with recovery.
Solution: drop the conflicting database on the joining node. Because a non-primary node is read-only, first run set global super_read_only=0;, then drop database db01;, and rejoin the group:
set global group_replication_allow_local_disjoint_gtids_join=ON;
start group_replication;
6. Error case 06
Error message:
2017-07-15T03:44:09.982428Z 18 [ERROR] Slave SQL for channel 'group_replication_recovery': Error 'Can't create database 'db01'; database exists' on query. Default database: 'db01'. Query: 'create database db01', Error_code: 1007
2017-07-15T03:44:09.982493Z 18 [Warning] Slave: Can't create database 'db01'; database exists Error_code: 1007
2017-07-15T03:44:09.982522Z 18 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000005' position 434
Cause: same as error case 05.
Solution: same as error case 05.
7. Error case 07
Error message:
2017-07-15T03:49:10.370846Z 0 [ERROR] Plugin group_replication reported: This member has more executed transactions than those present in the group. Local transactions: 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec:1-4,35e38786-66bb-11e7-bcc3-b8ca3a6a61a4:1 Group transactions: 2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec:1-16
Cause: as in error case 05, the node executed extra local transactions that are not present in the group.
Solution: as in error case 05; the node can rejoin directly:
set global group_replication_allow_local_disjoint_gtids_join=ON;
start group_replication;
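The check behind the "more executed transactions" error can be reproduced by parsing the two GTID sets from the log: the member is rejected when its executed set contains a UUID or interval the group does not cover. A simplified sketch that handles only the single-range-per-UUID form seen in the log above (real GTID sets can hold multiple ranges per UUID, which this ignores):

```python
def parse_gtid_set(s: str) -> dict:
    """Parse 'uuid:1-4,uuid2:1' into {uuid: (lo, hi)}, one range per uuid."""
    out = {}
    for part in s.split(","):
        source, _, rng = part.strip().rpartition(":")
        lo, _, hi = rng.partition("-")
        out[source] = (int(lo), int(hi or lo))
    return out

def has_extra_transactions(local: str, group: str) -> bool:
    """True if the local set contains GTIDs absent from the group's set."""
    g = parse_gtid_set(group)
    for source, (lo, hi) in parse_gtid_set(local).items():
        if source not in g:
            return True
        glo, ghi = g[source]
        if lo < glo or hi > ghi:
            return True
    return False

# The sets from the error message above: node2 has a local GTID the group lacks.
local = "2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec:1-4,35e38786-66bb-11e7-bcc3-b8ca3a6a61a4:1"
group = "2faa74b8-66bb-11e7-ab7e-b8ca3a6a61ec:1-16"
print(has_extra_transactions(local, group))  # True
```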
8. Error case 08
Error message:
ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.
Cause: table t1 was created on the primary node without a primary key (the table definition itself replicates, but any DML on it then fails).
Solution: add a primary key, then retry the DML (MGR requires every table to have a primary key):
alter table t1 add primary key(id);
insert into t1 values(1),(2);
9. Error case 09
Error message:
mysqldump -R -E --triggers --single-transaction --master-data=2 -B db01 > db01.sql
Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don't want to restore GTIDs, pass --set-gtid-purged=OFF. To make a complete dump, pass --all-databases --triggers --routines --events.
mysqldump: Couldn't execute 'SAVEPOINT sp': The MySQL server is running with the --transaction-write-set-extraction!=OFF option so it cannot execute this statement (1290)
Cause: with transaction_write_set_extraction enabled, the server rejects SAVEPOINT, so mysqldump's --single-transaction consistent backup cannot run under MGR.
Solution: back up with xtrabackup, or run mysqldump without --single-transaction.
10. Error case 10
Error message:
create table t2 as select * from t1;
ERROR 1786 (HY000): Statement violates GTID consistency: CREATE TABLE … SELECT.
Cause: MGR runs with GTID mode enabled, so any statement GTID does not support (such as CREATE TABLE ... SELECT) is also unsupported under MGR.
Solution: split it into two statements: create table t2 like t1; followed by insert into t2 select * from t1;
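For the simple `select * from <src>` form above, the split can be mechanized. A hypothetical helper for illustration only; its naive regex does not handle general SELECT lists, joins, or quoted identifiers:

```python
import re

def split_ctas(stmt: str):
    """Rewrite CREATE TABLE ... AS SELECT * into CREATE LIKE + INSERT SELECT.
    Only handles the simple 'create table X as select * from Y' form."""
    m = re.match(
        r"create\s+table\s+(\w+)\s+as\s+select\s+\*\s+from\s+(\w+)\s*;?\s*$",
        stmt.strip(), re.IGNORECASE)
    if not m:
        raise ValueError("not a simple CREATE TABLE ... AS SELECT * statement")
    dst, src = m.groups()
    return [f"create table {dst} like {src};",
            f"insert into {dst} select * from {src};"]

for s in split_ctas("create table t2 as select * from t1;"):
    print(s)
```

Each resulting statement runs as its own GTID-safe transaction.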
This concludes the walkthrough of deploying MySQL Group Replication. Pairing the theory above with hands-on practice is the best way to absorb it, so try the steps in a test environment first.