MHA for MySQL: configuration and switchover methods

This article walks through configuring MHA for MySQL and the ways to switch the master: automatic failover, manual failover, and online switchover.

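Master node / MHA manager node: 172.31.217.183
Slave node / MHA member node: 172.31.217.182
Semi-synchronous replication is already enabled.

The database version is 5.7.

Configure passwordless SSH login
On the master node:
root@bd-dev-mingshuo-183:/opt/soft# ssh-keygen -t rsa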

Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
36:39:6b:1e:40:f2:85:31:db:d0:3e:ab:05:0e:fd:37 root@bd-dev-mingshuo-183
The key's randomart image is:
+--[ RSA 2048]----+
|  +.  |
|  B.  |
|  ..+.o  |
|  .+o.o.  |
|  oooSo  |
|  .o++E  |
|  o+. .  |
|  .o .  |
|  .  |
+-----------------+
root@bd-dev-mingshuo-183:/opt/soft# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.217.182
root@172.31.217.182's password:

Now try logging into the machine, with "ssh 'root@172.31.217.182'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

root@bd-dev-mingshuo-183:/u01# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.217.183
root@172.31.217.183's password:

Now try logging into the machine, with "ssh 'root@172.31.217.183'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

On the slave node:
ssh-keygen -t rsa

ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.217.183
ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.31.217.182
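
Before moving on, it can help to confirm that key-based login works from each node to every node without a password prompt. The following is a minimal sketch, assuming the root account and the two addresses used in this article. The function name check_passwordless_ssh is made up for illustration and is not part of MHA. BatchMode=yes makes ssh fail instead of prompting when key authentication is not in place.

```shell
#!/bin/sh
# Hypothetical helper (not part of MHA): report whether each host accepts
# key-based root login from the machine this runs on.
check_passwordless_ssh() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts, so a missing key is an error.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true </dev/null; then
            echo "OK   ${host}"
        else
            echo "FAIL ${host}"
            failed=1
        fi
    done
    return "$failed"
}

# Usage (run on every node):
#   check_passwordless_ssh 172.31.217.183 172.31.217.182
```

The function returns non-zero if any host fails, so it can also gate a later step in a setup script.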

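On the slave node, enable read-only mode:
mysql> set global read_only=1;
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like 'read_only'\G
*************************** 1. row ***************************
Variable_name: read_only
        Value: ON
1 row in set (0.00 sec)

read_only=1 means read-only; 0 means read-write. Making the slave read-only does not stop it from applying replicated logs. Do not write this parameter into the configuration file, though: if this slave is later promoted to master, ordinary users would be unable to write. The parameter is optional when configuring MHA.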

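Deploy the packages
Install the manager package on the manager node.
Install the node package on all nodes.
Install the node package first:
rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm

Then install the manager package on the manager node:
yum install mha4mysql-manager-0.58-0.el7.centos.noarch.rpm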

Create the MHA admin account on the master
grant all privileges on *.* to 'mha'@'172.31.217.%' identified by 'oracle';
flush privileges;

Create a directory for the MHA configuration file and the MHA logs
mkdir -p /u01/mha/log
chown -R mysql:mysql /u01/mha

Edit the configuration file
vi /u01/mha/mha.cnf

[server default]
manager_log=/u01/mha/log/manager.log
manager_workdir=/u01/mha/log

master_binlog_dir=/u01/mysql/3306/data
user=mha
password=oracle
ping_interval=2  
repl_user=repl_user
repl_password=oracle
ssh_user=root

[server1]
hostname=172.31.217.183
port=3306

[server2]
hostname=172.31.217.182
port=3306

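Optional configuration parameters:
In the [server default] section:
ping_interval=1  // interval, in seconds, between the pings sent to monitor the master; the default is 3 seconds, and failover starts automatically after three attempts get no response
remote_workdir=/tmp  // where the remote MySQL hosts save binlogs when a switch happens
report_script=/usr/local/send_report  // script that sends an alert after a switch
shutdown_script=   // script that shuts down the failed host after a failure (its main purpose is to power off the host to prevent split-brain; not used here)
In the slave sections:
candidate_master=1  // marks the host as a candidate master; with this set, the slave is promoted to master on failover even if it is not the slave with the most recent events in the cluster
check_repl_delay=0  // by default, MHA will not pick a slave that is more than 100 MB of relay logs behind the master, because recovering that slave would take a long time; with check_repl_delay=0, MHA ignores replication delay when choosing the new master. This is very useful on a host with candidate_master=1, because it guarantees that the candidate becomes the new master during the switch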
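
To show where the optional parameters described above would sit relative to the required ones, here is a minimal sketch that writes a complete configuration file. All values are the ones from this article. Putting candidate_master and check_repl_delay on [server2] is only an illustration, and the function name write_mha_cnf is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write an MHA config like the one in this article, with
# the optional candidate_master / check_repl_delay settings added to [server2].
write_mha_cnf() {
    cnf_path=$1
    cat > "$cnf_path" <<'EOF'
[server default]
manager_log=/u01/mha/log/manager.log
manager_workdir=/u01/mha/log
master_binlog_dir=/u01/mysql/3306/data
user=mha
password=oracle
ping_interval=2
repl_user=repl_user
repl_password=oracle
ssh_user=root
remote_workdir=/tmp

[server1]
hostname=172.31.217.183
port=3306

[server2]
hostname=172.31.217.182
port=3306
candidate_master=1
check_repl_delay=0
EOF
}

# Usage: write to a scratch file first and review it before replacing
# /u01/mha/mha.cnf.
#   write_mha_cnf /tmp/mha.cnf.new
```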

Check SSH login and replication
masterha_check_ssh --conf=/u01/mha/mha.cnf
masterha_check_repl --conf=/u01/mha/mha.cnf

These checks failed several times along the way. Some of the fixes:
ln -s /opt/mysql-5.7.23/bin/mysql /usr/bin/mysql
ln -s /opt/mysql-5.7.23/bin/mysqlbinlog /usr/bin/mysqlbinlog
Uninstall mha4mysql-manager-0.58-0.el7.centos.noarch.rpm and install mha4mysql-manager-0.56-0.el6.noarch.rpm instead.

Start MHA
nohup masterha_manager --conf=/u01/mha/mha.cnf > /u01/mha/log/manager.log 2>&1 &

Check the MHA status
root@bd-dev-mingshuo-183:/opt/soft# masterha_check_status --conf=/u01/mha/mha.cnf

mha (pid:24910) is running(0:PING_OK), master:172.31.217.183
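
If the status needs to be consumed by a script, the line above can be picked apart with sed. This is a minimal sketch that assumes the output keeps exactly the format shown in this article; parse_mha_status is a hypothetical name, and the function prints nothing when the line does not match.

```shell
#!/bin/sh
# Hypothetical helper: pull the state and current master out of one line of
# masterha_check_status output, in the format shown above.
parse_mha_status() {
    printf '%s\n' "$1" \
        | sed -n 's/^.*is running(\([0-9]*\):\([A-Z_]*\)), master:\(.*\)$/state=\2 master=\3/p'
}

# Usage:
#   parse_mha_status "$(masterha_check_status --conf=/u01/mha/mha.cnf)"
```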

Configure the VIP
Add the following under the [server default] section:
master_ip_failover_script=/usr/local/bin/master_ip_failover

Copy master_ip_failover from the source package to /usr/local/bin/:
cd /opt/soft/MHAsoft/mha4mysql-manager-0.56/samples/scripts
cp -ra master_ip_failover /usr/local/bin/master_ip_failover

Edit /usr/local/bin/master_ip_failover and add these lines:
my $vip = '172.31.217.203/24';  # the virtual IP you want to use
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig eth3:$key $vip";  # change eth3 to your NIC name
my $ssh_stop_vip = "/sbin/ifconfig eth3:$key down";
Note: the lines above go at the position marked below, between the variable declaration and the GetOptions call:
my (
  $command,  $ssh_user,  $orig_master_host, $orig_master_ip,
  $orig_master_port, $new_master_host, $new_master_ip,  $new_master_port
);

# add the lines above here

GetOptions(
  'command=s'          => \$command,
  'ssh_user=s'         => \$ssh_user,
  'orig_master_host=s' => \$orig_master_host,
  'orig_master_ip=s'   => \$orig_master_ip,
  'orig_master_port=i' => \$orig_master_port,
  'new_master_host=s'  => \$new_master_host,
  'new_master_ip=s'    => \$new_master_ip,
  'new_master_port=i'  => \$new_master_port,
);

Configure the VIP on the NIC
ifconfig eth3:1 172.31.217.203/24

ifconfig

eth3  Link encap:Ethernet  HWaddr 54:0F:5D:2C:4D:77  
  inet addr:172.31.217.202  Bcast:172.31.217.255  Mask:255.255.255.0
  inet6 addr: fe80::560f:5dff:fe2c:4d77/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:74742667 errors:0 dropped:0 overruns:0 frame:0
  TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000

  RX bytes:52680755472 (49.0 GiB)  TX bytes:740 (740.0 b)

eth3:1  Link encap:Ethernet  HWaddr 54:0F:5D:2C:4D:77  
  inet addr:172.31.217.203  Bcast:172.31.217.255  Mask:255.255.255.0
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

Stop MHA
masterha_stop --conf=/u01/mha/mha.cnf

Start MHA again
nohup masterha_manager --conf=/u01/mha/mha.cnf > /u01/mha/log/manager.log 2>&1 &

Error:
Bareword "FIXME_xxx" not allowed while "strict subs" in use at /usr/local/bin/master_ip_failover line 98.
Execution of /usr/local/bin/master_ip_failover aborted due to compilation errors.
Mon Sep 17 10:56:04 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln226]  Failed to get master_ip_failover_script status with return code 255:0.
Mon Sep 17 10:56:04 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/bin/masterha_manager line 50
Mon Sep 17 10:56:04 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Mon Sep 17 10:56:04 2018 - [info] Got exit code 1 (Not master dead).

The simplest fix is to comment out the lines that reference FIXME_xxx.
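
One way to do that without opening an editor is a small sed pass. This is a minimal sketch: comment_out_fixme is a hypothetical helper name, and it keeps a .bak copy so the change can be undone. Commenting the lines out clears the compile error, but it also skips whatever those placeholders stood for, so the script is still worth reading through afterwards.

```shell
#!/bin/sh
# Hypothetical helper: comment out every not-yet-commented line that mentions
# FIXME_xxx, keeping a .bak copy of the original script.
comment_out_fixme() {
    script=$1
    cp "$script" "$script.bak" || return 1
    # Skip lines that are already comments; prefix the rest with "# ".
    sed '/^[[:space:]]*#/!s/^\(.*FIXME_xxx\)/# \1/' "$script.bak" > "$script"
}

# Usage:
#   comment_out_fixme /usr/local/bin/master_ip_failover
#   perl -c /usr/local/bin/master_ip_failover   # syntax check only
```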

Start MHA again
nohup masterha_manager --conf=/u01/mha/mha.cnf > /u01/mha/log/manager.log 2>&1 &
OK!

Shut down the master
mysqladmin -uroot -poracle shutdown

Check the slave
mysql> show slave status;
Empty set (0.00 sec)

mysql> show master status\G
*************************** 1. row ***************************
  File: slave-relay-bin.000002
  Position: 154
  Binlog_Do_DB:

 Binlog_Ignore_DB:

Executed_Gtid_Set:

1 row in set (0.00 sec)
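The slave has automatically been promoted to master. The MHA software on the stopped master also stopped automatically.
Restore the previous master-slave relationship:
Bring the stopped master back up. It does not rejoin the cluster on its own.
Check the log position on the master:
mysql> show master status\G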
*************************** 1. row ***************************
  File: master-bin.000005
  Position: 154
  Binlog_Do_DB:

 Binlog_Ignore_DB:

Executed_Gtid_Set:

1 row in set (0.00 sec)
On the slave:
change master to
master_host='bd-dev-mingshuo-183',
master_port=3306,
master_user='repl_user',
master_password='oracle',
master_log_file='master-bin.000005',
master_log_pos=154;

start slave;
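
Copying the file name and position by hand is easy to get wrong. As a sketch only, the statement can be generated from the show master status\G output. build_change_master is a hypothetical helper, the host and credentials are the ones used in this article, and the password ends up in the printed statement, so treat the output accordingly.

```shell
#!/bin/sh
# Hypothetical helper: read "show master status\G" output on stdin and print
# the matching CHANGE MASTER TO statement for the given host and credentials.
build_change_master() {
    host=$1 port=$2 user=$3 password=$4
    awk -v q="'" -v host="$host" -v port="$port" -v user="$user" -v pw="$password" '
        $1 == "File:"     { file = $2 }
        $1 == "Position:" { pos = $2 }
        END {
            # Refuse to print a statement if either field is missing.
            if (file == "" || pos == "") exit 1
            printf "change master to master_host=%s, master_port=%s, ", q host q, port
            printf "master_user=%s, master_password=%s, ", q user q, q pw q
            printf "master_log_file=%s, master_log_pos=%s;\n", q file q, pos
        }'
}

# Usage (run where the mysql client can reach the master):
#   mysql -uroot -p -h 172.31.217.183 -e 'show master status\G' \
#     | build_change_master bd-dev-mingshuo-183 3306 repl_user oracle
```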
Start the MHA software on the master. Note that the --ignore_last_failover option is needed here; otherwise it fails with an error:
Mon Sep 17 14:45:56 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Sep 17 14:45:56 2018 - [info] Reading application default configuration from /u01/mha/mha.cnf..
Mon Sep 17 14:45:56 2018 - [info] Reading server configuration from /u01/mha/mha.cnf..
Mon Sep 17 14:45:56 2018 - [info] MHA::MasterMonitor version 0.56.
Mon Sep 17 14:45:56 2018 - [error][/usr/share/perl5/vendor_perl/MHA/ServerManager.pm, ln193] There is no alive slave. We can't do failover
Mon Sep 17 14:45:56 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln424] Error happened on checking configurations.  at /usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm line 326
Mon Sep 17 14:45:56 2018 - [error][/usr/share/perl5/vendor_perl/MHA/MasterMonitor.pm, ln523] Error happened on monitoring servers.
Mon Sep 17 14:45:56 2018 - [info] Got exit code 1 (Not master dead).

Start the MHA software:
nohup masterha_manager --ignore_last_failover --conf=/u01/mha/mha.cnf > /u01/mha/log/manager.log 2>&1 &

The above is the automatic failover process. Next, test a manual failover.
Stop the MHA manager:
masterha_stop --conf=/u01/mha/mha.cnf

Stop the master database:
mysqladmin -uroot -poracle shutdown

Switch manually:
masterha_master_switch --master_state=dead --conf=/u01/mha/mha.cnf --dead_master_host=172.31.217.183 --dead_master_port=3306 --new_master_host=172.31.217.182 --new_master_port=3306 --ignore_last_failover
The above is the manual failover process. Next, test an online switchover.
On the manager node:
Stop the MHA manager:
masterha_stop --conf=/u01/mha/mha.cnf

masterha_master_switch --conf=/u01/mha/mha.cnf --master_state=alive --new_master_host=172.31.217.182 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=100
Mon Sep 17 15:47:29 2018 - [info] MHA::MasterRotate version 0.56.
Mon Sep 17 15:47:29 2018 - [info] Starting online master switch..
Mon Sep 17 15:47:29 2018 - [info]
Mon Sep 17 15:47:29 2018 - [info] * Phase 1: Configuration Check Phase..
Mon Sep 17 15:47:29 2018 - [info]
Mon Sep 17 15:47:29 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Sep 17 15:47:29 2018 - [info] Reading application default configuration from /u01/mha/mha.cnf..
Mon Sep 17 15:47:29 2018 - [info] Reading server configuration from /u01/mha/mha.cnf..
Mon Sep 17 15:47:29 2018 - [info] GTID failover mode = 0
Mon Sep 17 15:47:29 2018 - [info] Current Alive Master: 172.31.217.183(172.31.217.183:3306)
Mon Sep 17 15:47:29 2018 - [info] Alive Slaves:
Mon Sep 17 15:47:29 2018 - [info]  172.31.217.182(172.31.217.182:3306)  Version=5.7.23-log (oldest major version between slaves) log-bin:enabled
Mon Sep 17 15:47:29 2018 - [info]  Replicating from bd-dev-mingshuo-183(172.31.217.183:3306)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 172.31.217.183(172.31.217.183:3306)? (YES/no): YES
Mon Sep 17 15:47:33 2018 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Mon Sep 17 15:47:33 2018 - [info]  ok.
Mon Sep 17 15:47:33 2018 - [info] Checking MHA is not monitoring or doing failover..
Mon Sep 17 15:47:33 2018 - [info] Checking replication health on 172.31.217.182..
Mon Sep 17 15:47:33 2018 - [info]  ok.
Mon Sep 17 15:47:33 2018 - [info] 172.31.217.182 can be new master.
Mon Sep 17 15:47:33 2018 - [info]

From:
172.31.217.183(172.31.217.183:3306) (current master)
 +--172.31.217.182(172.31.217.182:3306)

To:
172.31.217.182(172.31.217.182:3306) (new master)
 +--172.31.217.183(172.31.217.183:3306)

Starting master switch from 172.31.217.183(172.31.217.183:3306) to 172.31.217.182(172.31.217.182:3306)? (yes/NO): yes
Mon Sep 17 15:47:55 2018 - [info] Checking whether 172.31.217.182(172.31.217.182:3306) is ok for the new master..
Mon Sep 17 15:47:55 2018 - [info]  ok.
Mon Sep 17 15:47:55 2018 - [info] 172.31.217.183(172.31.217.183:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Mon Sep 17 15:47:55 2018 - [info] 172.31.217.183(172.31.217.183:3306): Resetting slave pointing to the dummy host.
Mon Sep 17 15:47:55 2018 - [info] ** Phase 1: Configuration Check Phase completed.
Mon Sep 17 15:47:55 2018 - [info]
Mon Sep 17 15:47:55 2018 - [info] * Phase 2: Rejecting updates Phase..
Mon Sep 17 15:47:55 2018 - [info]

master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes
Mon Sep 17 15:48:32 2018 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Mon Sep 17 15:48:32 2018 - [info] Executing FLUSH TABLES WITH READ LOCK..
Mon Sep 17 15:48:32 2018 - [info]  ok.
Mon Sep 17 15:48:32 2018 - [info] Orig master binlog:pos is master-bin.000007:154.
Mon Sep 17 15:48:32 2018 - [info]  Waiting to execute all relay logs on 172.31.217.182(172.31.217.182:3306)..
Mon Sep 17 15:48:32 2018 - [info]  master_pos_wait(master-bin.000007:154) completed on 172.31.217.182(172.31.217.182:3306). Executed 0 events.
Mon Sep 17 15:48:32 2018 - [info]  done.
Mon Sep 17 15:48:32 2018 - [info] Getting new master's binlog name and position..
Mon Sep 17 15:48:32 2018 - [info]  slave-relay-bin.000002:154
Mon Sep 17 15:48:32 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.31.217.182', MASTER_PORT=3306, MASTER_LOG_FILE='slave-relay-bin.000002', MASTER_LOG_POS=154, MASTER_USER='repl_user', MASTER_PASSWORD='xxx';
Mon Sep 17 15:48:32 2018 - [info] Setting read_only=0 on 172.31.217.182(172.31.217.182:3306)..
Mon Sep 17 15:48:32 2018 - [info]  ok.
Mon Sep 17 15:48:32 2018 - [info]
Mon Sep 17 15:48:32 2018 - [info] * Switching slaves in parallel..
Mon Sep 17 15:48:32 2018 - [info]
Mon Sep 17 15:48:32 2018 - [info] Unlocking all tables on the orig master:
Mon Sep 17 15:48:32 2018 - [info] Executing UNLOCK TABLES..
Mon Sep 17 15:48:32 2018 - [info]  ok.
Mon Sep 17 15:48:32 2018 - [info] Starting orig master as a new slave..
Mon Sep 17 15:48:32 2018 - [info]  Resetting slave 172.31.217.183(172.31.217.183:3306) and starting replication from the new master 172.31.217.182(172.31.217.182:3306)..
Mon Sep 17 15:48:32 2018 - [info]  Executed CHANGE MASTER.
Mon Sep 17 15:48:32 2018 - [info]  Slave started.
Mon Sep 17 15:48:32 2018 - [info] All new slave servers switched successfully.
Mon Sep 17 15:48:32 2018 - [info]
Mon Sep 17 15:48:32 2018 - [info] * Phase 5: New master cleanup phase..
Mon Sep 17 15:48:32 2018 - [info]
Mon Sep 17 15:48:32 2018 - [info]  172.31.217.182: Resetting slave info succeeded.
Mon Sep 17 15:48:32 2018 - [info] Switching master to 172.31.217.182(172.31.217.182:3306) completed successfully.

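Note that at one point during the switchover you are asked:
master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes
In other words: writes on the current master have not been disabled, so applications connected to it will keep writing to it after the switch. Is that OK?
I was only testing whether the online switchover itself works, so I answered yes.
After the switchover completes, the MHA manager is left stopped.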

Copyright notice: original article from this site, published by 丸趣 on 2023-07-26, 12535 characters in total.