MySQL高可用架构中MHA的本质以及如何部署
发表于:2025-11-08 作者:千家信息网编辑
千家信息网最后更新 2025年11月08日,这篇文章将为大家详细讲解有关MySQL高可用架构中MHA的本质以及如何部署,文章内容质量较高,因此小编分享给大家做个参考,希望大家阅读完这篇文章后对相关知识有一定的了解。MySQL高可用架构之MHA1
千家信息网最后更新 2025年11月08日MySQL高可用架构中MHA的本质以及如何部署
这篇文章将为大家详细讲解有关MySQL高可用架构中MHA的本质以及如何部署,文章内容质量较高,因此小编分享给大家做个参考,希望大家阅读完这篇文章后对相关知识有一定的了解。
MySQL高可用架构之MHA
1、关于MHA
MHA(Master HA)是一款开源的MySQL的高可用程序,它为MySQL主从复制架构提供了automating master failover功能。MHA在监控到master节点故障时,会提升其中拥有的最新数据的slave节点成为新的master节点,在此期间,MHA会通过其它从节点获取额外信息来避免一致性方面的问题。MHA还提供了master节点的在线切换功能,即按需切换master/slave节点。
MHA服务有两种角色,MHA Manager(管理节点)和MHA Node(数据节点):
MHA Manager:通常单独部署在一台独立机器上管理多个master/slave集群,每个master/slave集群称为一个application;MHA node:运行在每台MySQL服务器上,它通过监控具备解析和清理log功能的脚本来加快故障转移
2、MHA组件说明
Manager节点:
-masterha_check_ssh:MHA依赖的SSH环境检测工具;-masterha_check_repl:MySQL复制环境检测工具;-masterha_manager:MHA服务主程序;-masterha_check_status:MHA运行状态探测工具;-masterha_master_monitor:MySQL master节点可用性检测工具;-masterha_master_switch:master节点切换工具;-masterha_conf_host:添加或删除配置的节点;-masterha_stop:关闭MHA服务的工具;
Node节点:
-save_binary_logs:保存和复制master的二进制日志;-apply_diff_relay_logs:识别差异的中继日志事件并用于其他slave;-fiter_mysqlbinlog:去除不必要的ROLLBACK事件(MHA已不再使用这个工具);-purge_relay_logs:清除中继日志(不会阻塞SQL线程);
自定义扩展:
-secondary_check_script:通过多条网络路由检测master的可用性;-master_ip_failover_script:更新appliction使用的masterip;-shutdown_script:强制关闭master节点;-report_script:发送报告;-init_conf_load_script:加载初始配置参数;-master_ip_online_change_script:更新master节点ip地址
3、部署及测试
实验拓扑:
node1:192.168.150.137 MHA managernode2:192.168.150.138 MHA node mariadb masternode3:192.168.150.139 MHA node mariadb slave candidatenode4:192.168.150.140 MHA node mariadb slave
配置过程:
1、修改每台服务器的hosts文件/etc/hosts127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4::1 localhost localhost.localdomain localhost6 localhost6.localdomain6192.168.150.137 node1.com node1192.168.150.138 node2.com node2192.168.150.139 node3.com node3192.168.150.140 node4.com node42、node2-node4进行mariadb的yum安装yum -y install mariadb-server3、配置ssh互信通信环境[root@node1 ~]# ssh-keygen -t rsa -P ''Generating public/private rsa key pair.Enter file in which to save the key (/root/.ssh/id_rsa): Created directory '/root/.ssh'.Your identification has been saved in /root/.ssh/id_rsa.Your public key has been saved in /root/.ssh/id_rsa.pub.The key fingerprint is:a2:f2:10:28:cd:ea:7b:d8:f4:95:15:6e:73:a6:9d:4e root@node1.comThe key's randomart p_w_picpath is:+--[ RSA 2048]----+| || . || . . || + = o ||o + . S * . ||.. o . + . E ||. * o . o ||.. * . . || oo . |+-----------------+[root@node1 ~]# ls .ssh/id_rsaid_rsa id_rsa.pub [root@node1 ~]# cat .ssh/id_rsa.pub > .ssh/authorized_keys[root@node1 ~]# ssh node1The authenticity of host 'node1 (192.168.150.137)' can't be established.ECDSA key fingerprint is 2a:e3:03:52:8c:84:02:59:a2:26:a3:b2:f6:74:6c:3c.Are you sure you want to continue connecting (yes/no)? yesWarning: Permanently added 'node1,192.168.150.137' (ECDSA) to the list of known hosts.Last login: Wed Mar 29 15:06:52 2017 from 192.168.150.1[root@node1 ~]# ll .ssh/authorized_keys -rw-r--r-- 1 root root 396 3月 29 15:35 .ssh/authorized_keys[root@node1 ~]# chmod go= .ssh/authorized_keys [root@node1 ~]# scp -p .ssh/id_rsa .ssh/authorized_keys node2:/root/.sshroot@node2's password: id_rsa 100% 1675 1.6KB/s 00:00 authorized_keys 100% 396 0.4KB/s 00:00 [root@node1 ~]# scp -p .ssh/id_rsa .ssh/authorized_keys node3:/root/.sshThe authenticity of host 'node3 (192.168.150.139)' can't be established.ECDSA key fingerprint is 2a:e3:03:52:8c:84:02:59:a2:26:a3:b2:f6:74:6c:3c.Are you sure you want to continue connecting (yes/no)? yesWarning: Permanently added 'node3,192.168.150.139' (ECDSA) to the list of known hosts.root@node3's password: id_rsa 100% 1675 1.6KB/s 00:00 authorized_keys 100% 396 0.4KB/s 00:00 [root@node1 ~]# scp -p .ssh/id_rsa .ssh/authorized_keys node4:/root/.sshroot@node4's password: Permission denied, please try again.root@node4's password: id_rsa 100% 1675 1.6KB/s 00:00 authorized_keys 100% 396 0.4KB/s 00:00 [root@node1 ~]# ssh node2Last login: Wed Mar 29 15:07:05 2017 from 192.168.150.1[root@node2 ~]# exit登出Connection to node2 closed.[root@node1 ~]# ssh node3Last login: Wed Mar 29 15:07:18 2017 from 192.168.150.1[root@node3 ~]# exit登出Connection to node3 closed.[root@node1 ~]# ssh node3Last login: Wed Mar 29 15:40:05 2017 from node1.com[root@node3 ~]# exit登出Connection to node3 closed.[root@node1 ~]# ssh node4Last failed login: Wed Mar 29 15:39:53 CST 2017 from node1.com on ssh:nottyThere was 1 failed login attempt since the last successful login.Last login: Wed Mar 29 15:39:30 2017 from node1.com[root@node4 ~]# exit登出Connection to node4 closed.[root@node1 ~]# ssh node4Last login: Wed Mar 29 15:40:13 2017 from node1.com[root@node4 ~]# exit登出Connection to node4 closed.4、修改mysql参数master:[mysqld]innodb_file_per_table = 1skip_name_resolve = 1log-bin = master-binrelay-log = relay-binserver_id = 1slave:[mysqld]innode_file_per_table = 1skip_name_resolve = 1log-bin = master-binrelay-log = relay-binserver_id = 2read_only = 1relay_log_purge = 05、主库开启并创建授权账号master:MariaDB [(none)]> SHOW MASTER STATUS;+-------------------+----------+--------------+------------------+| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |+-------------------+----------+--------------+------------------+| master-bin.000003 | 245 | | |+-------------------+----------+--------------+------------------+1 row in set (0.00 sec)MariaDB [(none)]> GRANT REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'repluser'@'192.168.%.%' IDENTIFIED BY 'replpass';MariaDB [(none)]> GRANT ALL ON *.* TO 'mhauser'@'192.168.%.%' IDENTIFIED BY 'mhapass'; #此为mha的管理账号MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'read_only'; #此时主库是可写可读+---------------+-------+| Variable_name | Value |+---------------+-------+| read_only | OFF |+---------------+-------+1 row in set (0.00 sec)6、从库进行主从功能开启 node3、node4操作相同[root@node3 ~]# mysqlWelcome to the MariaDB monitor. Commands end with ; or \g.Your MariaDB connection id is 2Server version: 5.5.52-MariaDB MariaDB ServerCopyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='192.168.150.138',MASTER_USER='repluser',MSTER_PASSWORD='replpass',MASTER_LOG_FILE='master-bin.000003',MASTER_LOG_POS=245;Query OK, 0 rows affected (0.01 sec)MariaDB [(none)]> SHOW SLAVE STATUS\G*************************** 1. row *************************** Slave_IO_State: Master_Host: 192.168.150.138 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 245 Relay_Log_File: relay-bin.000001 Relay_Log_Pos: 4 Relay_Master_Log_File: master-bin.000003 Slave_IO_Running: No Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 245 Relay_Log_Space: 245 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULLMaster_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 01 row in set (0.00 sec)MariaDB [(none)]> START SLAVE;Query OK, 0 rows affected (0.00 sec)MariaDB [(none)]> SHOW SLAVE STATUS\G*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.150.138 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 497 Relay_Log_File: relay-bin.000002 Relay_Log_Pos: 782 Relay_Master_Log_File: master-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 497 Relay_Log_Space: 1070 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 11 row in set (0.00 sec)MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'read_only'; #此时从库是只读模式+---------------+-------+| Variable_name | Value |+---------------+-------+| read_only | ON |+---------------+-------+1 row in set (0.00 sec)7、此时一主两从架构已经配置完成,安装MHA包manager节点(node1):安装mha4mysql-manager-0.56-0.el6.noarch.rpm和mha4mysql-node-0.56-0.el6.noarch.rpmyum install mha4mysql* -ynode节点(node2-node4):安装mha4mysql-node-0.56-0.el6.noarch.rpmyum -y iinstall mha4mysql-node-0.56-0.el6.noarch.rpm8、初始化MHA创建配置目录及配置文件(在node1上执行)[root@node1 ~]# mkdir /etc/masterha[root@node1 ~]# vim /etc/masterha/app1.cnf [server default]user=mhauserpassword=mhapassmanager_workdir=/data/masterha/app1master_log=/data/masterha/app1/manager.logremote_workdir=/data/masterha/app1ssh_user=rootrepl_user=repluserrepl_password=replpassping_interval=1[server1]hostname=192.168.150.138candidate_master=1[server2]hostname=192.168.150.139candidate_master=1[server3]hostname=192.168.150.1409、启动前检测ssh互信配置是否OK[root@node1 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf......Warning: Permanently added '192.168.150.139' (ECDSA) to the list of known hosts.Wed Mar 29 17:03:03 2017 - [debug] ok.Wed Mar 29 17:03:03 2017 - [info] All SSH connection tests passed successfully.mysql复制集群的连接配置是否OK[root@node1 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf......Wed Mar 29 17:04:40 2017 - [info] 192.168.150.138(192.168.150.138:3306) (current master) +--192.168.150.139(192.168.150.139:3306) +--192.168.150.140(192.168.150.140:3306)Wed Mar 29 17:04:40 2017 - [info] Checking replication health on 192.168.150.139..Wed Mar 29 17:04:40 2017 - [info] ok.Wed Mar 29 17:04:40 2017 - [info] Checking replication health on 192.168.150.140..Wed Mar 29 17:04:40 2017 - [info] ok.Wed Mar 29 17:04:40 2017 - [warning] master_ip_failover_script is not defined.Wed Mar 29 17:04:40 2017 - [warning] shutdown_script is not defined.Wed Mar 29 17:04:40 2017 - [info] Got exit code 0 (Not master dead).MySQL Replication Health is OK.10、启动MHA[root@node1 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /data/masterha/app1/manager.log 2>&1 &[1] 16989[root@node1 ~]# tail -f /data/masterha/app1/manager.log 192.168.150.138(192.168.150.138:3306) (current master) +--192.168.150.139(192.168.150.139:3306) +--192.168.150.140(192.168.150.140:3306)Wed Mar 29 21:51:58 2017 - [warning] master_ip_failover_script is not defined.Wed Mar 29 21:51:58 2017 - [warning] shutdown_script is not defined.Wed Mar 29 21:51:58 2017 - [info] Set master ping interval 1 seconds.Wed Mar 29 21:51:58 2017 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.Wed Mar 29 21:51:58 2017 - [info] Starting ping health check on 192.168.150.138(192.168.150.138:3306)..Wed Mar 29 21:51:58 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..11、启动后查看master节点状态[root@node1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnfapp1 (pid:16623) is running(0:PING_OK), master:192.168.150.13812、进行故障转移测试(1)maser节点关闭mariadb[root@node2 ~]# killall mysqld mysqld_safe(2)此时在manager上可以看到转移的日志----- Failover Report -----app1: MySQL Master failover 192.168.150.138(192.168.150.138:3306) to 192.168.150.139(192.168.150.139:3306) succeededMaster 192.168.150.138(192.168.150.138:3306) is down!Check MHA Manager logs at node1.com:/data/masterha/app1/manager.log for details.Started automated(non-interactive) failover.The latest slave 192.168.150.139(192.168.150.139:3306) has all relay logs for recovery.Selected 192.168.150.139(192.168.150.139:3306) as a new master.192.168.150.139(192.168.150.139:3306): OK: Applying all logs succeeded.192.168.150.140(192.168.150.140:3306): This host has the latest relay log events.Generating relay diff files from the latest slave succeeded.192.168.150.140(192.168.150.140:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.150.139(192.168.150.139:3306)192.168.150.139(192.168.150.139:3306): Resetting slave info succeeded.Master failover to 192.168.150.139(192.168.150.139:3306) completed successfully.故障转移后,manager会自动停止,此时查看master状态[root@node1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnfapp1 is stopped(2:NOT_RUNNING).(3)查看其它两个库状态node3 已成功接管master,并可读写MariaDB [(none)]> SHOW MASTER STATUS;+-------------------+----------+--------------+------------------+| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |+-------------------+----------+--------------+------------------+| master-bin.000003 | 245 | | |+-------------------+----------+--------------+------------------+1 row in set (0.00 sec)MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'read_only';+---------------+-------+| Variable_name | Value |+---------------+-------+| read_only | OFF |+---------------+-------+1 row in set (0.00 sec)node4MariaDB [(none)]> SHOW SLAVE STATUS\G*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.150.139 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 245 Relay_Log_File: relay-bin.000002 Relay_Log_Pos: 530 Relay_Master_Log_File: master-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 245 Relay_Log_Space: 818 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 21 row in set (0.00 sec)MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'read_only';+---------------+-------+| Variable_name | Value |+---------------+-------+| read_only | ON |+---------------+-------+1 row in set (0.00 sec)(4)提供新的从节点已修复复制集群master界定啊故障后,需要重新准备好一个新的MySQL节点。基于来自于master节点的备份恢复后,将其重新配置为mster的从节点即可。新加入节点IP为原master节点IP,否则还得修改appl.cnf中相应的设置,最后再次启动manager,并再次检查状态。[root@node2 ~]# rm -rf /var/lib/mysql/*[root@node2 ~]# vim /etc/my.cnf添加从库两选项read_only = 1relay_log_purge = 0[root@node2 ~]# systemctl start mariadb.service[root@node2 ~]# mysqlWelcome to the MariaDB monitor. Commands end with ; or \g.Your MariaDB connection id is 2Server version: 5.5.52-MariaDB MariaDB ServerCopyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.MariaDB [(none)]> GRANT REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'repluser'@'192.168.%.%' IDENTIFIED BY 'replpass';Query OK, 0 rows affected (0.00 sec)MariaDB [(none)]> GRANT ALL ON *.* TO 'mhauser'@'192.168.%.%' IDENTIFIED BY 'mhapass';Query OK, 0 rows affected (0.00 sec)MariaDB [(none)]> FLUSH PRIVILEGES;Query OK, 0 rows affected (0.00 sec)MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='192.168.150.139',MASTER_USER='repluser',MASTER_PASSWORD='replpass',MASTER_LOG_FILE='master-bin.000003',MASTER_LOG_POS=245;Query OK, 0 rows affected (0.01 sec)MariaDB [(none)]> SHOW SLAVE\GERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '' at line 1MariaDB [(none)]> START SLAVE;Query OK, 0 rows affected (0.01 sec)MariaDB [(none)]> SHOW SLAVE STATUS\G*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.150.139 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 245 Relay_Log_File: relay-bin.000002 Relay_Log_Pos: 530 Relay_Master_Log_File: master-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 245 Relay_Log_Space: 818 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 21 row in set (0.00 sec)再次开启manager查看状态,状态全部OK,主库变更为node3[root@node1 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /data/masterha/app1/manager.log 2>&1 &[root@node1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnfapp1 (pid:17229) is running(0:PING_OK), master:192.168.150.139
补充:
提供额外机制,防止对master的监控做出误判、VIP添加、在进行故障转移时对原有master节点执行STONITH操作避免脑裂,可通过shutdown_scrip实现、必要时,进行在线master节点转换;
关于MySQL高可用架构中MHA的本质以及如何部署就分享到这里了,希望以上内容可以对大家有一定的帮助,可以学到更多知识。如果觉得文章不错,可以把它分享出去让更多的人看到。
节点
配置
工具
状态
故障
架构
服务
检测
功能
日志
集群
再次
环境
切换
监控
管理
本质
主从
事件
内容
数据库的安全要保护哪些东西
数据库安全各自的含义是什么
生产安全数据库录入
数据库的安全性及管理
数据库安全策略包含哪些
海淀数据库安全审计系统
建立农村房屋安全信息数据库
易用的数据库客户端支持安全管理
连接数据库失败ssl安全错误
数据库的锁怎样保障安全
中国企业积极应对网络安全挑战
2003服务器驱动下载官网
酒店数据库表
网络安全公司360排名多少
考研英语作文网络安全
物理服务器没有安全组
电脑辅助翻译软件开发
上海网络技术开发介绍
网络安全行为感悟
我的世界寒假必玩生存服务器推荐
计算机网络技术专业百度贴吧
计算机网络安全hill密码例题
数据库灾备的意义
农信社网络安全教育心得体会
中学生安全教育与网络安全直播
数据库连接出错08001
奇妙海域网络技术
浪潮服务器 硬盘红灯
深圳巨人网络技术有限公司
网络技术考试题
电脑数据库打开出错
我的世界时空之门服务器
在服务器管理器中执行
群晖显示服务器已满
迷你主机当文件服务器
MySQL数据库数据库检测
网络安全能力培训方案
飞卢服务器异常
浙江阿里服务器
软件开发职业的薪资