Environment:
192.168.205.37: MHA manager server
192.168.205.47: master server
192.168.205.57: slave1 server
192.168.205.67: slave2 server
Versions:
OS: CentOS 7 (1810), minimal install
mariadb-server 5.5.60
mha4mysql-manager-0.56-0.el6.noarch.rpm
mha4mysql-node-0.56-0.el6.noarch.rpm
Purpose:
MHA is implemented as a set of Perl scripts. When the master goes down and a slave has not yet received all of the master's events, MHA tries to copy the missing binary log events from the crashed master to the node being promoted, which preserves data integrity; this is why SSH key authentication between all nodes is required. In this setup one server acts as the management node and the other three are database nodes (one master, two slaves); we simulate a master failure and watch a slave get promoted to master.
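To make the idea concrete, here is a minimal manual sketch of what MHA automates during failover, assuming the crashed master (192.168.205.47) is still reachable over SSH and keeps its binary logs under /data/logs/ as configured below; the file name and start position are illustrative only:

scp root@192.168.205.47:/data/logs/bin.000003 /tmp/                 # fetch the newest binlog from the dead master
mysqlbinlog --start-position=245 /tmp/bin.000003 | mysql -uroot     # replay the events the promoted slave never received

MHA performs the equivalent steps itself, which is why every node needs passwordless SSH access to the others.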
Set up the databases
- Install MariaDB on the database nodes with the following script:
[root@master data]#cat maridb_yum.sh
#!/bin/bash
ID=`ip a show dev eth0 | sed -r '3!d;s@(.*inet)(.*)(/.*)@\2@' | cut -d. -f4`
rpm -q mariadb-server || yum install -y mariadb-server
[ -d /data/mysql ] || mkdir -p /data/mysql
[ -d /data/logs ] || mkdir -p /data/logs
chown mysql:mysql /data/{mysql,logs}
sed -i 's@datadir=/var/lib/mysql@datadir=/data/mysql@' /etc/my.cnf
grep "log-bin" /etc/my.cnf || sed -i '/\[mysqld\]/a log-bin=/data/logs/bin' /etc/my.cnf
grep "innodb_file_per_table" /etc/my.cnf || sed -i '/\[mysqld\]/a innodb_file_per_table = on' /etc/my.cnf
grep "skip_name_resolve" /etc/my.cnf || sed -i '/\[mysqld\]/a skip_name_resolve = on' /etc/my.cnf
grep "server-id" /etc/my.cnf || sed -i "/\[mysqld\]/a server-id=$ID" /etc/my.cnf
service mariadb restart
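A quick sanity check after the script runs (my addition, not part of the original script), to confirm the options landed in /etc/my.cnf and binary logging is active:

grep -E 'datadir|log-bin|server-id|innodb_file_per_table|skip_name_resolve' /etc/my.cnf
mysql -e 'SHOW MASTER STATUS;'   # should report bin.000001 once the server has restarted with log-bin enabled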
- Modify the configuration file on the two slave nodes:
[root@slave1 data]#vi /etc/my.cnf
[mysqld]
read_only
relay_log_purge=0    # do not delete relay logs
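After editing my.cnf, restart the service; if you want to double-check (not shown in the original), confirm both options took effect:

systemctl restart mariadb
mysql -e "SHOW VARIABLES WHERE Variable_name IN ('read_only','relay_log_purge');"
# expected: read_only = ON, relay_log_purge = OFF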
- Record the master's current binary log position and create the replication and MHA accounts
MariaDB [(none)]> show master logs;
+------------+-----------+
| Log_name   | File_size |
+------------+-----------+
| bin.000001 |     30373 |
| bin.000002 |   1038814 |
| bin.000003 |       245 |
+------------+-----------+
3 rows in set (0.00 sec)

MariaDB [(none)]> grant replication slave on *.* to repluser@'192.168.205.%' identified by 'centos';
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> grant all on *.* to mhauser@'192.168.205.%' identified by 'centos';
Query OK, 0 rows affected (0.00 sec)
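If you prefer not to read the coordinates off the screen, they can be pulled directly on the master (a small helper of mine, not in the original steps):

mysql -uroot -e 'SHOW MASTER STATUS;'
# File: bin.000003, Position: 245 -> used as MASTER_LOG_FILE / MASTER_LOG_POS in the next step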
- On each slave node, run CHANGE MASTER TO and start the replication threads
MariaDB [(none)]> CHANGE MASTER TO
    -> MASTER_HOST='192.168.205.47',
    -> MASTER_USER='repluser',
    -> MASTER_PASSWORD='centos',
    -> MASTER_PORT=3306,
    -> MASTER_LOG_FILE='bin.000003',
    -> MASTER_LOG_POS=245;
Query OK, 0 rows affected (0.01 sec)

MariaDB [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)

MariaDB [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.205.47
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin.000003
          Read_Master_Log_Pos: 401
               Relay_Log_File: mariadb-relay-bin.000002
                Relay_Log_Pos: 679
        Relay_Master_Log_File: bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
MHA Implementation
- Install the manager and node packages on the MHA server. The manager depends on Perl modules from the EPEL repository, so EPEL must be enabled first.
[root@MHA ~]#yum install mha4mysql-manager-0.56-0.el6.noarch.rpm mha4mysql-node-0.56-0.el6.noarch.rpm
- Install the node package on all SQL nodes
[root@master ~]#yum install mha4mysql-node-0.56-0.el6.noarch.rpm
[root@slave1 data]#yum install mha4mysql-node-0.56-0.el6.noarch.rpm
[root@slave2 data]#yum install mha4mysql-node-0.56-0.el6.noarch.rpm
- Configure SSH key authentication
[root@MHA ~]#ssh-keygen
[root@MHA ~]#ssh-copy-id 192.168.205.37
[root@MHA ~]#cat .ssh/authorized_keys
[root@MHA ~]#scp -r .ssh 192.168.205.47:/root/
[root@MHA ~]#scp -r .ssh 192.168.205.57:/root/
[root@MHA ~]#scp -r .ssh 192.168.205.67:/root/
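Because the same key pair and authorized_keys are copied to every node, the database nodes can also SSH to each other, which MHA needs during failover. A short loop (my own check, not in the original steps) verifies passwordless login from the manager:

for ip in 192.168.205.37 192.168.205.47 192.168.205.57 192.168.205.67; do
  ssh -o BatchMode=yes root@$ip 'hostname -I'   # must print the address without asking for a password
done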
- Create the MHA configuration file
[root@MHA ~]#mkdir /etc/mha
[root@MHA ~]#vim /etc/mha/app1.cnf
[server default]
master_binlog_dir=/data/logs/
user=mhauser
password=centos
manager_workdir=/data/mastermha/app1/
manager_log=/data/mastermha/app1/manager.log
remote_workdir=/data/mastermha/app1/
ssh_user=root
repl_user=repluser
repl_password=centos
ping_interval=1
[server1]
hostname=192.168.205.47
candidate_master=1
[server2]
hostname=192.168.205.57
[server3]
hostname=192.168.205.67
candidate_master=1
- Before starting, verify that SSH connectivity and replication are healthy
[root@MHA ~]#masterha_check_ssh --conf=/etc/mha/app1.cnf
[root@MHA ~]#masterha_check_repl --conf=/etc/mha/app1.cnf
- Start the manager. It runs in the foreground; when the master fails, it exits after the failover completes, so it has to be restarted if you want it to keep monitoring.
[root@MHA ~]#masterha_manager --conf=/etc/mha/app1.cnf
Mon Aug 12 23:33:22 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 12 23:33:22 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Mon Aug 12 23:33:22 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
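In practice you would not leave this in an interactive foreground shell. One common approach (not covered in this article) is to background the manager and poll it:

nohup masterha_manager --conf=/etc/mha/app1.cnf > /data/mastermha/app1/manager_run.log 2>&1 &
masterha_check_status --conf=/etc/mha/app1.cnf   # reports whether the manager is currently monitoring app1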
Test
- Simulate a write workload, then bring the master down to see whether MHA fails over correctly without losing data.
MariaDB [(none)]> use test;
Database changed
MariaDB [test]> create table testlog (id int auto_increment primary key,name char(10),age int default 20);
Query OK, 0 rows affected (0.01 sec)

MariaDB [test]> delimiter $$
MariaDB [test]> create procedure sp_testlog()
    -> begin
    -> declare i int;
    -> set i = 1;
    -> while i <= 100000
    -> do insert into testlog(name,age) values (concat('wang',i),i);
    -> set i = i +1;
    -> end while;
    -> end$$
Query OK, 0 rows affected (0.00 sec)

MariaDB [test]> delimiter ;
MariaDB [test]> call sp_testlog;
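While the procedure is inserting rows, you can watch a slave from another shell (my own addition, not in the original test) to confirm replication keeps up with the write load:

watch -n1 'mysql -e "SELECT COUNT(*) FROM test.testlog;"'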
- While the data is being written, we shut down the 192.168.205.47 server. MHA detects the failure, promotes another node to master, and then the manager process exits.
[root@MHA ~]#masterha_manager --conf=/etc/mha/app1.cnf
Mon Aug 12 23:33:22 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 12 23:33:22 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Mon Aug 12 23:33:22 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
Mon Aug 12 23:45:16 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Aug 12 23:45:16 2019 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Mon Aug 12 23:45:16 2019 - [info] Reading server configuration from /etc/mha/app1.cnf..
[root@MHA ~]#
- Check the manager log to see what happened: slave2 was promoted to master.
[root@MHA data]#cat mastermha/app1/manager.log

----- Failover Report -----

app1: MySQL Master failover 192.168.205.47(192.168.205.47:3306) to 192.168.205.67(192.168.205.67:3306) succeeded

Master 192.168.205.47(192.168.205.47:3306) is down!

Check MHA Manager logs at MHA:/data/mastermha/app1/manager.log for details.

Started automated(non-interactive) failover.
The latest slave 192.168.205.57(192.168.205.57:3306) has all relay logs for recovery.
Selected 192.168.205.67(192.168.205.67:3306) as a new master.
192.168.205.67(192.168.205.67:3306): OK: Applying all logs succeeded.
192.168.205.57(192.168.205.57:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.205.57(192.168.205.57:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.205.67(192.168.205.67:3306)
192.168.205.67(192.168.205.67:3306): Resetting slave info succeeded.
Master failover to 192.168.205.67(192.168.205.67:3306) completed successfully.
- On slave2, show slave status now returns an empty set, because slave2 is the new master.
MariaDB [(none)]> show slave status\G
Empty set (0.00 sec)
- On slave1, replication now points to slave2 (192.168.205.67) as its master
MariaDB [test]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.205.67
                  Master_User: repluser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: bin.000005
          Read_Master_Log_Pos: 245
               Relay_Log_File: mariadb-relay-bin.000002
                Relay_Log_Pos: 523
        Relay_Master_Log_File: bin.000005
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
- On the new master, check whether read_only has been turned off; it has, but don't forget to also remove read_only from its configuration file, otherwise the server will come back read-only after a restart.
MariaDB [(none)]> show variables like 'read_only';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | OFF   |
+---------------+-------+
1 row in set (0.00 sec)
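A minimal way to make that persistent on the new master, assuming read_only was added as a bare option in /etc/my.cnf as on the slaves above:

sed -i '/^read_only/d' /etc/my.cnf   # remove the option so a restart does not bring the server back read-only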