1, MHA overview
● a set of excellent software for failover and master-slave replication in MySQL high availability environment
● during MySQL fault, MHA can automatically complete fault switching within 0-30 seconds
1. Composition of MHA
① MHA Manager (management node)
② MHA node
manager function: with the help of node components, health check and failover functions can be realized
node component (existing in all mysql servers) functions:
1) . regularly report the status of mysql service to the master manager node
2) . monitor local mysql service functions
2, MHA features
● during automatic failover, MHA tries to save binary logs from the down primary server to ensure that data is not lost to the greatest extent, but this is not always feasible. For example, if the primary server hardware fails or cannot be accessed through ssh, MHA cannot save binary logs and can only perform failover and lose the latest data
● using semi synchronous replication can greatly reduce the risk of data loss
● at present, MHA supports a master-slave architecture with at least three servers, i.e. one master and two slaves
● use MySQL 5 5, which can greatly reduce the risk of data loss. MHA can be combined with semi synchronous replication. If only one slave has received the latest binary log, MHA can apply the latest binary log to all other slave servers, so it can ensure the data consistency of all nodes. Sometimes, it can be set that the slave node is slower than the master node, In case of accidental database deletion and data loss, it can be recovered from the node binary log.
1. Automatic failover process
● failover (intrinsic vip drift)
● auto - trigger Auto with script
3, Build MHA
Based on the previous master-slave environment, continue
1. Experimental environment
MHA Manager: 192.168.153.225 manage the node and install the manager component and node component
Three mysql servers:
Main: 192.168.153.200 installing node components
Install node components from: 192.168.153.215
Install node components from: 192.168.153.220
Define node server name hostnamectl set-hostname manager hostnamectl set-hostname master hostnamectl set-hostname slave1 hostnamectl set-hostname slave2
Installation and compilation dependent environment(Three sets MySQL Server) yum -y install ncurses-devel gcc-c++ perl-Module-Install
master,slave1,slave2 Make two soft links respectively⭐⭐⭐ The first is mysql Command, second mysqlbinlog Node recovery [root@master ~]# ln -s /usr/local/mysql/bin/mysql /usr/sbin [root@master ~]# ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin
Start in the background mysql Service( mysql_safe Safe start,Equivalent to one more daemon,mysqld If you hang up, you'll automatically mysqld Process pull up)⭐ /usr/local/mysql/bin/mysqld_safe --user=mysql &
to configure MySQL One master and two slaves⭐⭐ Two users are authorized on all database nodes. One is used synchronously from the library, and the other is manager use mysql> grant replication slave on *.* to 'myslave'@'192.168.153.%' identified by '123'; mysql> grant all privileges on *.* to 'mha'@'192.168.153.%' identified by 'manager';
The following three authorizations need not be added in theory, but the experimental environment passes MHA inspect MySQL Master slave error reporting, Report that two slave databases cannot connect to the master database through the host name, so all databases need to add the following authorization adopt mha When checking, it is monitored in the form of host name, which is easy to report errors⭐⭐ grant all privileges on *.* to 'mha'@'master' identified by 'manager'; grant all privileges on *.* to 'mha'@'slave1' identified by 'manager'; grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';
Enable read-only function from all libraries (not for super administrators) super Effective, ordinary user) mysql> set global read_only=1; At all mysql Refresh permissions on Database mysql> flush privileges; stay master View binaries and synchronization points on mysql> show master status; Synchronize between two node servers mysql> change master to master_host='192.168.153.200',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=5147; Turn on the master-slave function and check whether the status is successful mysql> start slave; mysql> show slave status\G; You can do a read-only function verification on a slave server mysql -umyslave -p123 -h 192.168.153.200 -P 3306 At this time, an error will be directly reported for the written content, and the read-only function will be enabled successfully
Configure MHA
Install on all servers MHA Environment dependent, install first epel Source, only if the source is installed perl There will be install epel Source, and no gph inspect yum install epel-release --nogpgcheck -y yum install -y perl-DBD-MySQL \ perl-Config-Tiny \ perl-Log-Dispatch \ perl-Parallel-ForkManager \ perl-ExtUtils-CBuilder \ perl-ExtUtils-MakeMaker \ perl-CPAN notes: yum install -y perl-DBD-MySQL \ #perl for mysql database perl-Config-Tiny \ #Fetch the value from the configuration file perl-Log-Dispatch \ #Log log perl-Parallel-ForkManager \ #Multithreading management perl-ExtUtils-CBuilder \ #Extension tool CBuilder perl-ExtUtils-MakeMaker \ #Extension tool MakeMaker perl-CPAN #Database in cpan perl
PS: the MHA software package is different for each operating system version. Here, centos7 4. You must select version 0.57. You must first install node components on all servers, and finally install manager components on the MHA manager node, because manager depends on node components.
All machines synchronize with alicloud time server ntpdate ntp.aliyun.com crontab -e */10 * * * * /usr/sbin/ntpdate ntp.aliyun.com
Upload on all machines node assembly [root@master ~]# mount.cifs //192.168.153.1/LAMP-C7 /mnt Password for root@//192.168.153.1/LAMP-C7: mha4mysql-manager-0.57.tar.gz Upload to/opt Unzip under tar zxvf mha4mysql-node-0.57.tar.gz -C /root cd /root cd mha4mysql-node-0.57 perl Makefile.PL make && make install
Install the manager component on the MHA node
(node must be installed before installing the manager component)
stay manager Machine upload manger assembly take mha4mysql-manager-0.57.tar.gz Upload and decompress tar zxvf mha4mysql-manager-0.57.tar.gz -C /root cd /root/mha4mysql-manager-0.57 perl Makefile.PL make && make install
After manager is installed, several tools will be generated in the / usr/local/bin directory, mainly including the following:
masterha-check_ssh inspect MHA of SSH Configuration status (no interaction) masterha_check_repl inspect MySQL Replication status masterha_manager start-up manager Script for masterha_check_status Check current MHA running state masterha_master_monitor inspect master Is it down masterha_master_switch Control failover((automatic or manual) masterha_conf_host Add or remove configured server information masterha_stop close manager
After node is installed, several scripts are generated in / usr/local/bin directory, which need to be checked, mainly including the following:
save_binary_logs Save and copy master Binary log for apply_diff_relay_logs Identify the differential relay log time and apply the differential events to other events slave filter_mysqlbinlog Remove unnecessary ROLLBACK Rollback event( MHA (this tool is no longer applicable) purge_relay_logs Clear relay log (no blocking) SQL Thread)
Configure password less authentication
All four servers are configured with password less authentication (one-way enter). Enter the following 5 commands for each server ssh-keygen -t rsa ssh-copy-id 192.168.153.200 ssh-copy-id 192.168.153.215 ssh-copy-id 192.168.153.220 ssh-copy-id 192.168.153.225
Configure mha
stay MHA Copy related scripts on node to/usr/local/bin catalogue [root@manager ~]# cp -ra /root/mha4mysql-manager-0.57/samples/scripts/ /usr/local/bin -a: This option is usually used when copying a directory. It retains links, file properties, and copies everything under the directory -r: recursion The four executive files after copying are as follows: ll /usr/local/bin/scripts/ master_ip_failover #Script for VIP management during automatic switching ⭐⭐⭐ master_ip_online_change #VIP management during online switching power_manager #Script to shut down the host after a failure send_report #Script for sending alarms after failover Copy the above master_ip_failover reach/usr/local/bin Directory, where script management is used VIP cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin
Modify master_ip_failover content (delete the original content, add new content and copy directly)
Operate on manager
vim master_ip_failover vim /usr/local/bin/master_ip_failover #!/usr/bin/env perl use strict; use warnings FATAL => 'all'; use Getopt::Long; my ( $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port, $new_master_host, $new_master_ip, $new_master_port ); my $vip = '192.168.153.100'; #Floating IP ⭐⭐⭐ my $brdc = '192.168.153.255'; #Broadcast address ⭐⭐⭐ my $ifdev = 'ens33'; #The network card used is ens33 ⭐⭐⭐ my $key = '1'; #International serial number my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip"; #Use the ifoconfig command to start it and set the floating address my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down"; #You can use the ifconfig command to down it my $exit_code = 0; #Normal exit (return status code) #my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;"; #my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key"; GetOptions( 'command=s' => \$command, 'ssh_user=s' => \$ssh_user, 'orig_master_host=s' => \$orig_master_host, 'orig_master_ip=s' => \$orig_master_ip, 'orig_master_port=i' => \$orig_master_port, 'new_master_host=s' => \$new_master_host, 'new_master_ip=s' => \$new_master_ip, 'new_master_port=i' => \$new_master_port, ); exit &main(); sub main { print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; if ( $command eq "stop" || $command eq "stopssh" ) { my $exit_code = 1; eval { print "Disabling the VIP on old master: $orig_master_host \n"; &stop_vip(); $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { my $exit_code = 10; eval { print "Enabling the VIP - $vip on the new master - $new_master_host \n"; &start_vip(); $exit_code = 0; }; if ($@) { warn $@; exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { print "Checking the Status of the script.. OK \n"; exit 0; } else { &usage(); exit 1; } } sub start_vip() { `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; } # A simple system call that disable the VIP on the old_master sub stop_vip() { `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; } sub usage { print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n"; } #remove#At the beginning, notice that the first line needs to be added# % s/^#//⭐⭐⭐
Create MHA software directory and copy configuration files
Operate on manager to delete the original template and insert it
[root@manager ~]# mkdir /etc/masterha [root@manager ~]# cp /root/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha/ [root@manager ~]# vim /etc/masterha/app1.cnf [server default] #log file manager_log=/var/log/masterha/app1/manager.log #working directory manager_workdir=/var/log/masterha/app1 #Binary file master_binlog_dir=/home/mysql #Tools for failover master_ip_failover_script=/usr/local/bin/master_ip_failover #Online switching VIP tool management master_ip_online_change_script=/usr/local/bin/master_ip_online_change #The following is the password account management configuration password=manager ping_interval=1 remote_workdir=/tmp repl_password=123 repl_user=myslave secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.153.215 -s 192.168.153.220 shutdown_script="" ssh_user=root user=mha [server1] hostname=192.168.153.200 port=3306 [server2] candidate_master=1 hostname=192.168.153.215 check_repl_delay=0 port=3306 [server3] hostname=192.168.153.220 port=3306
Document notes: #manager working directory manager_log=/var/log/masterha/app1/manager.log #manager log manager_workdir=/var/log/masterha/app1 #The location where the master saves binlog. The path here should be the binlog configured in the master master_binlog_dir=/home/mysql #Switch scripts when setting automatic failover, that is, the last configured script master_ip_failover_script=/usr/local/bin/master_ip_failover #Set the switching script for manual switching master_ip_online_change_script=/usr/local/bin/master_ip_online_change #The password here is the password of the previously created monitoring user password=manager #Set monitoring user user=mha #Set the event interval for monitoring the main database and sending ping packets. The default is 3 seconds. When there is no response for three attempts, the automatic failover is performed ping_interval=1 #Set the save location of binlog in case of remote MySQL switching remote_workdir=/tmp #Set the password of the replication user repl_password=123 #Set the account of the replication user repl_user=myslave #Set the script of the alarm sent after switching report_script=/usr/local/send_report #Set the script to check from the server secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.153.215 -s 192.168.153.220 #Set the script to close the failed host after the failure occurs shutdown_script="" #Set ssh login user name ssh_user=root [server2] #Set the candidate master. If this parameter is set, this library will be promoted to the master library after sending the master-slave switch candidate_master=1 #Turn off the check of replication. By default, if a slave lags behind the relay logs of the master by 100m, MHA will not select the slave as the new master. If it is set to 0, this rule will be ignored check_repl_delay=0 hostname=192.168.153.215 port=3306
Test no password authentication
If normal, it will output successfully
[root@manager ~]# masterha_check_ssh -conf=/etc/masterha/app1.cnf [info] All SSH connection tests passed successfully.
Test master-slave replication
masterha_check_repl -conf=/etc/masterha/app1.cnf Note: the first configuration needs to be in master Manually turn on virtual on node IP(ok Success is success) /sbin/ifconfig ens33:1 192.168.153.100
test
Start mha in the background
[root@manager ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 & [2] 121361 [root@manager ~]# jobs [1]- In operation nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
Parameter details: Explanation: ohup ##option masterha_manager ##open –conf=/etc/masterha/app1.cnf ##Specify profile –remove_dead_master_conf ##When the master server fails, the old master ip will be deleted from the master configuration file after the master-slave switch –ignore_last_failover ##Ignore failover, ignore servers that are always down and unreliable By default, if MHA If continuous downtime is detected and the interval between two outages is less than 8 hours, it will not be Failove The reason for this restriction is to avoid ping-pong effect. This parameter represents the number of times ignored MHA Start to switch the generated file. By default MHA After switching, it will be recorded in the log, that is, the log set above app1.failover.complete Documents, If the file is found in the directory during the next switch, the switch is not allowed to be triggered unless the file is deleted after the first switch:For convenience, set here as–ianore_last_failover < /dev/null> ##All generated information will be exported to nul1 or / var / log / masterha / app1 / manager Log file 2>&1& ##Change the output of 2 error from directional to standard output, and "&" starts the background operation
View the current master node
[root@manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf app1 (pid:120311) is running(0:PING_OK), master:192.168.153.200
View current log information
[root@manager ~]# cat /var/log/masterha/app1/manager.log
Simulate the fault on the original master and check the changes of the master
[root@manager ~]# tail -f /var/log/masterha/app1/manager.log
Check whether the MySQL VIP address 192.168.226.100 exists. This VIP address will not disappear because the manager node stops MHA service
[root@master ~]# ifconfig inet 192.168.153.200 netmask 255.255.255.0 broadcast 192.168.153.255 inet 192.168.153.100 netmask 255.255.255.0 broadcast 192.168.153.255
Fault repair
repair db,Add relay logs and notes to allow synchronization [root@master ~]# vim /etc/my.cnf log_bin=master-bin #log_slave=updates=true server_id=11 relay-log=relay-log-bin relay-log-index=slave-relay-bin.index Restart the server [root@master ~]# systemctl restart mysqld
Repair master-slave: on the master-slave server mysql> show master status; +-------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +-------------------+----------+--------------+------------------+-------------------+ | master-bin.000006 | 1006 | | | | +-------------------+----------+--------------+------------------+-------------------+
Repaired master: mysql> change master to master_host='192.168.153.215',master_user='myslave',master_password='123',master_log_file='master-bin.000006',master_log_pos=1006; mysql> start slave; mysql> set global read_only=1; mysql> flush privileges;
Modify add profile [root@slave1 ~]# vim /etc/my.cnf [server1] hostname=192.168.153.215 port=3306
start-up manager(stay manager (on that machine) [root@manager ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 & View current master Who is the node masterha_check_status --conf=/etc/masterha/app1.cnf Solve the problem of incompatible Chinese and English characters and error reporting dos2unix /usr/local/bin/master_ip_failover