Keepalived high availability for Linux

1, High availability introduction

1.1 what is high availability

High availability generally means running the same service on two (or more) machines, so that when one machine goes down another takes over quickly and users do not notice the interruption.

1.2 common tools

Hardware: F5 is commonly used
Software: Keepalived is commonly used

1.3 how does keepalived achieve high availability?

1.3.1 nouns involved

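Keepalived is built on VRRP (Virtual Router Redundancy Protocol), which was designed to eliminate the single point of failure of a statically configured gateway.

ARP: resolves an IP address to a MAC address by broadcasting a request on the LAN
VRRP: the nodes advertise their state to each other within the LAN (using multicast)
VIP: the virtual IP address that drifts from one node to the other
VMAC: the virtual MAC address tied to the VIP, so that the ARP entries on clients stay valid after a failover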

1.3.2 examples

Suppose the company network reaches the Internet through a gateway router. If that router fails, the gateway can no longer forward packets and nobody can reach the Internet.

The usual remedy is to add a standby router. The problem is that when the primary gateway (master) fails, every user has to repoint their default gateway to the backup by hand, which is tedious when there are many clients.

Question 1: if users repoint to the backup router, what happens once the master router is repaired?
Question 2: if the master gateway fails, can we simply give the backup gateway the master's IP address?

In practice this does not work. The first time a PC resolves the gateway through an ARP broadcast, it writes the master's IP and MAC address into its ARP cache and then uses that cached entry to forward packets. Even if the backup takes over the IP, its MAC address is different, so the PC keeps sending packets to the master's MAC. Only after the cache entry expires and the PC broadcasts again does it learn the MAC address of the backup.

So how do we get automatic failover? This is where VRRP comes in. VRRP adds a virtual MAC address (VMAC) and a virtual IP address (VIP), implemented in software or hardware, that belong to neither the Master nor the Backup alone. Clients address the VIP, so whether the Master or the Backup is handling the traffic, the ARP cache of the PC only ever records the VIP and the VMAC.
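
As a small illustration of the virtual MAC mentioned above: the VRRP specification (RFC 5798) derives the IPv4 virtual MAC from the virtual router ID as 00:00:5e:00:01:{VRID in hex}. The sketch below computes it for a given ID. The function name vrrp_vmac is made up for this example. Note also that keepalived only puts this address on the wire when use_vmac is enabled in the vrrp_instance; by default it keeps the MAC of the interface and announces the move of the VIP with gratuitous ARP.

```shell
#!/bin/sh
# vrrp_vmac is a hypothetical helper name used only for this illustration.
# RFC 5798 defines the IPv4 virtual MAC as 00:00:5e:00:01:{VRID}, VRID in hex.
vrrp_vmac() {
    printf '00:00:5e:00:01:%02x\n' "$1"
}

# virtual_router_id 50 is the value used later in this article.
vrrp_vmac 50    # prints 00:00:5e:00:01:32
```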

1.4 core concepts of keepalived high availability

How is it decided which node is primary and which is standby? (election by priority)
If the Master fails and the Backup takes over automatically, does the Master take the VIP back after it recovers? (preemptive vs. non-preemptive)
What happens if both servers believe they are the master? (split-brain)
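
The priority-based election in the first question can be pictured with a toy model. This is only a sketch of the rule "highest priority wins", not of the real protocol exchange; elect_master is a hypothetical helper, and ties (which real VRRP breaks by comparing IP addresses) are not modelled.

```shell
#!/bin/sh
# elect_master is a hypothetical helper: it reads "name priority" pairs on
# stdin and prints the name with the highest priority value.
elect_master() {
    sort -k2,2nr | awk 'NR == 1 { print $1 }'
}

# Node names and priorities follow the example environment in this article.
printf '%s\n' 'lb01 100' 'lb02 80' | elect_master    # prints lb01
```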

2, keepalived

2.1 environmental preparation

host     IP              role
lb01     192.168.15.5    keepalived master
lb02     192.168.15.6    keepalived backup
web01    172.16.1.7      web server
web02    172.16.1.8      web server
db01     172.16.1.61     database
-        192.168.15.3    VIP

2.2 installation of Keepalived

[root@lb01 conf.d]# yum install keepalived -y

2.3 configuring keepalived

Locate the configuration file

[root@lb01 ~]# rpm -qc keepalived
/etc/keepalived/keepalived.conf

Configure the master node (LoadBalance01)

[root@lb01 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

# Global configuration
global_defs {
   # Unique identifier of this keepalived node
   router_id LoadBalance01
}

# VRRP instance configuration
vrrp_instance VI_1 {
    # Initial state: MASTER or BACKUP
    state MASTER
    # Network interface the instance binds to
    interface eth0
    # Virtual router ID; nodes in the same group must use the same value
    virtual_router_id 50
    # Priority; the node with the higher value is elected master
    priority 100
    # Advertisement (heartbeat) interval, in seconds
    advert_int 1
    # Authentication between the nodes
    authentication {
        # Authentication type
        auth_type PASS
        # Authentication password
        auth_pass 1111
    }
    # Virtual IP (VIP)
    virtual_ipaddress {
        # VIP address
        192.168.15.3
    }
}

Configure the standby node (LoadBalance02)

[root@lb02 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived

# Global configuration
global_defs {
   # Unique identifier of this keepalived node
   router_id LoadBalance02
}

# VRRP instance configuration
vrrp_instance VI_1 {
    # Initial state: MASTER or BACKUP
    state BACKUP
    # Network interface the instance binds to
    interface eth0
    # Virtual router ID; nodes in the same group must use the same value
    virtual_router_id 50
    # Priority; lower than the master's
    priority 80
    # Advertisement (heartbeat) interval, in seconds
    advert_int 1
    # Authentication between the nodes
    authentication {
        # Authentication type
        auth_type PASS
        # Authentication password
        auth_pass 1111
    }
    # Virtual IP (VIP)
    virtual_ipaddress {
        # VIP address
        192.168.15.3
    }
}
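
The two files above must agree on virtual_router_id, auth_pass and the VIP, and differ in priority. A rough way to check this, assuming copies of both files are available on one machine (the file names lb01.conf and lb02.conf in the usage comment are placeholders), is sketched below. vrrp_value and same_setting are hypothetical helpers, and they only look at the first occurrence of a keyword, which is enough for a single vrrp_instance.

```shell
#!/bin/sh
# vrrp_value FILE KEY: print the first value of KEY (for example
# virtual_router_id) found in a keepalived.conf-style file.
# Comment lines are skipped because their first word starts with "#".
vrrp_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# same_setting FILE_A FILE_B KEY: succeed when both files agree on KEY.
# Note: a KEY missing from both files also counts as "same".
same_setting() {
    [ "$(vrrp_value "$1" "$3")" = "$(vrrp_value "$2" "$3")" ]
}

# Usage (file names are placeholders):
#   same_setting lb01.conf lb02.conf virtual_router_id || echo "VRID mismatch"
#   same_setting lb01.conf lb02.conf auth_pass         || echo "password mismatch"
```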

2.4 start the service (and enable it at boot)

[root@lb01 ~]# systemctl enable --now keepalived
[root@lb02 ~]# systemctl enable --now keepalived

2.5 enable keepalived logging

# Pass logging options to keepalived: -D detailed log, -d dump configuration, -S 0 log to syslog facility local0
[root@lb01 ~]# vim /etc/sysconfig/keepalived
KEEPALIVED_OPTIONS="-D -d -S 0"

# Have rsyslog write facility local0 to a dedicated file
[root@lb01 ~]# vim /etc/rsyslog.conf
local0.*        /var/log/keepalived.log

# Restart both services
[root@lb01 ~]# systemctl restart keepalived rsyslog
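
With logging in place, the most recent state transition can be pulled out of the log. This is a minimal sketch that assumes the log contains lines with the phrase "Entering MASTER STATE" or "Entering BACKUP STATE", which is the wording keepalived normally uses; the text before that phrase differs between versions. last_vrrp_state is a hypothetical helper.

```shell
#!/bin/sh
# last_vrrp_state reads log text on stdin and prints the state named in the
# last "Entering <STATE> STATE" line (assumed log wording, see above).
last_vrrp_state() {
    sed -n 's/.*Entering \([A-Z]*\) STATE.*/\1/p' | tail -n 1
}

# Usage with the log file configured above:
#   last_vrrp_state < /var/log/keepalived.log
```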

3, Preemptive and non-preemptive modes of Keepalived

3.1 when both nodes are started

# With both nodes running, only node 1 holds the VIP because its priority is higher than that of node 2
[root@lb01 ~]# ip addr | grep 192.168.15.3
    inet 192.168.15.3/32 scope global eth0
 
[root@lb02 ~]# ip addr | grep 192.168.15.3

3.2 stop master node

[root@lb01 ~]# systemctl stop keepalived
[root@lb01 ~]# ip addr | grep 192.168.15.3
 
# Because keepalived on node 1 has stopped, node 2 automatically takes over the VIP
[root@lb02 ~]# ip addr | grep 192.168.15.3
    inet 192.168.15.3/32 scope global eth0

3.3 restart the master node

#Start master node
[root@lb01 ~]# systemctl start keepalived
[root@lb01 ~]# ip addr | grep 192.168.15.3
    inet 192.168.15.3/32 scope global eth0

# Node 1 has the higher priority, so once it recovers it preempts the VIP back
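
The `ip addr | grep 192.168.15.3` check used above also matches longer addresses such as 192.168.15.30. A slightly stricter variant, sketched under the assumption that `ip -o -4 addr show` prints the address/prefix in the fourth column (vip_present is a hypothetical helper):

```shell
#!/bin/sh
# vip_present VIP: read `ip -o -4 addr show` output on stdin and succeed
# only if an address equal to VIP (ignoring the /prefix) is listed.
vip_present() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Usage on either node:
#   ip -o -4 addr show | vip_present 192.168.15.3 && echo "this node holds the VIP"
```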

3.4 configure non-preemptive mode

Master node configuration (LoadBalance01)

[root@lb01 ~]# vim /etc/keepalived/keepalived.conf 
... ...
vrrp_instance VI_1 {
    # Initial state: must be BACKUP in non-preemptive mode
    state BACKUP
    # Enable non-preemptive mode
    nopreempt
    # Network interface the instance binds to
    interface eth0
    # Virtual router ID; both nodes must use the same value
    virtual_router_id 50
    # Priority
    priority 100
... ...
}
 
[root@lb01 ~]# systemctl restart keepalived

Standby node configuration (LoadBalance02)

[root@lb02 ~]# vim /etc/keepalived/keepalived.conf 
... ...
vrrp_instance VI_1 {
    # Initial state: must be BACKUP in non-preemptive mode
    state BACKUP
    # Enable non-preemptive mode
    nopreempt
    # Network interface the instance binds to
    interface eth0
    # Virtual router ID; both nodes must use the same value
    virtual_router_id 50
    # Priority
    priority 90
... ...
}
 
[root@lb02 ~]# systemctl restart keepalived.service

Configuration considerations
1. The state of both nodes must be set to BACKUP.
2. Both nodes must be configured with nopreempt.
3. The priority of one node must be higher than that of the other node.
Apart from router_id and priority, the two configurations are identical.
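
The first two rules can be checked mechanically. The following is a minimal sketch, assuming a file with a single vrrp_instance; check_nopreempt is a hypothetical helper, not a keepalived tool.

```shell
#!/bin/sh
# check_nopreempt FILE: print "ok" when the file either omits nopreempt or
# combines it with "state BACKUP"; print "invalid" otherwise.
check_nopreempt() {
    awk '
        $1 == "state"     { state = $2 }
        $1 == "nopreempt" { nopreempt = 1 }
        END {
            if (nopreempt && state != "BACKUP") print "invalid"
            else print "ok"
        }
    ' "$1"
}

# Usage:
#   check_nopreempt /etc/keepalived/keepalived.conf
```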

4, Keepalived split-brain

Split-brain occurs when, for some reason, the two keepalived servers cannot detect each other's heartbeat within the configured time, so each one claims the resources and services (the VIP) even though both servers are still alive.

4.1 causes of split-brain

Loose network cable or other network fault
Server hardware failure
A firewall between the servers blocks the VRRP advertisements

4.2 simulating split-brain

Turn on the firewall

[root@lb01 ~]# systemctl start firewalld
[root@lb01 ~]# ip addr | grep 192.168.15.3
    inet 192.168.15.3/32 scope global eth0
 
[root@lb02 ~]# systemctl start firewalld
[root@lb02 ~]# ip addr | grep 192.168.15.3
    inet 192.168.15.3/32 scope global eth0

Visit website

# firewalld is now running and rejects incoming connections by default, so HTTP (port 80) and HTTPS (port 443) must be opened
[root@lb01 ~]# firewall-cmd --add-service=http
success
[root@lb02 ~]# firewall-cmd --add-service=http
success
 
[root@lb01 ~]# firewall-cmd --add-service=https
success
[root@lb02 ~]# firewall-cmd --add-service=https
success
 
# The page can now be accessed without problems

Turn off firewall

[root@lb02 ~]# systemctl stop firewalld 
[root@lb02 ~]# ip addr | grep 192.168.15.3
 
[root@lb01 ~]# systemctl stop firewalld
[root@lb01 ~]# ip addr | grep 192.168.15.3
    inet 192.168.15.3/32 scope global eth0

4.3 solutions to split-brain

# If split-brain occurs, shut down either one of the two nodes
# Run a check script on the standby node: if it can ping the primary node while the standby node itself holds the VIP, split-brain is assumed
[root@lb02 ~]# cat check_split_brain.sh
#!/bin/sh
vip=192.168.15.3
lb01_ip=192.168.15.5
while true; do
    # The primary answers ping, yet this standby node also holds the VIP
    if ping -c 2 "$lb01_ip" >/dev/null 2>&1 && ip addr | grep -qF "inet $vip/"; then
        echo "ha is split brain.warning."
    else
        echo "ha is ok"
    fi
    sleep 5
done
 
[root@lb02 ~]# vim check_keepalive.sh 
#!/bin/sh
vip=192.168.15.3
lb01_ip=172.16.1.5
while true; do
    # Split-brain: lb01 holds the VIP (checked over ssh) and so does this node
    if ssh "$lb01_ip" "ip addr | grep -qF 'inet $vip/'" >/dev/null 2>&1 && ip addr | grep -qF "inet $vip/"; then
        echo "ha is split brain.warning."
    else
        echo "ha is ok"
    fi
    sleep 3
done
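
Besides detecting split-brain, the firewall-induced case simulated in 4.2 can be avoided by letting VRRP through firewalld. VRRP advertisements are IP protocol 112, neither TCP nor UDP, so opening the http and https services does not let them pass. A minimal sketch follows; the helper names vrrp_rich_rule and allow_vrrp are made up for this example, and allow_vrrp needs root and a running firewalld on both nodes.

```shell
#!/bin/sh
# vrrp_rich_rule prints the firewalld rich rule that accepts VRRP packets.
vrrp_rich_rule() {
    printf 'rule protocol value="vrrp" accept\n'
}

# allow_vrrp adds the rule permanently and reloads firewalld.
# Run it on lb01 and lb02; it is only defined here, not executed.
allow_vrrp() {
    firewall-cmd --permanent --add-rich-rule="$(vrrp_rich_rule)" &&
    firewall-cmd --reload
}

# Usage:
#   allow_vrrp
```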

5, High availability with Keepalived and Nginx

Nginx listens on all IP addresses by default. When the VIP floats to a node, it is as if that machine gained an extra address, so requests to the VIP reach the nginx running on that machine.

However, if nginx goes down while keepalived keeps running, no failover takes place and user requests fail. A script is therefore needed to check whether nginx is alive and, if it is not, to stop keepalived so that the VIP moves to the other node.

5.1 Nginx failover script

[root@lb01 ~]# vim check_web.sh
#!/bin/sh
# Count running nginx processes ([n]ginx keeps grep from matching itself)
nginxpid=$(ps -ef | grep -c '[n]ginx')

# 1. If nginx is not running, try to start it
if [ "$nginxpid" -eq 0 ]; then
    systemctl start nginx >/dev/null 2>&1
    sleep 3
    # 2. After 3 seconds, check the nginx status again
    nginxpid=$(ps -ef | grep -c '[n]ginx')
    # 3. If nginx is still down, stop keepalived so that the VIP drifts to the other node
    if [ "$nginxpid" -eq 0 ]; then
        systemctl stop keepalived
    fi
fi

# Make the script executable
[root@lb01 ~]# chmod +x /root/check_web.sh

5.2 calling the nginx check script from the keepalived configuration

Preemptive configuration

# Only needs to be configured on the master node
[root@lb01 ~]# vim /etc/keepalived/keepalived.conf 
global_defs {
    router_id LoadBalance01
}
 
# Run the script every 5 seconds. The script must finish within 5 seconds, otherwise it is interrupted and run again
vrrp_script check_web {
    script "/root/check_web.sh"
    interval 5
}
 
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 50
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.15.3
    }
    # Call the check script defined above
    track_script {
        check_web
    }
}
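
Since the check has to finish inside the 5-second interval, slow steps in the check script (for example starting nginx) can be capped with the coreutils timeout command. This is a sketch of one possible approach rather than part of the original setup; run_with_limit is a hypothetical wrapper.

```shell
#!/bin/sh
# run_with_limit SECONDS COMMAND [ARGS...]: run COMMAND and give up after
# SECONDS. coreutils timeout returns 124 when the limit is hit.
run_with_limit() {
    limit=$1
    shift
    timeout "$limit" "$@"
}

# Usage inside a check script, staying below the 5-second interval:
#   run_with_limit 4 systemctl start nginx
```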

Non-preemptive configuration

# In non-preemptive mode, the script must be configured on both nodes
[root@lb01 ~]# scp check_web.sh 172.16.1.6:/root
 
# The standby node must be configured as well
[root@lb02 ~]# cat /etc/keepalived/keepalived.conf 
global_defs {
    router_id LoadBalance02
}
 
vrrp_script check_web {
    script "/root/check_web.sh"
    interval 5
}
 
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth0
    virtual_router_id 50
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.15.3
    }
    track_script {
        check_web
    }
}

5.3 testing

1. On the machine holding the VIP, introduce an error into the nginx configuration file so that nginx cannot be started again
2. Stop nginx
3. Check whether the VIP switches to the other node


Added by Petty_Crim on Mon, 10 Jan 2022 15:53:48 +0200
