Wan Da #12, how can the MGR cluster automatically select the master without manual intervention

Welcome to the MySQL technical articles shared by the great SQL community. If you have any questions or want to learn, you can leave a message in the comment area below. After seeing it, you will answer it

This article is reproduced from the official account of WeChat, "old leaf teahouse".

After the whole MGR cluster is hung up, the master can be selected automatically without manual intervention

When all nodes in the cluster are down and the cluster is started again, can the primary node be selected automatically?

1. After all nodes are down, can the cluster restart automatically

This is a question from group friends.

Firstly, MySQL service can start automatically after failure by using systemd. Note the following configuration:

[root@GreatSQL ~]# cat /usr/lib/systemd/system/greatsql.service
...
Restart=on-failure

Secondly, after the mysqld process is started, it is also possible to realize the automatic master selection and self startup of MGR by using MySQL Shell, for example:

[root@GreatSQL ~]# mysqlsh --uri greatsql@yejr-mgr3:3306
...
-- Whatever you do, look first help,This is fun Linux If you don't know anything, find a man first( man)
-- Note that there is such a method rebootClusterFromCompleteOutage(),Looks like he didn't run away
 MySQL  yejr-mgr3:3306 ssl  JS > \help dba
      rebootClusterFromCompleteOutage([clusterName][, options])
            Brings a cluster back ONLINE when all members are OFFLINE.

-- Run a try
 MySQL  yejr-mgr3:3306 ssl  JS > dba.rebootClusterFromCompleteOutage()
Restoring the default cluster from complete outage...

The instance 'yejr-mgr4:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

The instance 'yejr-mgr2:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

Dba.rebootClusterFromCompleteOutage: The active session instance (yejr-mgr3:3306) isn't the most updated in comparison with the ONLINE instances of the Cluster's metadata. Please use the most up to date instance: 'yejr-mgr4:3306'. (RuntimeError)

You can see the error message that we cannot start MGR directly because there is no latest data on the current node. The error message also provides suggestions on which node to start, so we instead pull up MGR on yejr-mgr4 node:

[root@GreatSQL ~]# mysqlsh --uri greatsql@yejr-mgr3:3306
...
 MySQL  yejr-mgr4:3306 ssl  JS > dba.rebootClusterFromCompleteOutage()
Restoring the default cluster from complete outage...

The instance 'yejr-mgr3:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

The instance 'yejr-mgr2:3306' was part of the cluster configuration.
Would you like to rejoin it to the cluster? [y/N]: y

yejr-mgr4:3306 was restored.
Rejoining 'yejr-mgr3:3306' to the cluster.
Rejoining instance 'yejr-mgr3:3306' to cluster 'GreatSQLMGR'...
The instance 'yejr-mgr3:3306' was successfully rejoined to the cluster.

Rejoining 'yejr-mgr2:3306' to the cluster.
Rejoining instance 'yejr-mgr2:3306' to cluster 'GreatSQLMGR'...
The instance 'yejr-mgr2:3306' was successfully rejoined to the cluster.

The cluster was successfully rebooted.

You can see that the MGR cluster has been started normally.

The above is to start a failed MGR cluster using MySQL Shell. What should I do if it is manual?

First, execute the following SQL at each node to confirm the current transaction execution of each node:

-- yejr-mgr2 node
root@GreatSQL [none]> select RECEIVED_TRANSACTION_SET from performance_schema.replication_connection_status where 
channel_name = 'group_replication_applier' union all 
select variable_value from performance_schema.global_variables where 
variable_name = 'gtid_executed'\G
*************************** 1. row ***************************
RECEIVED_TRANSACTION_SET:
*************************** 2. row ***************************
RECEIVED_TRANSACTION_SET: 1c293e90-3bdc-11ec-bca1-525400e2078a:1-4537605,
4b7b3b88-3b13-11ec-86e9-525400e2078a:1

-- yejr-mgr3 node
...
*************************** 1. row ***************************
RECEIVED_TRANSACTION_SET:
*************************** 2. row ***************************
RECEIVED_TRANSACTION_SET: 1c293e90-3bdc-11ec-bca1-525400e2078a:1-4542304,
4b7b3b88-3b13-11ec-86e9-525400e2078a:1

-- yejr-mgr4 node
...
*************************** 1. row ***************************
RECEIVED_TRANSACTION_SET:
*************************** 2. row ***************************
RECEIVED_TRANSACTION_SET: 1c293e90-3bdc-11ec-bca1-525400e2078a:1-4652391,
4b7b3b88-3b13-11ec-86e9-525400e2078a:1

As can be seen from the above results, the GTID value of the completed transaction on the yejr-mgr4 node is the largest: 4652391 > 4542304 > 4537605. Therefore, the yejr-mgr4 node should be selected as the Primary node.

Set the node to boot mode and start the MGR service:

[root@GreatSQL ~]# mysql -hyejr-mgr4 -P3306 -ugreatsql -p
...
greatsql@mgr4:3306 [(none)]>set global group_replication_bootstrap_group=ON;

greatsql@mgr4:3306 [(none)]>start group_replication;

-- Start up finished MGR Remember to set it to OFF
greatsql@mgr4:3306 [(none)]>set global group_replication_bootstrap_group=OFF;

On other nodes, just start the MGR service directly. Remember not to set the boot mode again, otherwise it will become the Primary node of a new MGR cluster.

Well, the automatic and manual methods of pulling up a fault MGR cluster have been introduced.

Enjoy GreatSQL :)

Article recommendation:

Technology sharing | MGR best practice
https://mp.weixin.qq.com/s/66...

Technology sharing | the way to repair MGR Bug in Wanli database
https://mp.weixin.qq.com/s/Ia...

Macos system compiler percona and the difference of some functions on Macos system
https://mp.weixin.qq.com/s/jA...

Technology sharing | using systemd to manage single machine and multiple instances of MySQL
https://mp.weixin.qq.com/s/iJ...

Product | GreatSQL to create a better MGR ecology
https://mp.weixin.qq.com/s/By...

Product | GreatSQL MGR optimization reference
https://mp.weixin.qq.com/s/5m...

About GreatSQL

GreatSQL is a MySQL branch maintained by Wanli database. It focuses on improving the reliability and performance of MGR and supports the parallel query feature of InnoDB. It is a branch version of MySQL suitable for financial applications.

Gitee:
https://gitee.com/GreatSQL/Gr...

GitHub:
https://github.com/GreatSQL/G...

Wechat & QQ group:

You can scan code to add GreatSQL community assistant wechat friends, send verification information "add group" to join GreatSQL/MGR exchange wechat group, or directly scan code to join GreatSQL/MGR exchange QQ group.

This article is composed of blog one article multi posting platform OpenWrite release!

Keywords: Database MySQL SQL

Added by carichod on Tue, 07 Dec 2021 15:20:57 +0200