PostgreSQl 12 master-slave stream replication and archiving configuration

WeChat official account: operation and development story, author: Jiang Zong

Hello, I'm Xiao Jiang. A low-level operation and maintenance engineer.

The previous article talked about the source code deployment of PostgreSQl 12. Here we talk about the master-slave stream replication and archiving configuration of PostgreSQl 12.

Two forms of master-slave replication

1) File based log shipping

Creating a high availability (HA) cluster configuration can adopt continuous archiving. In the cluster, the primary server works in continuous archiving mode, the standby server works in continuous recovery mode (one or more can take over the primary server at any time), and the standby server continuously reads WAL files from the primary server. Continuous archiving does not require any changes to the database tables, which can effectively reduce the management overhead and has a relatively low impact on the performance of the primary server. Moving WAL records directly from one database server to another is called log transfer. PostgreSQL realizes file based log transfer through WAL record transfer of one file (WAL segment) at a time. The bandwidth required for log delivery varies according to the transaction rate of the primary server; Log transfer is asynchronous, that is, WAL records are transferred only after the transaction is committed. In a window period, if the primary server fails catastrophically, data will be lost, and the transactions that have not been transferred will be lost; The data loss window can be accessed by using the parameter archive_timeout can be limited as low as a few seconds, but it will increase the bandwidth required for file transfer at the same time. archive_timeout forces an archive after N seconds. If the setting is too small, it will soon exceed WAL_ keep_ The value of segments leads to the loss of data coverage, so don't set it blindly.

2) Stream replication

PostgreSQL in 9 After X, the master-slave stream replication mechanism is introduced. The so-called stream replication is that the standby server synchronizes the corresponding data from the primary server through the tcp stream. When the WAL records are generated, the primary server streams them to the standby server without waiting for the WAL files to be filled. By default, stream replication is asynchronous. In this case, there is a short delay between the submission of a transaction on the primary server and the change becoming visible on the standby server, but this delay is still much smaller than the file based log transfer method. On the premise that the capacity of the standby server meets the load, the delay is usually less than one second; In stream replication, the standby server has smaller data loss window than using file based log transfer, and archive is not required_ Timeout to reduce the data loss window; Starting from PostgreSQL 12, when configuring the primary and standby databases through stream replication, there is no need to configure additional recovery Conf file. Instead, configure a standby under the $PGDATA path of the standby database environment Signal file. Note that this file is an ordinary text file with empty content. It is understood that this file is an identification file. If the standby database passes PG_ If CTL promote is promoted to the main library, the file will disappear automatically.

Note: all operations are performed by the postgres user.

Configure master-slave stream replication and archiving

1) Two machines do secret free login

The archive we use during backup and restore_command and restore_ The command command is run by the postgres user, so we need to implement ssh password free login for the postgres user.

# Log in to the primary pgsql server with the postgres user
ssh-keygen -t rsa  # All the way back
scp /home/postgres/.ssh/id_rsa.pub postgres@10.10.22.152:/home/postgres/.ssh/authorized_keys

Or copy id_rsa.pub File to from pgsql On, and then execute the following command from

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys 


# Login to view permissions from pgsql server
chmod 700 /home/postgres/.ssh
chmod 600 /home/postgres/.ssh/authorized_keys

# Test login
ssh postgres@10.10.22.152


# From pgsql Library
ssh-keygen -t rsa  # All the way back
scp /home/postgres/.ssh/id_rsa.pub postgres@10.10.22.151:/home/postgres/.ssh/authorized_keys

# Login to view permissions from pgsql server
chmod 700 /home/postgres/.ssh
chmod 600 /home/postgres/.ssh/authorized_keys

# Test login
ssh postgres@10.10.22.152

2) Master library configuration

# Create user
ceate user replica with replication login password 'replication';
alter user replica with password 'replication';

# Modify pg_hba.conf
host  replication  replica 10.10.0.0/16  md5


# Modify profile
$ vim /data/postgresql-12/data/postgresql.conf

# Listen to all IP
listen_addresses = "0.0.0.0"
# The maximum number of connections. It is said that the slave needs to be greater than or equal to this value
max_connections = 200
# Set the main pgsql as the host for generating wal. There was no hot since 9.6_ Standby (hot standby mode)
wal_level = replica

# Open continuous archiving
archive_mode = on
#Archive command- o "StrictHostKeyChecking no" is used to cancel the first connection. Enter yes or no
archive_command = 'scp -o "StrictHostKeyChecking no" %p pgslave.ptcloud.t.home:/data/postgresql-12/archive/%f'
# archive_command = 'test ! -f /data/postgresql-12/archive/%f && scp %p pgslave.ayunw.cn:/data/postgresql-12/archive/%f'
archive_cleanup_command = '/usr/local/postgresql-12/bin/pg_archivecleanup -d /data/postgresql-12/data/pg_wal %r >> /data/postgresql-12/log/archive_cleanup.log 2>&1'
# There are up to 16 stream replication connections.
max_wal_senders = 16
# Set the maximum number of wal (xlog) files retained by the streaming service
wal_keep_segments = 256
# Maximum process for heap cleanup
autovacuum_max_workers = 2
max_worker_processes = 16
max_logical_replication_workers = 10
# log setting
log_destination = 'stderr'
logging_collector = on
log_directory = '/data/postgresql-12/log'
log_filename = 'postgresql-%w.log'
log_file_mode = 0600
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 1GB

log_min_messages = error
# sql statements executed for more than 300ms will be recorded in the log file of pgsql, which is similar to slow log
# Generally, 300ms is set. The slow log will hit the pgsql log file to facilitate problem checking
log_min_duration_statement = 300
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose
log_hostname = on
log_line_prefix = '%m [%p] '
log_lock_waits = on 
log_statement = 'ddl'

# After the main database is set, the root user needs to restart the PG service to make the above configuration take effect
systemctl daemon-reload
systemctl restart postgresql


su - postgres
psql

# The master library creates a user replica with flow replication permission
CREATE user replica login replication encrypted password 'replication';

postgres=# \du;
                                   List of roles
 Role name |                         Attributes                         | Member of
-----------+------------------------------------------------------------+-----------
 postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 replica   | Replication                                                | {}


# The master database increases the trust access of master-slave replication (pg_hba.conf)

$ vim /data/postgresql-12/data/pg_hba.conf
# replication privilege.
host    replication     replica         10.10.0.0/16            trust


# Restart service
systemctl daemon-reload
systemctl restart postgresql
systemctl status postgresql

3) Configure from library

# Stop from the library and delete the data from the pgsql data directory

# Backup data directory
mkdir -p /opt/pgsqldata_backup
mv /data/postgresql-12/data/* /opt/pgsqldata_backup

pg_ctl -D /data/postgresql-12/data -l logfile stop
rm -rf /data/postgresql-12/data/*

4) Basic backup from library

Copying data from the master server to the slave server is called "basic backup"

su - postgres

$ pg_basebackup -h 10.10.22.151 -p 5432 -U replica -W -R -Fp -Xs -Pv -D /data/postgresql-12/data/
Password:
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/8000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_13370"
31384/31384 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 0/8000100
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed

Parameter Description:

  • -h address of the main database started

  • -p primary database port

  • -U stream replication user

  • -W use password authentication. Use the password of replica

  • -Fp backup outputs the normal database directory

  • -Xs uses stream replication for replication

  • -Pv outputs details of the replication process

  • -R create recovery for the standby database Conf file. However, this file is not required for pgsql versions after pgsql 10.

  • -D specifies the database directory of the created standby database

5) Configure profiles from libraries

Note: at this time, from the PostgreSQL. Net under the database data directory The conf file is just synchronized from the main pgsql, not the pgsql configuration file. You need to take the old configuration file from the library for use.

$ cd /data/postgresql-12/data/
$ mv postgres.conf  postgres.conf_master.bak

$ cp /opt/pgsqldata_backup/postgres.conf postgres.conf

$ vim /data/postgresql-12/data/postgres.conf

# Listen to all IP
listen_addresses = "0.0.0.0"

# The maximum number of connections. The slave pgsql needs to be greater than or equal to the value of the master
max_connections = 300

restore_command = 'cp /data/postgresql-12/archive/%f %p'
archive_cleanup_command = '/usr/local/postgresql-12/bin/pg_archivecleanup -d /data/postgresql-12/data/pg_wal %r && /usr/local/postgresql-12/bin/pg_archivecleanup -d /data/postgresql-12/archive %r >> /data/postgresql-12/log/archive_cleanup.log 2>&1'

# 9.6 no hot at first_ Standby (hot standby mode)
wal_level = replica
# There are up to 16 stream replication connections.
max_wal_senders = 16

# The setting is larger than the main library and can be set to twice the value
wal_keep_segments = 512
max_logical_replication_workers = 10

autovacuum_max_workers = 2
# Just keep consistent with the value of the master
max_worker_processes = 16

# It shows that this machine is not only used for data archiving, but also for data query
hot_standby = on
#Maximum latency for streaming backups
max_standby_streaming_delay = 30s 
# The interval between reporting local status to the host
wal_receiver_status_interval = 10s 
# In case of replication error, feedback to the host
hot_standby_feedback = on


# log setting
log_destination = 'stderr'
logging_collector = on
log_directory = '/data/postgresql-12/log'
log_filename = 'postgresql-%w.log'
log_file_mode = 0600
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 1GB

log_min_messages = error
# sql statements executed for more than 300 ms will be recorded in the log file of pgsql
log_min_duration_statement = 300
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose
log_hostname = on
log_line_prefix = '%m [%p] '
log_lock_waits = on 
log_statement = 'ddl'

6) Restart slave Library

Ensure that the data directory of the slave database is the master and group of postgres, and the permission is 0700

su - postgres
pg_ctl -D /data/postgresql-12/data restart

Verifying pgsql master-slave

# Log in to the main library
su - postgres

postgres=# psql

postgres=# select client_addr,sync_state from pg_stat_replication;
 client_addr  | sync_state
--------------+------------
 10.10.22.152 | async
(1 row)


select pid, usename, application_name, client_addr, 
      backend_start, client_port, state, sync_state from pg_stat_replication;

  pid  | usename | application_name | client_addr |         backend_start         | client_port |   state   | sync_state
-------+---------+------------------+-------------+-------------------------------+-------------+-----------+------------
 28356 | repl    | walreceiver      | 10.10.22.152 | 2021-12-30 17:00:59.357653+08 |       48660 | streaming | async
(1 row)


As described above, 10.10.22.152 the server is a slave node that receives asynchronous stream replication

Here, the mainstream replication and archiving configuration is complete.

Official account: operation and development story

github: https://github.com/orgs/sunsharing-note/dashboard

Love life, love operation and maintenance

If you think the article is good, please click on the top right corner to send it to your friends or forward it to your circle of friends. Your support and encouragement is my greatest motivation. If you like, please pay attention to me~

Scanning QR code

Pay attention to me and maintain high-quality content from time to time

                                          ........................

Keywords: Database PostgreSQL server

Added by private_guy on Wed, 19 Jan 2022 14:12:29 +0200