WeChat official account: operation and development story, author: Jiang Zong
Hello, I'm Xiao Jiang. A low-level operation and maintenance engineer.
The previous article talked about the source code deployment of PostgreSQl 12. Here we talk about the master-slave stream replication and archiving configuration of PostgreSQl 12.
Two forms of master-slave replication
1) File based log shipping
Creating a high availability (HA) cluster configuration can adopt continuous archiving. In the cluster, the primary server works in continuous archiving mode, the standby server works in continuous recovery mode (one or more can take over the primary server at any time), and the standby server continuously reads WAL files from the primary server. Continuous archiving does not require any changes to the database tables, which can effectively reduce the management overhead and has a relatively low impact on the performance of the primary server. Moving WAL records directly from one database server to another is called log transfer. PostgreSQL realizes file based log transfer through WAL record transfer of one file (WAL segment) at a time. The bandwidth required for log delivery varies according to the transaction rate of the primary server; Log transfer is asynchronous, that is, WAL records are transferred only after the transaction is committed. In a window period, if the primary server fails catastrophically, data will be lost, and the transactions that have not been transferred will be lost; The data loss window can be accessed by using the parameter archive_timeout can be limited as low as a few seconds, but it will increase the bandwidth required for file transfer at the same time. archive_timeout forces an archive after N seconds. If the setting is too small, it will soon exceed WAL_ keep_ The value of segments leads to the loss of data coverage, so don't set it blindly.
2) Stream replication
PostgreSQL in 9 After X, the master-slave stream replication mechanism is introduced. The so-called stream replication is that the standby server synchronizes the corresponding data from the primary server through the tcp stream. When the WAL records are generated, the primary server streams them to the standby server without waiting for the WAL files to be filled. By default, stream replication is asynchronous. In this case, there is a short delay between the submission of a transaction on the primary server and the change becoming visible on the standby server, but this delay is still much smaller than the file based log transfer method. On the premise that the capacity of the standby server meets the load, the delay is usually less than one second; In stream replication, the standby server has smaller data loss window than using file based log transfer, and archive is not required_ Timeout to reduce the data loss window; Starting from PostgreSQL 12, when configuring the primary and standby databases through stream replication, there is no need to configure additional recovery Conf file. Instead, configure a standby under the $PGDATA path of the standby database environment Signal file. Note that this file is an ordinary text file with empty content. It is understood that this file is an identification file. If the standby database passes PG_ If CTL promote is promoted to the main library, the file will disappear automatically.
Note: all operations are performed by the postgres user.
Configure master-slave stream replication and archiving
1) Two machines do secret free login
The archive we use during backup and restore_command and restore_ The command command is run by the postgres user, so we need to implement ssh password free login for the postgres user.
# Log in to the primary pgsql server with the postgres user ssh-keygen -t rsa # All the way back scp /home/postgres/.ssh/id_rsa.pub postgres@10.10.22.152:/home/postgres/.ssh/authorized_keys Or copy id_rsa.pub File to from pgsql On, and then execute the following command from cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys # Login to view permissions from pgsql server chmod 700 /home/postgres/.ssh chmod 600 /home/postgres/.ssh/authorized_keys # Test login ssh postgres@10.10.22.152 # From pgsql Library ssh-keygen -t rsa # All the way back scp /home/postgres/.ssh/id_rsa.pub postgres@10.10.22.151:/home/postgres/.ssh/authorized_keys # Login to view permissions from pgsql server chmod 700 /home/postgres/.ssh chmod 600 /home/postgres/.ssh/authorized_keys # Test login ssh postgres@10.10.22.152
2) Master library configuration
# Create user ceate user replica with replication login password 'replication'; alter user replica with password 'replication'; # Modify pg_hba.conf host replication replica 10.10.0.0/16 md5 # Modify profile $ vim /data/postgresql-12/data/postgresql.conf # Listen to all IP listen_addresses = "0.0.0.0" # The maximum number of connections. It is said that the slave needs to be greater than or equal to this value max_connections = 200 # Set the main pgsql as the host for generating wal. There was no hot since 9.6_ Standby (hot standby mode) wal_level = replica # Open continuous archiving archive_mode = on #Archive command- o "StrictHostKeyChecking no" is used to cancel the first connection. Enter yes or no archive_command = 'scp -o "StrictHostKeyChecking no" %p pgslave.ptcloud.t.home:/data/postgresql-12/archive/%f' # archive_command = 'test ! -f /data/postgresql-12/archive/%f && scp %p pgslave.ayunw.cn:/data/postgresql-12/archive/%f' archive_cleanup_command = '/usr/local/postgresql-12/bin/pg_archivecleanup -d /data/postgresql-12/data/pg_wal %r >> /data/postgresql-12/log/archive_cleanup.log 2>&1' # There are up to 16 stream replication connections. max_wal_senders = 16 # Set the maximum number of wal (xlog) files retained by the streaming service wal_keep_segments = 256 # Maximum process for heap cleanup autovacuum_max_workers = 2 max_worker_processes = 16 max_logical_replication_workers = 10 # log setting log_destination = 'stderr' logging_collector = on log_directory = '/data/postgresql-12/log' log_filename = 'postgresql-%w.log' log_file_mode = 0600 log_truncate_on_rotation = on log_rotation_age = 1d log_rotation_size = 1GB log_min_messages = error # sql statements executed for more than 300ms will be recorded in the log file of pgsql, which is similar to slow log # Generally, 300ms is set. The slow log will hit the pgsql log file to facilitate problem checking log_min_duration_statement = 300 log_checkpoints = on log_connections = on log_disconnections = on log_error_verbosity = verbose log_hostname = on log_line_prefix = '%m [%p] ' log_lock_waits = on log_statement = 'ddl' # After the main database is set, the root user needs to restart the PG service to make the above configuration take effect systemctl daemon-reload systemctl restart postgresql su - postgres psql # The master library creates a user replica with flow replication permission CREATE user replica login replication encrypted password 'replication'; postgres=# \du; List of roles Role name | Attributes | Member of -----------+------------------------------------------------------------+----------- postgres | Superuser, Create role, Create DB, Replication, Bypass RLS | {} replica | Replication | {} # The master database increases the trust access of master-slave replication (pg_hba.conf) $ vim /data/postgresql-12/data/pg_hba.conf # replication privilege. host replication replica 10.10.0.0/16 trust # Restart service systemctl daemon-reload systemctl restart postgresql systemctl status postgresql
3) Configure from library
# Stop from the library and delete the data from the pgsql data directory # Backup data directory mkdir -p /opt/pgsqldata_backup mv /data/postgresql-12/data/* /opt/pgsqldata_backup pg_ctl -D /data/postgresql-12/data -l logfile stop rm -rf /data/postgresql-12/data/*
4) Basic backup from library
Copying data from the master server to the slave server is called "basic backup"
su - postgres $ pg_basebackup -h 10.10.22.151 -p 5432 -U replica -W -R -Fp -Xs -Pv -D /data/postgresql-12/data/ Password: pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup: checkpoint completed pg_basebackup: write-ahead log start point: 0/8000028 on timeline 1 pg_basebackup: starting background WAL receiver pg_basebackup: created temporary replication slot "pg_basebackup_13370" 31384/31384 kB (100%), 1/1 tablespace pg_basebackup: write-ahead log end point: 0/8000100 pg_basebackup: waiting for background process to finish streaming ... pg_basebackup: syncing data to disk ... pg_basebackup: base backup completed
❝
Parameter Description:
-
-h address of the main database started
-
-p primary database port
-
-U stream replication user
-
-W use password authentication. Use the password of replica
-
-Fp backup outputs the normal database directory
-
-Xs uses stream replication for replication
-
-Pv outputs details of the replication process
-
-R create recovery for the standby database Conf file. However, this file is not required for pgsql versions after pgsql 10.
-
-D specifies the database directory of the created standby database
❞
5) Configure profiles from libraries
Note: at this time, from the PostgreSQL. Net under the database data directory The conf file is just synchronized from the main pgsql, not the pgsql configuration file. You need to take the old configuration file from the library for use.
$ cd /data/postgresql-12/data/ $ mv postgres.conf postgres.conf_master.bak $ cp /opt/pgsqldata_backup/postgres.conf postgres.conf $ vim /data/postgresql-12/data/postgres.conf # Listen to all IP listen_addresses = "0.0.0.0" # The maximum number of connections. The slave pgsql needs to be greater than or equal to the value of the master max_connections = 300 restore_command = 'cp /data/postgresql-12/archive/%f %p' archive_cleanup_command = '/usr/local/postgresql-12/bin/pg_archivecleanup -d /data/postgresql-12/data/pg_wal %r && /usr/local/postgresql-12/bin/pg_archivecleanup -d /data/postgresql-12/archive %r >> /data/postgresql-12/log/archive_cleanup.log 2>&1' # 9.6 no hot at first_ Standby (hot standby mode) wal_level = replica # There are up to 16 stream replication connections. max_wal_senders = 16 # The setting is larger than the main library and can be set to twice the value wal_keep_segments = 512 max_logical_replication_workers = 10 autovacuum_max_workers = 2 # Just keep consistent with the value of the master max_worker_processes = 16 # It shows that this machine is not only used for data archiving, but also for data query hot_standby = on #Maximum latency for streaming backups max_standby_streaming_delay = 30s # The interval between reporting local status to the host wal_receiver_status_interval = 10s # In case of replication error, feedback to the host hot_standby_feedback = on # log setting log_destination = 'stderr' logging_collector = on log_directory = '/data/postgresql-12/log' log_filename = 'postgresql-%w.log' log_file_mode = 0600 log_truncate_on_rotation = on log_rotation_age = 1d log_rotation_size = 1GB log_min_messages = error # sql statements executed for more than 300 ms will be recorded in the log file of pgsql log_min_duration_statement = 300 log_checkpoints = on log_connections = on log_disconnections = on log_error_verbosity = verbose log_hostname = on log_line_prefix = '%m [%p] ' log_lock_waits = on log_statement = 'ddl'
6) Restart slave Library
Ensure that the data directory of the slave database is the master and group of postgres, and the permission is 0700
su - postgres pg_ctl -D /data/postgresql-12/data restart
Verifying pgsql master-slave
# Log in to the main library su - postgres postgres=# psql postgres=# select client_addr,sync_state from pg_stat_replication; client_addr | sync_state --------------+------------ 10.10.22.152 | async (1 row) select pid, usename, application_name, client_addr, backend_start, client_port, state, sync_state from pg_stat_replication; pid | usename | application_name | client_addr | backend_start | client_port | state | sync_state -------+---------+------------------+-------------+-------------------------------+-------------+-----------+------------ 28356 | repl | walreceiver | 10.10.22.152 | 2021-12-30 17:00:59.357653+08 | 48660 | streaming | async (1 row)
As described above, 10.10.22.152 the server is a slave node that receives asynchronous stream replication
Here, the mainstream replication and archiving configuration is complete.
Official account: operation and development story
github: https://github.com/orgs/sunsharing-note/dashboard
Love life, love operation and maintenance
If you think the article is good, please click on the top right corner to send it to your friends or forward it to your circle of friends. Your support and encouragement is my greatest motivation. If you like, please pay attention to me~
Scanning QR code
Pay attention to me and maintain high-quality content from time to time
........................