pg_basebackup backup and recovery

1,pg_basebackup

1.1,pg_basebackup principle

Official document: pg_basebackup (postgres.cn)

1) Create a checkpoint, open FPW (full page writes), and create a backup label (store checkpoint location, time and other information)

2) Establish a connection with the database through the stream replication protocol, and the WAL Sender process sends a message to pg_basebackup sends database physical files

3)pg_basebackup writes to the target location (compressed or uncompressed) after receiving the file

be careful:

1. The connection must be made by a superuser or one with REPLICATION permission( Section 21.2 )User creation.

2. And pg_hba.conf must explicitly allow this replication connection.

3. The server must also max_wal_senders Set it high enough to allow at least one session for backup and one for WAL streams if streams are used.

4,pg_basebackup: from the beginning of backup to the end of backup, wal logs will be backed up together; dml during backup can also be restored together.

5. In the process of copying, the copy slot at zero time will be used, so ensure max_replication_slots are enough.

pg_basebackup can create a basic backup not only from the master computer but also from the backup computer. To get a backup from the backup machine, set the backup machine so that it can accept replication connections (that is, set max_wal_senders and hot_standby , and configure Host based authentication ). You will also need to enable it on the master full_page_writes.

Note that there are some limitations in the online backup from the backup machine:

  • Backup history files are not created in the database cluster being backed up.
  • If - X none is being used, it is not guaranteed that all WAL files required for backup will be archived at the end of backup.
  • If the backup machine is promoted to the master during the online backup, the backup will fail.
  • All WAL records required for backup must contain enough full page writes, which requires you to enable full on the host computer_ page_ Writes and does not use a similar PG_ The tool of compresslog is archive_command removes full page writes from the WAL file.

1.2 grammatical interpretation

[paas-sotc-telepgtest-002][telepg][/app/backup]$pg_basebackup --help
pg_basebackup takes a base backup of a running PostgreSQL server.

Usage:
  pg_basebackup [OPTION]...

Options controlling the output:
  -D, --pgdata=DIRECTORY receive base backup into directory
  -F, --format=p|t       output format (plain (default), tar)
  -r, --max-rate=RATE    maximum transfer rate to transfer data directory
                         (in kB/s, or use suffix "k" or "M")
  -R, --write-recovery-conf
                         write configuration for replication
  -T, --tablespace-mapping=OLDDIR=NEWDIR
                         relocate tablespace in OLDDIR to NEWDIR
      --waldir=WALDIR    location for the write-ahead log directory
  -X, --wal-method=none|fetch|stream
                         include required WAL files with specified method
  -z, --gzip             compress tar output
  -Z, --compress=0-9     compress tar output with given compression level

General options:
  -c, --checkpoint=fast|spread
                         set fast or spread checkpointing
  -C, --create-slot      create replication slot
  -l, --label=LABEL      set backup label
  -n, --no-clean         do not clean up after errors
  -N, --no-sync          do not wait for changes to be written safely to disk
  -P, --progress         show progress information
  -S, --slot=SLOTNAME    replication slot to use
  -v, --verbose          output verbose messages
  -V, --version          output version information, then exit
      --no-slot          prevent creation of temporary replication slot
      --no-verify-checksums
                         do not verify checksums
  -?, --help             show this help, then exit

Connection options:
  -d, --dbname=CONNSTR   connection string
  -h, --host=HOSTNAME    database server host or socket directory
  -p, --port=PORT        database server port number
  -s, --status-interval=INTERVAL
                         time between status packets sent to server (in seconds)
  -U, --username=NAME    connect as specified database user
  -w, --no-password      never prompt for password
  -W, --password         force password prompt (should happen automatically)

Report bugs to <pgsql-bugs@lists.postgresql.org>.
[paas-sotc-telepgtest-002][telepg][/app/backup]$
  • F: t tar format output; p plain output
  • z: The output tar backup is gzip compressed
  • P: Real time print backup progress
  • v: Detailed mode. When - P is used, the information of which specific file is being backed up will also be printed
  • l: Specify an ID label for the backup
  • R: write configuration for replication
  • 10: Backup mode, serial and parallel
    • f: fetch serial copy. After data is copied, copy the wal log. If tar format is used, the pre written log file will be written to base Tar file.
    • s: stream parallel replication, synchronous replication of data and wal logs. If tar format is used, the pre written log file is written to a separate file named pg_wal.tar file; This value is the default.

1.3,pg_basebackup for backup

1.3.1 setting pg_hba.conf

Set pg_hba.conf, which allows the client machine to initiate the stream replication link

host    replication     all             0.0.0.0/0               md5

1.3.2 user settings for backup

Backup users need superuser or replication

must be superuser or replication role to start walsender

create role backup REPLICATION LOGIN PASSWORD 'backup';

1.3.3. tar package of backup Library

1. First create database and table in the source library

postgres=# create database test_rhl1;
CREATE DATABASE
postgres=# \c test_rhl1;
You are now connected to database "test_rhl1" as user "postgres".
test_rhl1=# create table tb1 (id int,name varchar(20));
CREATE TABLE
test_rhl1=# insert into tb1 values (1,'asd'),(2,'qwe'),(3,'ert');
INSERT 0 3
test_rhl1=# select * from tb1;
 id | name 
----+------
  1 | asd
  2 | qwe
  3 | ert
(3 rows)

2. Backup into tar package

[paas-sotc-telepgtest-002][telepg][/app/backup]$pg_basebackup -h134.108.68.72 -p5432 -Ubackup  -D /app/backup -Xs -F t -z -v -P -l pg_backup
Password: 
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/DD000028 on timeline 2
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_128957"
12652680/12652680 kB (100%), 1/1 tablespace                                         
pg_basebackup: write-ahead log end point: 0/DD000328
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: syncing data to disk ...
pg_basebackup: base backup completed

Because the - Xs stream mode is used and the tar format is used, the pre written log file is written to a separate file named pg_wal.tar file

[paas-sotc-telepgtest-002][telepg][/app/backup]$ll
total 777148
-rw------- 1 telepg telepg 795776386 Feb 11 15:02 base.tar.gz
-rw------- 1 telepg telepg     18542 Feb 11 15:02 pg_wal.tar.gz
[paas-sotc-telepgtest-002][telepg][/app/backup]$

3. When the backup is not completed, perform dml in the original database and check whether it will also be backed up after recovery

test_rhl1=# insert into tb1 values(4,'asdasdsd');
INSERT 0 1
test_rhl1=# insert into tb1 values(5,'asdsd');   
INSERT 0 1
test_rhl1=# select * from tb1;
 id |   name   
----+----------
  1 | asd
  2 | qwe
  3 | ert
  4 | asdasdsd
  5 | asdsd
(5 rows)

1.3.4 backup into slave database

pg_basebackup -h 134.108.68.72 -p 5432 -U repl -l bk20210727 -F p -P -R -D /app/telepg/pg_data

1.4. Database recovery

1.4.1. Create a restored directory

mkdir -p  /app/telepg/pg_data_new

1.4.2 unzip the backup file to the restored directory

1. Unzip base tar. GZ package

[paas-sotc-telepgtest-002][telepg][/app/backup]$ll
total 777148
-rw------- 1 telepg telepg 795776386 Feb 11 15:02 base.tar.gz
-rw------- 1 telepg telepg     18542 Feb 11 15:02 pg_wal.tar.gz
[paas-sotc-telepgtest-002][telepg][/app/backup]$tar -xvf base.tar.gz  -C /app/telepg/pg_data_new
  • View the recovered directory
[paas-sotc-telepgtest-002][telepg][/app/backup]$cd /app/telepg/pg_data_new
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$ll
total 64
-rw------- 1 telepg telepg   210 Feb 11 14:59 backup_label
drwx------ 7 telepg telepg    67 Feb 11 14:52 base
-rw------- 1 telepg telepg    30 Feb 11 00:00 current_logfiles
drwx------ 2 telepg telepg  4096 Feb 11 15:28 global
drwx------ 2 telepg telepg  4096 Jul 29  2021 log
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_commit_ts
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_dynshmem
-rw------- 1 telepg telepg  4826 Jun 15  2021 pg_hba.conf
-rw------- 1 telepg telepg  1636 Jun 15  2021 pg_ident.conf
drwx------ 4 telepg telepg    68 Feb 11 14:59 pg_logical
drwx------ 4 telepg telepg    36 Jun 15  2021 pg_multixact
drwx------ 2 telepg telepg     6 Nov  1 15:26 pg_notify
drwx------ 2 telepg telepg     6 Feb 11 14:59 pg_replslot
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_serial
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_snapshots
drwx------ 2 telepg telepg     6 Nov  1 15:26 pg_stat
drwx------ 2 telepg telepg     6 Feb 11 14:59 pg_stat_tmp
drwx------ 2 telepg telepg     6 Aug 19 17:51 pg_subtrans
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_tblspc
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_twophase
-rw------- 1 telepg telepg     3 Jun 15  2021 PG_VERSION
drwx------ 3 telepg telepg    28 Feb 11 15:27 pg_wal
drwx------ 2 telepg telepg    90 Aug 19 17:30 pg_xact
-rw------- 1 telepg telepg    88 Jun 15  2021 postgresql.auto.conf
-rw------- 1 telepg telepg 26814 Jul 28  2021 postgresql.conf
-rw------- 1 telepg telepg     0 Feb 11 14:59 tablespace_map
  • Check the wal log. It's empty
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$cd pg_wal/
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new/pg_wal]$ll
total 0
drwx------ 2 telepg telepg 6 Feb 11 14:59 archive_status
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new/pg_wal]$cd archive_status/
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new/pg_wal/archive_status]$ll
total 0
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new/pg_wal/archive_status]$
  • One more backup_label
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$cat backup_label 
START WAL LOCATION: 0/DD000028 (file 0000000200000000000000DD)
CHECKPOINT LOCATION: 0/DD000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2022-02-11 14:59:18 CST
LABEL: pg_backup
START TIMELINE: 2

2. Unzip pg_wal.tar.gz package

[paas-sotc-telepgtest-002][telepg][/app/backup]$mkdir wal_back
[paas-sotc-telepgtest-002][telepg][/app/backup]$tar -xvf pg_wal.tar.gz  -C /app/backup/wal_back/
00000002.history
archive_status/00000002.history.done
0000000200000000000000DD
[paas-sotc-telepgtest-002][telepg][/app/backup]$cd wal_back/
[paas-sotc-telepgtest-002][telepg][/app/backup/wal_back]$ll
total 16388
-rw------- 1 telepg telepg 16777216 Feb 11 14:59 0000000200000000000000DD
-rw------- 1 telepg telepg       41 Feb 11 14:59 00000002.history
drwxr-x--- 2 telepg telepg       35 Feb 11 15:33 archive_status

1.4.3. Set restore_command

  • restore_command
    • It tells PostgreSQL how to get archived WAL file segments. And archive_ Similar to command, this is also a shell command string. It can contain% f (to be replaced by the desired log file name) and% p (to be replaced by the destination pathname where the log file is copied). (the pathname is relative to the current working directory, that is, the data directory of the cluster). If you need to embed a real% character in the command, you can write it as%%.

Modify profile

port = 54321 ---The configuration of this 5432 port has been modified
restore_command = 'cp /app/backup/wal_back/%f %p'

1.4.4 start the standby warehouse

1. Direct start error

pg_ctl start -D /app/telepg/pg_data_new
waiting for server to start....2022-02-11 15:39:26.607 CST [131871] LOG:  starting PostgreSQL 12.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2022-02-11 15:39:26.607 CST [131871] LOG:  listening on IPv4 address "0.0.0.0", port 54321
2022-02-11 15:39:26.608 CST [131871] LOG:  listening on Unix socket "/tmp/.s.PGSQL.54321"
2022-02-11 15:39:26.619 CST [131871] LOG:  redirecting log output to logging collector process
2022-02-11 15:39:26.619 CST [131871] HINT:  Future log output will appear in directory "log".
 stopped waiting
pg_ctl: could not start server
Examine the log output.

journal:

2022-02-11 15:39:26.703 CST,,,131873,,620612ae.20321,3,,2022-02-11 15:39:26 CST,,0,FATAL,XX000,"could not locate required checkpoint record",,"If you are restoring from a backup, touch ""/app/telepg/pg_data_new/recovery.signal"" and add required recovery options.
If you are not restoring from a backup, try removing the file ""/app/telepg/pg_data_new/backup_label"".
Be careful: removing ""/app/telepg/pg_data_new/backup_label"" will result in a corrupt cluster if restoring from a backup.",,,,,,,""

2. Follow the prompts to create a recovery signal

touch /app/telepg/pg_data_new/recovery.signal

3. Start the database again and start it successfully

[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$pg_ctl start -D /app/telepg/pg_data_new
waiting for server to start....2022-02-11 15:46:10.219 CST [132432] LOG:  starting PostgreSQL 12.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2022-02-11 15:46:10.219 CST [132432] LOG:  listening on IPv4 address "0.0.0.0", port 54321
2022-02-11 15:46:10.221 CST [132432] LOG:  listening on Unix socket "/tmp/.s.PGSQL.54321"
2022-02-11 15:46:10.234 CST [132432] LOG:  redirecting log output to logging collector process
2022-02-11 15:46:10.234 CST [132432] HINT:  Future log output will appear in directory "log".
 done
server started
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$ps -ef|grep postgres
telepg   129080 128997  0 15:00 pts/2    00:00:00 psql -Upostgres
telepg   129091 349224  0 15:00 ?        00:00:00 postgres: postgres test_rhl1 [local] idle
telepg   132432      1  0 15:46 ?        00:00:00 /app/telepg/pg12.5/bin/postgres -D /app/telepg/pg_data_new
telepg   132433 132432  0 15:46 ?        00:00:00 postgres: logger   
telepg   132438 132432  0 15:46 ?        00:00:00 postgres: checkpointer   
telepg   132439 132432  0 15:46 ?        00:00:00 postgres: background writer   
telepg   132441 132432  0 15:46 ?        00:00:00 postgres: stats collector   
telepg   132445 132432  0 15:46 ?        00:00:00 postgres: walwriter   
telepg   132446 132432  0 15:46 ?        00:00:00 postgres: autovacuum launcher   
telepg   132447 132432  0 15:46 ?        00:00:00 postgres: archiver   last was 00000003.history
telepg   132448 132432  0 15:46 ?        00:00:00 postgres: logical replication launcher   
telepg   132462 130738  0 15:46 pts/0    00:00:00 grep --color=auto postgres
telepg   349224      1  0  2021 ?        00:01:47 /app/telepg/pg12.5/bin/postgres -D /app/telepg/pg12.5_data
telepg   349225 349224  0  2021 ?        00:00:00 postgres: logger   
telepg   349227 349224  0  2021 ?        00:00:04 postgres: checkpointer   
telepg   349228 349224  0  2021 ?        00:02:08 postgres: background writer   
telepg   349229 349224  0  2021 ?        00:02:13 postgres: walwriter   
telepg   349230 349224  0  2021 ?        00:02:34 postgres: autovacuum launcher   
telepg   349231 349224  0  2021 ?        00:00:32 postgres: archiver   last was 0000000200000000000000DD.00000028.backup
telepg   349232 349224  0  2021 ?        00:03:53 postgres: stats collector   
telepg   349233 349224  0  2021 ?        00:00:07 postgres: logical replication launcher   
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$

4. Check whether the dml during backup is effective

Result: it was found that the operations during the backup were also successfully restored

[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$psql -p54321 -Upostgres
psql (12.5)
Type "help" for help.

postgres=# \c test_rhl1
You are now connected to database "test_rhl1" as user "postgres".
test_rhl1=# \l
test_rhl1=# select * from tb1;
 id |   name   
----+----------
  1 | asd
  2 | qwe
  3 | ert
  4 | asdasdsd ---During backup insert
  5 | asdsd    ---During backup insert
(5 rows)

5. View directory changes after successful startup

  • backup_ There are too many labels old, but the content has not changed
  • More postmaster opts
  • More postmaster pid
  • recovery. Automatic deletion of signal
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$ll
total 72
-rw------- 1 telepg telepg   210 Feb 11 14:59 backup_label.old
drwx------ 7 telepg telepg    67 Feb 11 14:52 base
-rw-r----- 1 telepg telepg    30 Feb 11 15:46 current_logfiles
drwx------ 2 telepg telepg  4096 Feb 11 15:48 global
drwx------ 2 telepg telepg  4096 Feb 11 15:41 log
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_commit_ts
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_dynshmem
-rw------- 1 telepg telepg  4826 Jun 15  2021 pg_hba.conf
-rw------- 1 telepg telepg  1636 Jun 15  2021 pg_ident.conf
drwx------ 4 telepg telepg    68 Feb 11 15:46 pg_logical
drwx------ 4 telepg telepg    36 Jun 15  2021 pg_multixact
drwx------ 2 telepg telepg    18 Feb 11 15:46 pg_notify
drwx------ 2 telepg telepg     6 Feb 11 14:59 pg_replslot
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_serial
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_snapshots
drwx------ 2 telepg telepg     6 Nov  1 15:26 pg_stat
drwx------ 2 telepg telepg    84 Feb 11 15:50 pg_stat_tmp
drwx------ 2 telepg telepg    18 Feb 11 15:46 pg_subtrans
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_tblspc
drwx------ 2 telepg telepg     6 Jun 15  2021 pg_twophase
-rw------- 1 telepg telepg     3 Jun 15  2021 PG_VERSION
drwx------ 3 telepg telepg   140 Feb 11 15:46 pg_wal
drwx------ 2 telepg telepg    90 Aug 19 17:30 pg_xact
-rw------- 1 telepg telepg    88 Jun 15  2021 postgresql.auto.conf
-rw------- 1 telepg telepg 26843 Feb 11 15:38 postgresql.conf
-rw-r----- 1 telepg telepg    63 Feb 11 15:46 postmaster.opts
-rw-r----- 1 telepg telepg    90 Feb 11 15:46 postmaster.pid
-rw------- 1 telepg telepg     0 Feb 11 14:59 tablespace_map.old
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$cat backup_label.old 
START WAL LOCATION: 0/DD000028 (file 0000000200000000000000DD)
CHECKPOINT LOCATION: 0/DD000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2022-02-11 14:59:18 CST
LABEL: pg_backup
START TIMELINE: 2
[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$cat postmaster.opts
/app/telepg/pg12.5/bin/postgres "-D" "/app/telepg/pg_data_new"

[paas-sotc-telepgtest-002][telepg][/app/telepg/pg_data_new]$cat postmaster.pid
132432
/app/telepg/pg_data_new
1644565570
54321
/tmp
0.0.0.0
 54321001        23
ready   

6. View the log of the database during startup

  1. starting archive recovery and restored log file will be performed first
  2. Wait for the archive recovery complet, and then you can start the library
2022-02-11 15:46:10.234 CST,,,132432,,62061442.20550,1,,2022-02-11 15:46:10 CST,,0,LOG,00000,"ending log output to stderr",,"Future log output will go to log destination ""csvlog"".",,,,,,,""
2022-02-11 15:46:10.236 CST,,,132434,,62061442.20552,1,,2022-02-11 15:46:10 CST,,0,LOG,00000,"database system was interrupted; last known up at 2022-02-11 14:59:18 CST",,,,,,,,,""
2022-02-11 15:46:10.306 CST,,,132434,,62061442.20552,2,,2022-02-11 15:46:10 CST,,0,LOG,00000,"starting archive recovery",,,,,,,,,""
2022-02-11 15:46:10.309 CST,,,132434,,62061442.20552,3,,2022-02-11 15:46:10 CST,,0,LOG,00000,"restored log file ""00000002.history"" from archive",,,,,,,,,""
2022-02-11 15:46:10.331 CST,,,132434,,62061442.20552,4,,2022-02-11 15:46:10 CST,,0,LOG,00000,"restored log file ""0000000200000000000000DD"" from archive",,,,,,,,,""
2022-02-11 15:46:10.379 CST,,,132434,,62061442.20552,5,,2022-02-11 15:46:10 CST,1/0,0,LOG,00000,"redo starts at 0/DD000028",,,,,,,,,""
2022-02-11 15:46:10.380 CST,,,132434,,62061442.20552,6,,2022-02-11 15:46:10 CST,1/0,0,LOG,00000,"consistent recovery state reached at 0/DD000328",,,,,,,,,""
2022-02-11 15:46:10.380 CST,,,132432,,62061442.20550,2,,2022-02-11 15:46:10 CST,,0,LOG,00000,"database system is ready to accept read only connections",,,,,,,,,""
2022-02-11 15:46:10.384 CST,,,132434,,62061442.20552,7,,2022-02-11 15:46:10 CST,1/0,0,LOG,00000,"redo done at 0/DD000328",,,,,,,,,""
2022-02-11 15:46:10.384 CST,,,132434,,62061442.20552,8,,2022-02-11 15:46:10 CST,1/0,0,LOG,00000,"last completed transaction was at log time 2022-02-11 15:01:00.25688+08",,,,,,,,,""
2022-02-11 15:46:10.402 CST,,,132434,,62061442.20552,9,,2022-02-11 15:46:10 CST,1/0,0,LOG,00000,"restored log file ""0000000200000000000000DD"" from archive",,,,,,,,,""
2022-02-11 15:46:10.446 CST,,,132434,,62061442.20552,10,,2022-02-11 15:46:10 CST,1/0,0,LOG,00000,"selected new timeline ID: 3",,,,,,,,,""
2022-02-11 15:46:10.501 CST,,,132434,,62061442.20552,11,,2022-02-11 15:46:10 CST,1/0,0,LOG,00000,"archive recovery complete",,,,,,,,,""
2022-02-11 15:46:10.505 CST,,,132434,,62061442.20552,12,,2022-02-11 15:46:10 CST,1/0,0,LOG,00000,"restored log file ""00000002.history"" from archive",,,,,,,,,""
2022-02-11 15:46:10.527 CST,,,132432,,62061442.20550,3,,2022-02-11 15:46:10 CST,,0,LOG,00000,"database system is ready to accept connections",,,,,,,,,""

Keywords: Database PostgreSQL server

Added by bguzel on Mon, 14 Feb 2022 03:52:09 +0200