PostgreSQL mistakenly deleted PG_ filenode. What about map

Today, I saw someone on the Internet asking me to delete PG by mistake_ filenode. How to restore or rebuild a map? Let's learn about PG before solving this problem_ filenode. Map file.

For each table in PostgreSQL, there are files related to it on disk, and the name of these files is relfilenode. We can use pg_class to query the relfilenode field.

However, for some special tables, we will find that the corresponding field is 0. The official document explains that 0 means that this is a "mapping" relationship, and its disk file name depends on the low-level state. So which tables have the relfilenode field of 0? How to find the files corresponding to the table with relfilenode 0?

We all know that for an ordinary table, the relfilenode and oid are the same by default, for example:

bill=# create table t(id int);
CREATE TABLE
bill=# select oid,relname,relfilenode from pg_class where relname = 't';
  oid  | relname | relfilenode
-------+---------+-------------
 24919 | t       |       24919
(1 row)

However, when we perform operations on the table, such as vacuum full and truncate, the relfilenode will change:

bill=# vacuum FULL t;
VACUUM
bill=# select oid,relname,relfilenode from pg_class where relname = 't';
  oid  | relname | relfilenode
-------+---------+-------------
 24919 | t       |       24922
(1 row)

Let's look at the tables with relfilenode 0:

bill=# select oid, relname, relfilenode,reltablespace
bill-# from pg_class
bill-# where relfilenode = 0 and relkind = 'r'
bill-# order by reltablespace;
 oid  |        relname        | relfilenode | reltablespace
------+-----------------------+-------------+---------------
 1247 | pg_type               |           0 |             0
 1249 | pg_attribute          |           0 |             0
 1255 | pg_proc               |           0 |             0
 1259 | pg_class              |           0 |             0
 3592 | pg_shseclabel         |           0 |          1664
 1262 | pg_database           |           0 |          1664
 2964 | pg_db_role_setting    |           0 |          1664
 1213 | pg_tablespace         |           0 |          1664
 1261 | pg_auth_members       |           0 |          1664
 1214 | pg_shdepend           |           0 |          1664
 2396 | pg_shdescription      |           0 |          1664
 1260 | pg_authid             |           0 |          1664
 6100 | pg_subscription       |           0 |          1664
 6000 | pg_replication_origin |           0 |          1664
(14 rows)

Through the reltablespace field, we can also find that these tables are divided into two categories: one is pg_type,pg_attribute,pg_proc and pg_class, which are shared tables. In the kernel, we call them Nail tables. The other is the one with reltablespace of 1664, that is, in PG_ Shared tables in the global tablespace.

Why is the relfilenode of these system tables 0? Because we don't want to query the system tables that are accessed frequently from other system tables every time, so the performance will be very low. They are through pg_filenode.map file to manage.

In PG_ filenode. In the map file, map the oid of these system tables with relfileno, and the size of this file is 512, which is just the size of an OS disk sector. At the same time, PG has done the alignment processing, and corresponds to it with the RelMapFile structure in the source code. Structure size: 628 + 44 = 496 + 16 = 512. In other words, this file can store up to 62 records of the system catalog table.

Due to the importance of this file, it is just aligned with the size of the disk sector to reduce the probability of file crash.

typedef struct RelMapping
{
	Oid			mapoid;			/* OID of a catalog */
	Oid			mapfilenode;	/* its filenode number */
} RelMapping;

typedef struct RelMapFile
{
	int32		magic;			/* always RELMAPPER_FILEMAGIC */
	int32		num_mappings;	/* number of valid RelMapping entries */
	RelMapping	mappings[MAX_MAPPINGS];
	pg_crc32c	crc;			/* CRC of all above */
	int32		pad;			/* to make the struct size be 512 exactly */
} RelMapFile;

Next, let's take a look at the contents stored in this file:

As shown in the figure above, the file can be divided into four parts.
Part I: 2717 0059, which represents the magic data word of the file header.

#define RELMAPPER_FILEMAGIC		0x592717	/* version ID value */

Part II: 0011 0000, indicating the number of mapping objects contained in the file. We can verify through the following SQL:
It's just 17 rows of data, which is consistent with 0011 above.

bill=# select relname,relfilenode from pg_class where pg_relation_filepath(oid) like 'base/16385/%' and relfilenode = 0;
              relname              | relfilenode
-----------------------------------+-------------
 pg_toast_1255                     |           0
 pg_toast_1255_index               |           0
 pg_proc_oid_index                 |           0
 pg_proc_proname_args_nsp_index    |           0
 pg_type_oid_index                 |           0
 pg_type_typname_nsp_index         |           0
 pg_attribute_relid_attnam_index   |           0
 pg_attribute_relid_attnum_index   |           0
 pg_class_oid_index                |           0
 pg_class_relname_nsp_index        |           0
 pg_class_tblspc_relfilenode_index |           0
 pg_attribute                      |           0
 pg_proc                           |           0
 pg_type                           |           0
 pg_toast_1247                     |           0
 pg_toast_1247_index               |           0
 pg_class                          |           0
(17 rows)

The third part: 04eb 0000 4095 0000, which is the actual mapping relationship 04eb, that is, the oid of the object, 4095 represents the relfilenode of the object.

Part IV: ebfa f3a4, the verification value of the end of the file, which is calculated by the crc32 algorithm for the file name:

/* verify the CRC */
INIT_CRC32C(crc);
COMP_CRC32C(crc, (char *) map, offsetof(RelMapFile, crc));
FIN_CRC32C(crc);

if (!EQ_CRC32C(crc, map->crc))
ereport(FATAL,
(errmsg("relation mapping file "%s" contains incorrect checksum",
mapfilename)));

End of introduction pg_filenode.map file let's answer the previous question. What if the file is deleted by mistake?

If the file is deleted from a database, the library cannot be connected:
pg14@vm-192-168-204-153-> psql bill bill
psql: error: connection to server on socket "/home/pg14/pgdata/.s.PGSQL.2022" failed: FATAL: could not open file "base/16385/pg_filenode.map": No such file or directory

However, it will not affect the connection of other libraries:
pg14@vm-192-168-204-153-> psql postgres
psql (14.1)
Type "help" for help.

postgres=#

If the file in the global directory is deleted by mistake, all libraries cannot be connected:
pg14@vm-192-168-204-153-> psql postgres
psql: error: connection to server on socket "/home/pg14/pgdata/.s.PGSQL.2022" failed: FATAL: could not open file "global/pg_filenode.map": No such file or directory

pg14@vm-192-168-204-153-> psql bill
psql: error: connection to server on socket "/home/pg14/pgdata/.s.PGSQL.2022" failed: FATAL: could not open file "global/pg_filenode.map": No such file or directory

So we can know that when PG_ filenode. After the map file is damaged or deleted by mistake, at least the corresponding library must not be connected, so we can't create the file manually.

However, generally speaking, the file of most libraries is the same, and the file is rarely modified unless you perform vacuum full operations on these system tables. And if you are really unlucky that the library has done similar operations, what should you do? You have to copy a file from other libraries first, and then pg_filedump PG of this library_ Class data file, look at the information of the corresponding system table, and then pay attention not to modify PG directly_ filenode. Map file, but to modify the file name of the data file to meet PG_ filenode. Mapping relationship in map.

Why? As mentioned earlier, there is a check value calculated according to the file name at the end of the file. If you modify PG directly_ filenode. Map file, you will be prompted that the check value is incorrect:
psql: error: connection to server on socket "/home/pg14/pgdata/.s.PGSQL.2022" failed: FATAL: relation mapping file "base/16385/pg_filenode.map" contains incorrect checksum

Summary:
Deleted by mistake PG_ filenode. What about map? First, if you are lucky, copy the file from another library. If it is available, you can use it directly.

If the system table in the file in your library has done operations such as vacuum full, you need to pass a similar PG_ Go to PG by filedump_ Class, and then manually modify the name of the relevant data file to meet PG_ filenode. Mapping relationship in map.

Of course, this method will be very troublesome, so be careful not to delete your PG by mistake_ filenode. map!

Keywords: Database PostgreSQL

Added by chadowch on Tue, 18 Jan 2022 18:16:07 +0200