Postgres Partition Table II

Postgres Partition Table II

1, Background

  in the previous article Postgres partition table I In, we mentioned that there are some differences in the way Postgres creates and uses partition tables after version 10.0. This article will mainly introduce the creation and use of partition tables after Postgres 10.0.
   after we created partitions before version 10.0, the partition table needs to be directly specified for data writing. If the write operation occurs on the main table, the data cannot be automatically stored in the corresponding partition table (triggers are not used). Even if triggers are used, the triggers need to be changed frequently for partition changes. To solve this problem, Youdashen proposed to identify the partition field and detect the partition when writing, so as to dynamically create the partition and write data to the partition table. However, the actual use found that the insertion performance was poor. The author tested the performance of the method of automatically creating partition tables by triggers. Inserting 1000w data into 10 partitions took more than 1200s. However, the partition tables created by the new version (10.0 and later versions) took 7s for the same data warehousing.

2, Partition table management

2.1 overview of zoning table

   creating partition tables in version 10.0 and later requires specifying partition keys. Partitions support range partition and List partition, and Hash partition is added after 11.0.
   range partition is used to partition the range of data according to the partition key. For example, our data is the daily granularity data updated every day. At this time, when we want to partition the data according to the month, we can use range partition and specify the minimum and maximum values of the partition field in the partition.
   List partition is partitioned according to the enumeration value of partition key. For example, our data is students' achievement information. At this time, it can be partitioned according to different grades, such as one partition for each grade, or multiple grades can be combined into one partition. For example, grade 1 and grade 2 can be placed in the same partition, At this time, the enumeration values of grade 1 and grade 2 can be stored in the enumeration values of the partition.
   hash partitions are partitioned according to the hash value of the partition key. The number of partitions needs to be set when creating the main table, and after the partition is created, it cannot be deleted or added.
   note that no matter which of the above partition methods, the partition must be created manually, and the default partition must be created in the version after 11.0. When the written data does not meet any manually created partition (non default partition), the data will be written to the default partition.

All partitions support direct writing to the corresponding partition (specify the sub partition table name) and do not specify the partition (specify the primary partition table name). When specifying the sub partition, pay attention to whether the value of the partition key is within the partition range of the sub partition.

2.2 create partition table

2.2.1 range zoning

Syntax:

-- Main table
CREATE TABLE table_name ( column_name data_type )
    PARTITION BY RANGE ( { column_name } [, ... ] )
 
-- Sub table
CREATE TABLE table_name
    PARTITION OF parent_table
FOR VALUES
    FROM ( { numeric_literal | string_literal | TRUE | FALSE | MINVALUE | MAXVALUE } [, ...] )
      TO ( { numeric_literal | string_literal | TRUE | FALSE | MINVALUE | MAXVALUE } [, ...] ) 

Example 1 (single field partition):

--Create master table test_range use date Field as partition key--
create table test_range(id int, date varchar(255)) partition by range ( date );
--To create a partition, you need to specify the value range of the partition key stored in the partition. The partition range includes the upper bound and does not include the lower bound. All partition tables should create a default partition to avoid writing failure when the value of the partition key is not included in any partition. After creating a default partition, the data that is not within the defined partition range will be written to the default partition--
create table test_range_default partition of test_range default;
create table test_range_202101 partition of test_range for values from ('2021-01-01') to ('2021-02-01');
create table test_range_202102 partition of test_range for values from ('2021-02-01') to ('2021-03-01');
--Write test data to the main table. Note: when writing data to the main table, the data will be automatically written to the corresponding partition according to the partition key. If the default partition in the previous step is not created, this clause is invalid insert The statement will fail because(3,'2021-03-01')Medium partition value 2021-03-01 Not in any partition--
insert into test_range(id,date)values (1,'2021-02-01'),(2,'2021-01-01'), (3,'2021-03-01');
--Write data to a single partition. Note: when writing to a single partition, ensure that the value corresponding to the partition key is within the range of the partition key written to the partition--
insert into test_range_202102(id,date)values (4,'2021-02-02');

Example 2 (multiple field partitions):

--Create master table test_range_2 use x,y Two fields as partition keys--
create table test_range_2(x int, y int) partition by range ( x,y );
--To create a partition, you need to specify the range of partition key values stored in the partition,The partition range is:((x IS NOT NULL) AND (y IS NOT NULL) AND ((x > x_min) OR ((x = x_min) AND (y >= y_min))) AND ((x < x_max) OR ((x = x_max) AND (y < y_max)))),All partition tables should create default partitions to avoid writing failure when the value of partition key is not included in any partition. After creating the default partition, the data not within the defined partition range will be written to the default partition--
create table test_range_2_default partition of test_range_2 default;
create table test_range_2_20101 partition of test_range_2 for values from (2010,1) to (2010,7);
create table test_range_2_20102 partition of test_range_2 for values from (2010,7) to (2010,12);
create table test_range_2_20111 partition of test_range_2 for values from (2011,1) to (2011,7);
create table test_range_2_20112 partition of test_range_2 for values from (2011,7) to (2011,12);
--Write test data to the main table
insert into test_range_2(x,y)values (2010,1),(2010,7), (2011,1),(2011,8);

2.2.2 List partition

Syntax:

-- Main table
CREATE TABLE table_name ( column_name data_type )
    PARTITION BY LIST ( { column_name } )
-- Sub table
CREATE TABLE table_name
    PARTITION OF parent_table
FOR VALUES
    IN ( { numeric_literal | string_literal | TRUE | FALSE | NULL } [, ...] )

Example:

--Create master table to date Fields as partitions--
create table test_list(id int, date varchar(255)) partition by list( date );
--Create a default partition. The default partition is used as the default partition. If the partition key of the written data fails to match the partition range of the created partition, the data will be written to the default partition--
create table test_list_default partition of test_list default;
create table test_list_20210101 partition of test_list for values in ('2021-01-01'); 
create table test_list_20210102 partition of test_list for values in ('2021-01-02');
--Write data--
insert into test_list(id,date)values (1,'2021-01-01'),(2,'2021-01-02'), (3,'2021-03-01');

2.2.3 Hash partition

Syntax:

-- Main table
CREATE TABLE table_name ( column_name data_type )
    PARTITION BY HASH ( { column_name } [, ... ] )
-- Sub table
CREATE TABLE table_name
    PARTITION OF parent_table 
FOR VALUES
    WITH ( MODULUS numeric_literal, REMAINDER numeric_literal )

Hash partitions do not support default partitions, and the maximum partition = modulus value. The modulus of each partition should be consistent

Example:

--Create master table to date Fields as partitions--
create table test_hash(id int, date varchar(255)) partition by hash( id );
--Create partition
create table test_hash_0 partition of test_hash for values with(modulus 4, remainder 0);
create table test_hash_1 partition of test_hash for values with(modulus 4, remainder 1);
create table test_hash_2 partition of test_hash for values with(modulus 4, remainder 2);
create table test_hash_3 partition of test_hash for values with(modulus 4, remainder 3);
--report errors: remainder for hash partition must be less than modulus-
create table test_hash_4 partition of test_hash for values with(modulus 4, remainder 4);
--Write data--
insert into test_hash(id,date)values (1,'2021-01-01'),(2,'2021-01-02'), (3,'2021-03-01');

2.3 zoning management (ATTACH/DETACH)

2.3.1 DETACH (disassociate the primary table from the partition table)

Syntax:

ALTER TABLE name DETACH PARTITION partition_name

Example:

ALTER TABLE test_hash DETACH PARTITION test_hash_3

2.3.1 ATTACH (associate master table and partition)

Syntax:

ALTER TABLE name ATTACH PARTITION partition_name { FOR VALUES partition_bound_spec | DEFAULT }

Example:

--Create partition--
create table test_hash_attach (date varchar(255));
--Associate partition to main table--
alter table test_hash attach partition test_hash_attach for values with (modulus 4, remainder 3);

2.4 precautions

  1. Only the range partition and the list partition support the default partition, while the hash partition does not support the default partition. The default partition is applicable to avoid writing failure when the corresponding partition cannot be found according to the partition key. In actual use, the reason why the corresponding partition cannot be found should be investigated.
  2. The partition table can also be used as the main table, that is, the partition can also be divided into level 1 partition, level 2 partition, level 3 partition... And each partition can be used as the main table to create sub partitions under it
  3. Partitioned primary tables can have indexes
  4. Partition master tables can create unique constraints and need to contain partition keys
  5. The partition master table can create foreign keys, but cannot create foreign keys by referencing the partition table
  6. Partition master table can create FOR EACH ROW trigger
  7. When the execution result of the update statement modifies the partition key of the data row, if the partition key of the data row still meets the constraints of the current partition, the number of records in the original partition of the data row will not change. If the partition key of the data row does not meet the constraints of the current partition, it will be removed from the original partition and written into the partition whose partition constraints meet the updated partition key

Keywords: Database PostgreSQL

Added by lordphate on Sat, 05 Mar 2022 01:19:37 +0200