profile analysis of database from redo log

Label

PostgreSQL, pg_xlogdump, database profile

background

It is convenient to view TOP SQL in PostgreSQL, as follows:

PostgreSQL How to Find TOP SQL (e.g. IO-consumed SQL)

From another point of view, such as REDO files, we can also analyze the operation type of the database in each period, analyze the resource consumption of the database, and put forward optimization suggestions.

What does the redo log look like?

pg_xlogdump

To view the redo log, you have to mention pg_xlogdump. The principle and introduction are as follows

"PostgreSQL xlog dump - pg_xlog dump source code explanation"

Let's briefly look at the contents of a REDO file:

pg_xlogdump 000000050000159E00000095 000000050000159E00000099|less  
  
rmgr: Btree       len (rec/tot):     42/    74, tx:  891772656, lsn: 159E/95000CA0, prev 159E/94FFEC78, bkp: 0000, desc: insert: rel 1663/56867/10216768; tid 19180/150  
rmgr: Btree       len (rec/tot):     34/    66, tx:  891772656, lsn: 159E/95000CF0, prev 159E/95000CA0, bkp: 0000, desc: insert: rel 1663/56867/10906168; tid 150019/337  
rmgr: Transaction len (rec/tot):     12/    44, tx:  891772656, lsn: 159E/95000D38, prev 159E/95000CF0, bkp: 0000, desc: commit: 2017-05-04 13:15:50.262106 CST  
rmgr: Heap2       len (rec/tot):     26/    58, tx:          0, lsn: 159E/95000D68, prev 159E/95000D38, bkp: 0000, desc: clean: rel 1663/56867/10216762; blk 402406 remxid 891772032  
rmgr: Heap        len (rec/tot):     25/  8065, tx:  891772657, lsn: 159E/95000DA8, prev 159E/95000D68, bkp: 1000, desc: lock 891772657: rel 1663/56867/10216770; tid 39310/2 LOCK_ONLY EXCL_LOCK   
rmgr: Heap        len (rec/tot):     36/  8116, tx:  891772657, lsn: 159E/95002D48, prev 159E/95000DA8, bkp: 0100, desc: update: rel 1663/56867/10216770; tid 39310/2 xmax 891772657 ; new tid 122748/1 xmax 0  
rmgr: Btree       len (rec/tot):     18/  5238, tx:  891772657, lsn: 159E/95004D18, prev 159E/95002D48, bkp: 1000, desc: insert: rel 1663/56867/10216776; tid 15681/11  
rmgr: Transaction len (rec/tot):     12/    44, tx:  891772657, lsn: 159E/950061A8, prev 159E/95004D18, bkp: 0000, desc: commit: 2017-05-04 13:15:50.320538 CST  

rmgr refers to the category of this REDO RECORD, followed by information such as length.

With the following commands, you can count how many redo records each resource has, and what is the description?

pg_xlogdump 000000050000159E00000095 000000050000159E00000099|grep -v " lock "| awk '{print $2" "$16" "$17}'|sort |uniq -c|sort -n -r  
  
 178464 Gin Insert item,  
  39060 Gin Vacuum page,  
  30980 Heap2 clean: rel  
  23855 Gin Vacuum data  
  22240 Btree insert: rel  
  18068 Heap insert: rel  
  15650 Heap2 visible: rel  
   8516 Heap delete: rel  
   4057 Transaction commit: 2017-05-04  
   3727 XLOG full-page image:  
   3331 Gin Insert new  
   3026 Heap update: rel  
   1196 Transaction abort: 2017-05-04  
   1191 Gin Update metapage,  
    765 Btree vacuum: rel  
    755 Heap hot_update: rel  
    243 Gin Delete list  
     55 Btree insert_upper: rel  
     50 Btree split_r: rel  
     32 Btree unlink_page: rel  
     32 Btree mark_page_halfdead: rel  
     29 Btree reuse_page: rel  
     28 Heap2 freeze_page: rel  
     27 Standby running xacts:  
     27 Heap insert(init): rel  
     20 Heap inplace: rel  
     11 Btree delete: index  
      9 Heap2 cleanup info:  
      5 Standby AccessExclusive locks:  
      5 Btree split_l: rel  
      4 Gin Page split,  
      2 XLOG checkpoint: redo  
      2 Gin Create posting  
      1 Heap update(init): rel  

What problems can you see in this log?

For example, we can see that Gin's REDO is very large, indicating that users use GIN index, while the index is written and updated very frequently.

In this case, the user can set the fastupdate feature of GIN index, at the same time, adjust the vacuum naptime and the vacuum threshold of the table to minimize the merging frequency of GIN (but the fastupdate will reduce the retrieval performance).

Other,

If you see a lot of VACUUM, it means that these REDO s contain a lot of garbage recycling information. Users may produce more garbage and are being recycled.

If VACUUM FREEZE is more, it means more freezing, users can modify the freezing cycle to reduce this REDO.

Reference resources

PostgreSQL uses pg_xlogdump to find the wrong transaction number

"PostgreSQL xlog dump - pg_xlog dump source code explanation"

PostgreSQL How to Find TOP SQL (e.g. IO-consumed SQL)

Keywords: PostgreSQL SQL Database less

Added by sagee on Wed, 03 Jul 2019 22:22:44 +0300