SQLite learning path ⑫ WAL introduction( 2021SC@SDUSC )

The full name of WAL is Write Ahead Logging (pre write log) is a mechanism used to implement atomic transactions in many databases. It is an efficient log algorithm in databases. From the database principle, it implements the redo log mode. That is, when modifying the database, it does not directly modify the database content, but writes the modified data into the log and synchronizes it to the disk, so it has no impact on other read processes influence. If the database crashes, scan the log file after restart, and then update it in the database. In order to improve efficiency, WAL log mode provides checkpoint operation to update data regularly.

The implementation of WAL in SQLite is also based on the above principle. When updating SQLite pages, the updated pages will be synchronized to disk, and checkpoint operations will be performed regularly. When reading the database, in order to read the latest page, you need to scan the log file to get the latest data page. In order to improve the scanning speed, SQLite designed a WAL index index to speed up the operation of WAL logs. The specific code implementation will be written slowly.

Wal has the following advantages:

  • Read operations do not block write operations, and write operations do not block read operations. This is the "golden rule" of concurrent management;
  • In most operational scenarios, Wal is quite fast compared to rollback logs.
  • Disk I/O becomes more predictable and causes fewer fsync() system calls. Because all Wal writes are written to log files linearly, many I / OS become continuous and can be executed as planned.

Wal's disadvantages are:

  • All processing is bound to a single host. In other words, Wal can no longer be used on a network file system such as NFS.
  • To meet the needs of wal and related shared memory, two additional semi persistent files - wal and - shm. Are introduced using wal This is not attractive for those who use SQLite database as application file format. This also affects the read-only environment because the - shm file must be writable and / or the directory where the database is located must also be writable.
  • For very large transactions, the performance of WAL will be reduced. Although WAL is a high-performance option, very large or long-running transactions introduce additional overhead

Working principle: the modification is not written directly to the database file, but to another file called WAL; If the transaction fails, the records in the WAL will be ignored and the modification will be revoked; If the transaction succeeds, it will be written back to the database file at a later time to commit the changes.
The behavior of synchronizing WAL files and database files is called checkpoint. It is automatically executed by SQLite. By default, when the WAL file is accumulated to 1000 pages for modification; of course, you can also manually execute checkpoint when appropriate. SQLite provides relevant interfaces. After the checkpoint is executed, the WAL is cleared.
During reading, SQLite will search in the WAL file, find the last write point, remember it, and ignore the write points after it (this ensures that reading, writing and reading can be performed in parallel); then, it determines whether the page of the data to be read is in the WAL file. If it is, it will read the data in the WAL file. If it is not, it will directly read the data in the database file.
When writing, SQLite can write it to the WAL file, but exclusive writing must be guaranteed, so parallel execution cannot be performed between writes.
In the implementation of WAL, shared memory technology is used. Therefore, all read and write processes must be on the same machine, otherwise, data consistency cannot be guaranteed.

In wal C is implemented as follows:

struct WalIndexHdr {
   u32 iVersion;
   wal Index version
   u32 unused; Unused area
    u32 iChange; Counters per transaction
   u8 isInit;
   u8 bigEndCksum; judge wal Medium checksum Is the type of big-endian
   u16 szPage; database page quantity
   u32 mxFrame; WAL Index of the last valid frame in
   u32 nPage; Page size in database
   u32 aFrameCksum[2]; Of the last frame of the log checksum
   u32 aSalt[2]; from WAL Copy two headers salt value
   u32 aCksum[2];
       The pre write log file is represented by an instance of the following objects.    struct Wal {
    sqlite3_vfs pVfs; //VFS used to create pDbFd
    sqlite3_file pDbFd; //Database file
    sqlite3_file pWalFd; //WAL file
    u32 iCallback; //The value passed to the callback log
    i64 mxWalSize; //After reset, adjust WAL to this size.
    int nWiData; //Size of apWiData array
    int szFirstBlock; //The size of the first block written to the WAL file
    uolatile u32 ** apWiData ; //Points to the contents of the WAL index in memory
    u32 szPage ; //Database page size
    i16 readLock ; //Read which lock is holding- 1 means none
    u8 syncFlags ; //Flag uses synchronization header
    u8 exclusiveMode ; //Nonzero, if the connection is exclusive
    u8 writeLock ; //If you're really writing a transaction
    u8 ckptLock ; //If you do lock with a checkpoint
    u8 readonly; //WAL_RDWR , WAL_RDONLY, or WAL_SHM_RDONLY
    u8 truncateOnCommit ; //Really truncate files submitted on WAL
    u8 syncHeader ; //FSYNC header if true
    u8 padToSectorBoundary //Pad record to next sector
    walIndexHdr HDR ; //A wal index header for the current transaction
    const char zWalName ; //The file name of the WAL file
    u32 nCkpt ; //wal checkpoint sequence counter\
# IFDEF SQLITE_DEBUG
    u8 LOCKERROR ; //When a lock error occurs
   # ENDIF
   } ;

Keywords: Database SQLite

Added by joshuamd3 on Sun, 19 Dec 2021 23:23:08 +0200