The full name of WAL is Write Ahead Logging (pre write log) is a mechanism used to implement atomic transactions in many databases. It is an efficient log algorithm in databases. From the database principle, it implements the redo log mode. That is, when modifying the database, it does not directly modify the database content, but writes the modified data into the log and synchronizes it to the disk, so it has no impact on other read processes influence. If the database crashes, scan the log file after restart, and then update it in the database. In order to improve efficiency, WAL log mode provides checkpoint operation to update data regularly.
The implementation of WAL in SQLite is also based on the above principle. When updating SQLite pages, the updated pages will be synchronized to disk, and checkpoint operations will be performed regularly. When reading the database, in order to read the latest page, you need to scan the log file to get the latest data page. In order to improve the scanning speed, SQLite designed a WAL index index to speed up the operation of WAL logs. The specific code implementation will be written slowly.
Wal has the following advantages:
- Read operations do not block write operations, and write operations do not block read operations. This is the "golden rule" of concurrent management;
- In most operational scenarios, Wal is quite fast compared to rollback logs.
- Disk I/O becomes more predictable and causes fewer fsync() system calls. Because all Wal writes are written to log files linearly, many I / OS become continuous and can be executed as planned.
Wal's disadvantages are:
- All processing is bound to a single host. In other words, Wal can no longer be used on a network file system such as NFS.
- To meet the needs of wal and related shared memory, two additional semi persistent files - wal and - shm. Are introduced using wal This is not attractive for those who use SQLite database as application file format. This also affects the read-only environment because the - shm file must be writable and / or the directory where the database is located must also be writable.
- For very large transactions, the performance of WAL will be reduced. Although WAL is a high-performance option, very large or long-running transactions introduce additional overhead
Working principle: the modification is not written directly to the database file, but to another file called WAL; If the transaction fails, the records in the WAL will be ignored and the modification will be revoked; If the transaction succeeds, it will be written back to the database file at a later time to commit the changes.
The behavior of synchronizing WAL files and database files is called checkpoint. It is automatically executed by SQLite. By default, when the WAL file is accumulated to 1000 pages for modification; of course, you can also manually execute checkpoint when appropriate. SQLite provides relevant interfaces. After the checkpoint is executed, the WAL is cleared.
During reading, SQLite will search in the WAL file, find the last write point, remember it, and ignore the write points after it (this ensures that reading, writing and reading can be performed in parallel); then, it determines whether the page of the data to be read is in the WAL file. If it is, it will read the data in the WAL file. If it is not, it will directly read the data in the database file.
When writing, SQLite can write it to the WAL file, but exclusive writing must be guaranteed, so parallel execution cannot be performed between writes.
In the implementation of WAL, shared memory technology is used. Therefore, all read and write processes must be on the same machine, otherwise, data consistency cannot be guaranteed.
In wal C is implemented as follows:
struct WalIndexHdr { u32 iVersion; wal Index version u32 unused; Unused area u32 iChange; Counters per transaction u8 isInit; u8 bigEndCksum; judge wal Medium checksum Is the type of big-endian u16 szPage; database page quantity u32 mxFrame; WAL Index of the last valid frame in u32 nPage; Page size in database u32 aFrameCksum[2]; Of the last frame of the log checksum u32 aSalt[2]; from WAL Copy two headers salt value u32 aCksum[2]; The pre write log file is represented by an instance of the following objects. struct Wal { sqlite3_vfs pVfs; //VFS used to create pDbFd sqlite3_file pDbFd; //Database file sqlite3_file pWalFd; //WAL file u32 iCallback; //The value passed to the callback log i64 mxWalSize; //After reset, adjust WAL to this size. int nWiData; //Size of apWiData array int szFirstBlock; //The size of the first block written to the WAL file uolatile u32 ** apWiData ; //Points to the contents of the WAL index in memory u32 szPage ; //Database page size i16 readLock ; //Read which lock is holding- 1 means none u8 syncFlags ; //Flag uses synchronization header u8 exclusiveMode ; //Nonzero, if the connection is exclusive u8 writeLock ; //If you're really writing a transaction u8 ckptLock ; //If you do lock with a checkpoint u8 readonly; //WAL_RDWR , WAL_RDONLY, or WAL_SHM_RDONLY u8 truncateOnCommit ; //Really truncate files submitted on WAL u8 syncHeader ; //FSYNC header if true u8 padToSectorBoundary //Pad record to next sector walIndexHdr HDR ; //A wal index header for the current transaction const char zWalName ; //The file name of the WAL file u32 nCkpt ; //wal checkpoint sequence counter\ # IFDEF SQLITE_DEBUG u8 LOCKERROR ; //When a lock error occurs # ENDIF } ;