reference resources:
- https://github.com/facebook/r...
- https://github.com/facebook/r...
- https://blog.csdn.net/Z_Stand...
- https://www.jianshu.com/p/f57...
- https://book.tidb.io/session3...
- https://www.iggiewang.cn/2021...
Introduction
After being highly encapsulated, the iterator of RocksDB can be used like the iterator of the iterator constructed by C++ stl library for each container. It can locate a key and start scanning from this key. It can also be used for reverse scanning.
If when creating an iterator, readoptions When a snapshot is given, the iterator will return the data of the snapshot. If nullptr is passed in, the hidden view will be created, and the hidden view cannot be converted to explicit.
User's use of Iterator
- NewIterator creates an iterator and needs to pass in ReadOptions
- Seek find a key
- SeekToFirst iterator moves to the first key position of db, which is generally used to traverse all keys of the whole db in sequence
- SeekToLast iterator moves to the last key position of db, which is generally used to traverse all keys of the whole db in reverse
- SeekForPrev moves to the previous position of the current key. It is generally used to traverse the keys between limit, start]
- The Next iterator moves to the Next key
- Move Prev iterator to previous key
Iterator::status() can return the error of the current iterator. Error includes IO error, checksum mismatch, internal error, etc
If there are no errors, the status is Status::OK(). If Iterator::Valid() is true, the status must be OK.
If Iterator::Valid() is false, it may be because the iterator has reached the end of the data. In this case, the status is OK; There may also be an error. In this case, the status is not OK.
5.13. In X or earlier versions, the two API s will be different, so you need to consult the documentation again.
Each Iterator
DBIter
Implementation: db/db_iter.cc
Interface: Iterator
DBIter is the encapsulation of InternalIterator. Its main work is to parse the Internal key into user key and show it
Example
In the underlying InternalIterator
InternalKey(user_key="Key1", seqno=10, Type=Put) | Value = "KEY1_VAL2" InternalKey(user_key="Key1", seqno=9, Type=Put) | Value = "KEY1_VAL1" InternalKey(user_key="Key2", seqno=16, Type=Put) | Value = "KEY2_VAL2" InternalKey(user_key="Key2", seqno=15, Type=Delete) | Value = "KEY2_VAL1" InternalKey(user_key="Key3", seqno=7, Type=Delete) | Value = "KEY3_VAL1" InternalKey(user_key="Key4", seqno=5, Type=Put) | Value = "KEY4_VAL1"
DBIter parses the Internal Key into the User Key and exposes the interface to the user
Key="Key1" | Value = "KEY1_VAL2" Key="Key2" | Value = "KEY2_VAL2" Key="Key4" | Value = "KEY4_VAL1"
MergingIterator
Implementation: table/merging_iterator.cc
Interface: InternalIterator
Mergeiterator consists of many children iterators
autovector<IteratorWrapper, kNumIterReserve> children_;
The merging iterator will put these children iterators into the heap and display them to the upper layer in order.
Example
The children iterator is:
= Child Iterator 1 = InternalKey(user_key="Key1", seqno=10, Type=Put) | Value = "KEY1_VAL2" = Child Iterator 2 = InternalKey(user_key="Key1", seqno=9, Type=Put) | Value = "KEY1_VAL1" InternalKey(user_key="Key2", seqno=15, Type=Delete) | Value = "KEY2_VAL1" InternalKey(user_key="Key4", seqno=5, Type=Put) | Value = "KEY4_VAL1" = Child Iterator 3 = InternalKey(user_key="Key2", seqno=16, Type=Put) | Value = "KEY2_VAL2" InternalKey(user_key="Key3", seqno=7, Type=Delete) | Value = "KEY3_VAL1"
The mergeiterator puts all children iterator s into the heap to make them orderly
InternalKey(user_key="Key1", seqno=10, Type=Put) | Value = "KEY1_VAL2" InternalKey(user_key="Key1", seqno=9, Type=Put) | Value = "KEY1_VAL1" InternalKey(user_key="Key2", seqno=16, Type=Put) | Value = "KEY2_VAL2" InternalKey(user_key="Key2", seqno=15, Type=Delete) | Value = "KEY2_VAL1" InternalKey(user_key="Key3", seqno=7, Type=Delete) | Value = "KEY3_VAL1" InternalKey(user_key="Key4", seqno=5, Type=Put) | Value = "KEY4_VAL1"
MemtableIterator
Implementation: db/memtable.cc
Interface: InternalIterator
As an iterator for memtables, each memtable implements its own iterator to expose orderly and iterative kv pairs to the outside world
BlockIter
Implementation: table/block_based/block.h
Interface: InternalIterator
This iterator is used to read blocks from SST files, whether they are index block s, data block s or meta block s (DataBlockIter, IndexBlockIter and MetaBlockIter inherit BlockIter). Since the SST File block is ordered and immutable, we load the block into memory and create a BlockIter for the sorted data.
TwoLevelIterator
Implementation: table/two_level_iterator.cc
Interface: InternalIterator
A TwoLevelIterator consists of two iterators
- first_level_iter_
- second_level_iter_
first_level_iter_ Used to find which second to use_ level_ iter_ , Then second_level_iter_ Point to the data you actually need to read
Example
RocksDB uses TwoLevelIterator to read SST files, first_level_iter_ BlockIter, second for Index block_ level_ iter_ Is BlockIter for Data block
[Data block, offset: 0x0000] KEY1 | VALUE1 KEY2 | VALUE2 KEY3 | VALUE3 [Data Block, offset: 0x0100] KEY4 | VALUE4 KEY7 | VALUE7 [Data Block, offset: 0x0250] KEY8 | VALUE8 KEY9 | VALUE9 [Data Block, offset: 0x0350] KEY11 | VALUE11 KEY15 | VALUE15 [Index Block, offset: 0x0500] KEY3 | 0x0000 KEY7 | 0x0100 KEY9 | 0x0250 KEY15 | 0x0500
first_level_iter_ => BlockIter over Index block
second_ level_ iter_ => Blockiter over data block, will be first_level_iter_ Decide which to point to
When using TwoLevelIterator to seek KEY8, first is used_ level_ iter_ Seek to the corresponding data block (0x0250), and then use second_level_iter_ Come to seek the specific key.
When reading the Data Block, if the Block is in the Block Cache, the object address is returned directly. Otherwise, disk IO occurs. Read the Block of SST, construct the Block object and cache its address in the Block Cache.