LevelDB source code parsing WriteBatch

You can also use my independent blog—— www.huliujia.com Get this article

brief introduction

The official note of LevelDB describes WriteBatch as follows:

WriteBatch holds a collection of updates to apply atomically to a DB

How to ensure atomicity may need to be clarified after reading the use of WriteBatch. Here, only one WriteBatch object can contain multiple update records (insert / delete) and support batch writing.

Many operations of WriteBatch are implemented through auxiliary classes, which directly operate the member variables of WriteBatch. This article will first introduce the member variables of WriteBatch and these auxiliary classes, and finally introduce the member functions of WriteBatch

Member variable of WriteBatch

WriteBatch has only one member and is a string type. All update records will be encoded and written to this rep_ Yes.

std::string rep_;

rep_ The coding format is as follows:

Number of bytes8 bytes4 bytesLengthenLengthenLengthenLengthen
contentsequencecountrecord 1reocrd 2record 3record ...

sequence is a 64bit serial number, and each WriteBatch has a unique serial number.

count is rep_ Number of record s contained.

count is followed by the record list. The encoding format of each record is the same as follows:

Number of bytes1 byteLengthenkey sizeLengthenvalue size
contentValueTypekey sizekeyvalue sizevalue

key size and value size use Variable length encoding format of LevelDB

WriteBatchInternal helper class

WriteBatchInternal has no member variables, only member functions.

SetCount and Count

void WriteBatchInternal::SetCount(WriteBatch* b, int n)
{
  EncodeFixed32(&b->rep_[8], n);
}

int WriteBatchInternal::Count(const WriteBatch* b)
{
  return DecodeFixed32(b->rep_.data() + 8);
}

SetCount calls EncodeFixed32 to write the input parameter n to the rep of WriteBatch in small end encoding_ count field for.

Count calls DecodeFixed32 to replace rep_ The count field of is returned after decoding.

SetSequence and Sequence

void WriteBatchInternal::SetSequence(WriteBatch* b, SequenceNumber seq)
{
  EncodeFixed64(&b->rep_[0], seq);
}

SequenceNumber WriteBatchInternal::Sequence(const WriteBatch* b)
{
  return SequenceNumber(DecodeFixed64(b->rep_.data()));
}

SetSequence calls EncodeFixed64 to write the serial number seq to rep in small end encoding_ The sequence field of the. The SequenceNumber here is actually std::uint64_t type.

The sequence function calls the DecodeFixed64 rep_ The sequence field of is returned after decoding.

SetContents and ByteSize

void WriteBatchInternal::SetContents(WriteBatch* b, const Slice& contents)
{
  assert(contents.size() >= kHeader);
  b->rep_.assign(contents.data(), contents.size());
}

static Slice WriteBatchInternal::Contents(const WriteBatch* batch){ return Slice(batch->rep_); }

static size_t WriteBatchInternal::ByteSize(const WriteBatch* batch){ return batch->rep_.size(); }

SetContent overwrites rep_ with conents. The value of kHeader is 12, which is the total length of sequence field and count field, which are required.

Contents rep_ Wrapped as a slice object. Slice mentioned in the previous article that it can be treated as a string.

ByteSize returns rep_ size of, i.e. rep_ Number of bytes.

Append

void WriteBatchInternal::Append(WriteBatch* dst, const WriteBatch* src)
{
  SetCount(dst, Count(dst) + Count(src));
  assert(src->rep_.size() >= kHeader);
  dst->rep_.append(src->rep_.data() + kHeader, src->rep_.size() - kHeader);
}

Append is an important function with two parameters, dst and src. Append appends the src data into the dst.

After merging, the sequence field of dst remains unchanged, and the count field value of dst is equal to the sum of dst and src count. The record list of src is appended and merged after the reocrd list of dst.

InsertInto

Status WriteBatchInternal::InsertInto(const WriteBatch* b, MemTable* memtable)
{
  MemTableInserter inserter;
  inserter.sequence_ = WriteBatchInternal::Sequence(b);
  inserter.mem_ = memtable;
  return b->Iterate(&inserter);
}

Construct MemTableInserter object and fill in sequence and memtable object pointer. Then call the Iterate of WriteBatch to insert the update record of the current batch into memtable.

MemTableInserter helper class

The MemTableInserter auxiliary class is used to insert records into MemTable. MemTableInserter inherits Handler, which is an abstract class with only Put and Delete pure virtual functions.

The Handler uses the template method design pattern, which can easily make different versions of Handler, such as MemTableInserterPro.

namespace
{
  class MemTableInserter : public WriteBatch::Handler
  {
  public:
    SequenceNumber sequence_;
    MemTable* mem_;

    void Put(const Slice& key, const Slice& value) override
    {
      mem_->Add(sequence_, kTypeValue, key, value);
      sequence_++;
    }

    void Delete(const Slice& key) override
    {
      mem_->Add(sequence_, kTypeDeletion, key, Slice());
      sequence_++;
    }
  };
}

When I saw the code, I was very confused. Why write a namespace without a name? I checked it specifically. This is called anonymous namespace, which allows functions, classes and objects inside the namespace to be accessed only in the current file, which is similar to the static scope of C language.

Member variable

MemTableInserter contains two member variables, sequence_ And mem_. sequence_ Is the serial number to be filled in when inserting Memtable. mem_ Is a pointer to the Memtable object.

Put

Call the Add function of Memtable to Add a record. The record type is kTypeValue, indicating that this is an insert record.

Delete

Call the Add function of Memtable to Add a record. The record type is kTypeDeletion, indicating that this is a deleted record.

WriteBatch member function

Put

void WriteBatch::Put(const Slice& key, const Slice& value)
{
  WriteBatchInternal::SetCount(this, WriteBatchInternal::Count(this) + 1);
  rep_.push_back(static_cast<char>(kTypeValue));
  PutLengthPrefixedSlice(&rep_, key);
  PutLengthPrefixedSlice(&rep_, value);
}

The first is to put rep_ Add 1 to the count field of. Then encode ValueType, key and value into a record and write it to rep_ Behind the.

Delete

void WriteBatch::Delete(const Slice& key)
{
  WriteBatchInternal::SetCount(this, WriteBatchInternal::Count(this) + 1);
  rep_.push_back(static_cast<char>(kTypeDeletion));
  PutLengthPrefixedSlice(&rep_, key);
}

Deleting elements in LevelDB is achieved by adding a delete record. Therefore, there are two differences between deleting or adding a record and Put. One is that the ValueType of record is ktypedeletion, and the other is that the deleted record has only key but no value.

Append

void WriteBatch::Append(const WriteBatch& source)
{
  WriteBatchInternal::Append(this, &source);
}

Call the Append of WriteBatchInternal to merge the source into the current WriteBatch object.

Iterate

Status WriteBatch::Iterate(Handler* handler) const
{
  Slice input(rep_);
  if( input.size() < kHeader )
  {
    return Status::Corruption("malformed WriteBatch (too small)");
  }

  input.remove_prefix(kHeader);
  Slice key, value;
  int found = 0;
  while( !input.empty())
  {
    found++;
    char tag = input[0];
    input.remove_prefix(1);
    switch( tag )
    {
      case kTypeValue:
        if( GetLengthPrefixedSlice(&input, &key) && GetLengthPrefixedSlice(&input, &value))
        {
          handler->Put(key, value);
        }else
        {
          return Status::Corruption("bad WriteBatch Put");
        }
        break;
      case kTypeDeletion:
        if( GetLengthPrefixedSlice(&input, &key))
        {
          handler->Delete(key);
        }else
        {
          return Status::Corruption("bad WriteBatch Delete");
        }
        break;
      default:
        return Status::Corruption("unknown WriteBatch tag");
    }
  }
  if( found != WriteBatchInternal::Count(this))
  {
    return Status::Corruption("WriteBatch has wrong count");
  }else
  {
    return Status::OK();
  }
}

The above code is quite a lot, but the logic is very simple, that is, traversal. The rep of WriteBatch_ Decoding: for each record, call the Put of the handler to add an insert record or call Delete to add or Delete a record according to the ValueType field.

Some checks will be performed during and after the traversal. If there is a problem with the data, it will return to a "Corruption" state. If the data is OK, an "OK" status is returned.

ApproximateSize and Clear

Approximate size returns rep_ Clear, clear rep_ And set the sequence field and count field to 0.

Why separate a WriteBatchInternal

From the perspective of code function, the author puts the rep of WriteBatch_ The decoding and encoding of are stored in WriteBatchInternal, and WriteBatch only has an encapsulated interface for external use. Make the code structure clearer.

With one exception, the Iterate member function of WriteBatch has a pair of rep_ For decoding. According to the above idea, Iterate should be implemented by WriteBatchInternal, and WriteBatch can be called again. It may be that the author still has some deviation in his understanding here. Welcome to post your ideas in the comment area.

Source version

https://github.com/google/leveldb/releases/tag/1.22

Keywords: source code analysis Leveldb

Added by NixNod on Sat, 02 Oct 2021 03:58:58 +0300