std::vector>Tp>push_ Implementation principle of back(const Tp&val)

background

  • The std::vector<Tp>container of C++ Standard Library has continuous memory, which has great advantages in some applications, such as the operation compatible with C language. We store n POD type data in vectors, such as std::vector<double>vec(20). I want to empty the data to 0. I can use memset(vec.data(), 0, vec.size() * sizeof(vec[-1]). Other containers, such as list s, can only be traversed by for loop s and cleared one by one. Vectors can grow automatically, hiding the underlying memory allocation and object construction behavior, which is friendly to program developers. Instead of continuing to allocate memory (realloc) after existing memory, vectors allocate new space - move old data - construct objects - release source space.
  • The average allocation time complexity for vector s inserting and deleting elements at the tail is O(1).
  • A vector has a very simple data structure, which requires only three pointers and can be defined as
template <typename Tp>
class vector{
public:
// ...

protected:
   Tp* M_start_;_M_start
   Tp* _M_finish;
   Tp* _M_end_of_storage;
}
  • The basic memory view of the vector s looks like this:

Vector::push_ How backs are implemented

  • Today let's look at std::vector::push_ Implementation of back (const Tp&val).
  • Before you begin, there are several important points of basic knowledge about memory allocation and object construction that need to be explained.
// 1. Unlocated version operator new / delete
// Memory Allocation Function
void* operator new(std::size_t);
// Memory Release Function
void operator delete(void*);

// 2. Locate version of operator new delete.
// Default placement versions of operator new.
inline void* operator new(std::size_t, void* __p) { return __p; }
// Default placement versions of operator delete.
inline void operator delete  (void*, void*) {}

// 3. Operators
new / delete
  • These concepts are easily confused. Version 3 finished allocating memory, pointer transition, and object construction. Version 1 is responsible for allocating memory only; Version 2 is only responsible for calling constructors to construct objects at specified locations. Amazingly, you can also construct objects at specified locations. Previous object constructions only know the object address after new.
  • Versions 1 and 2 are provided to separate object space allocation from object construction, providing great flexibility and efficiency in the container implementation of this standard library. For example, I can allocate a block of raw memory (version 1 new) and then construct an object on that memory (version 2 new). This flexibility is not available if the C++ new operator (version 3 new) is used directly.
  • We define a construction type Person for testing.
class Person {
public:
  Person() : age_(0), name_("anonymous"), is_male_(false) {}
  Person(int age, std::string name, bool is_male)
      : age_(age), name_(std::move(name)), is_male_(is_male) {}
  // copy constructor
  Person(constPerson &other) {
    this->age_ = other.age_;
    this->name_ = other.name_;
    this->is_male_ = other.is_male_;
  }
  // move constructor
  Person(Person &&person) {
    this->age_ = other.age_;
    this->name_ = std::move(other.age_);
    this->is_male_ = other.is_male_;
  }

  ~Person() = default;

private:
  int age_;
  std::string name_;
  bool is_male_;
};
  • We call the following code snippet,
void main() {
  std::vector<Person> v(2);
  // Create an 18-year-old girl named Friday.
  person person(18, "Friday", false);
  // Place behind vec.
  v.push_back(person);
}
  • Expecting a Person object push_back to the vector<Person>object v. If there is enough space behind v, push_back constructs objects directly behind V (locates new) and performs a ++ operation on the internal data structure that characterizes size, which is relatively simple and not considered here. Let's assume we're pushing_ The memory view of std:: vector<Person> v(2) before the back call is as follows, free space is full, v.size() == v.capacity() push_back needs to allocate new space - move old data - construct objects - release source space so it can complete push_back operation.

Allocate New Space

// Double capacity expansion
int new_capacity = v.size() * 2;
// Call version 1 new; And transform the allocated memory into Person*
Person* new_ptr = static_cast<Person*>(::operator new(new_capacity * sizeof(Person)));
  • At this point, the space configuration is as follows
  • This step only allocates memory space without any initialization, so the memory value of the space is unknown.

Move old data to new space

  // Call the standard library function to copy the old data to the new space.
#if 1
  Person* end_ptr = std::uninitialized_copy(v.begin(), v.end(), new_ptr);
#else
// This writing is wrong. Do you know why?
for(Person* iter = v.begin(); iter != v.end(); ++iter {
  *new_ptr++ = * iter; // Call Person & operator (const Person & other)
}
#endif
  • Although the name is std::uninitialized_copy(...), However, depending on the characteristics of the object stored in the container, such as whether it is movable or not, and whether it needs to be constructed, it is actually up to you to call the object's copy constructor or move the constructor or memmove to improve efficiency and avoid unnecessary copies.

  • For POD types, that is, type objects that do not need to be constructed, memmove is called directly to move memory.

  • For movable constructs is_ Move_ An object of constructible that calls a mobile constructor to construct

  • For other types, the copy constructor is called to copy the object. This is the least efficient but also the safest operation.

  • Here Person defines the move constructor, which means the Person object is movable, so it calls the Person:: Person (Person &&&) version. Memory comparison before and after moving is as follows:

Construct the object to be inserted

// At end_ptr position construct object, call locate new
::new(end_ptr) Person(18, "Friday", false);
this->_M_finish++; // This->size() plus 1
  • This completes a push_back operation. Yes, there is still old space to be freed, otherwise memory leaks will occur.

Release old space

  • auto b = std::move(a), std::move() guarantees that source a can be properly destructed after being moved.
  • What we need to do is to deconstruct all the objects in the original space one by one and then release the space again. Note that this is a different matter. We look at this picture for understanding and manipulation.
// 1. Destruct the original object
for(Person* iter = v.begin(); iter != v.end(); ++iter) {
  iter->~Person();
}
// or directly calls the standard library std::_Destroy(...) Function.
std::_Destroy(v.begin(), v.end());

// 2. Release old space
size_t num = v.size();
Person* base_ptr = &*v.begin();

// Call "1. Unlocated version operator delete"
::operator delete(base_ptr, num);

Resetting the value of a vector data member

  _M_start = new_ptr;
  _M_finish = end_ptr;
  _M_end_of_storage = new_ptr + new_capacity;

  // A new element was put in, so the ++ operation was performed.
  ++_M_finish;

  • So far, a push_of the vector s has been completed Back operation.

summary

  • Today we focused on vector::push_ The operation of back () requires that the reader be familiar with memory allocation, object construction, object destruction, and the composition of memory release, as well as the details of the operation.
  • Understand std:: uninitialized_ The process of copying.
  • Understand the role of std::move and move constructors in optimizing efficiency.

Keywords: C++ data structure

Added by porrascarlos80 on Sun, 30 Jan 2022 01:54:34 +0200