C + + container notes

C + + container

General characteristics

A basic feature of all containers is that it holds elements in "value" semantics.

That is, the container stores copies and copies of elements, not references.

Cost: high overhead and reduced performance.

resolvent:

1. Try to implement transfer construction and transfer assignment function for elements

When adding a container, use std::move() to "transfer" to reduce the cost of element replication:

Point p;                        // An object with high copy cost

v.push_back(p);                // Storage object, copy structure, high cost
v.push_back(std::move(p));    // After defining the transfer structure, you can transfer storage and reduce costs

**2. Use C++11 to add a new empty operation function for the container**

It can construct elements "in place", eliminating the cost of copying and transferring after construction. It is not only efficient, but also convenient to use:

v.emplace_back(...);            // Construct elements directly in the container without copying or transferring

3. The pointer of the element is stored in the container, and the element is saved indirectly

Disadvantages: you cannot take advantage of the container's ability to automatically destroy elements. You have to manually manage the life cycle of elements, which is prone to errors and may cause memory leakage.

Specific characteristics

One classification is divided into three categories: sequential containers, ordered containers and unordered containers according to the access mode of elements

Sequential container

Sequential containers are linear tables in data structures. There are five types: array, vector, deque, list and forward_list.

According to the storage structure, these five containers are divided into (array and linked list) structures

1. Continuously stored arrays: array, vector and deque;

2. Linked list of pointer structure: list and forward_list.

Array and vector directly correspond to the built-in array of C. the memory layout is fully compatible with C, so they are the containers with the lowest overhead and the fastest speed.

Difference: dynamic growth. Array is a static array, and its size is fixed during initialization; vector is a dynamic array. Although the size is set during initialization, it can grow as needed to accommodate any number of elements.

array<int, 2> arr;                // The initial array is 2 in length
assert(arr.size() == 2);        // The length of a static array is always 2

vector<int> v(2);              // The initial vector is 2 in length
for(int i = 0; i < 10; i++) {
    v.emplace_back(i);          // Append multiple elements
}
assert(v.size() == 12);          // The length dynamically increases to 12

The difference between deque and vector:

Common points: they are dynamically growing arrays, stored continuously, inserted and deleted in the middle, and the efficiency is low.

Difference: deque can efficiently insert and delete elements at both ends. vector can only be pushed_ Back appends an element at the end.

deque<int> d;                  // Initialize a deque with a length of 0
d.emplace_back(9);              // Add an element at the end
d.emplace_front(1);              // Add an element to the front end
assert(d.size() == 2);          // The length dynamically increases to 2

The disadvantage of the linked list is that the search efficiency is low and can only be accessed in the order of pointers.

list and forward_list differences:

List is a two-way linked list, which can be traversed forward or backward, and forward_list, as the name suggests, is a one-way linked list, which can only be traversed forward, and the search efficiency is lower.

Comparison between linked list structure and array structure:

The storage cost of the linked list is slightly higher, because one or two pointers must be attached to each element to point to the front and back nodes of the linked list.

The disadvantage of the linked list is that the search efficiency is low and can only be accessed in the order of pointers.

The efficiency of inserting and deleting in the middle of the array is low.

vector/deque and list / forward_ Lists can grow dynamically to accommodate more elements.

The capacity of vector is expanded twice every time, and the effect is better when inserting a large amount of data in a short time. The cost is that copying or moving elements during each capacity expansion costs a lot. Try to estimate the space and use reserve to allocate enough space in advance.

The capacity of deque\list is expanded according to the specified number of bytes.

Preferred containers array and vector

The fastest and lowest cost, the form of array also makes them the easiest to use, and the collocation algorithm can also realize fast sorting and searching.

Deque, list and forward_list is suitable for situations sensitive to insert and delete performance. If you still care about space overhead, you can only choose deque which is not a linked list

[the external chain image transfer fails. The source station may have an anti-theft chain mechanism. It is recommended to save the image and upload it directly (img-xfb3wcmk-1631353981865) (C: \ users \ 75666 \ appdata \ roaming \ typora user images \ 1628770935250. PNG)]

Ordered container

Features: the sequential container is inserted in order, and the access elements are inserted in the original order.

An ordered container is different. Its elements are automatically sorted according to some rules after being inserted into the container, so it is "ordered".

Structure: tree structure, usually red black tree - binary tree with the best search performance.

There are four kinds of ordered containers in the standard library: set/multiset and map/multimap. Set is a set, and map is an associative array (also called "dictionary" in other languages).

A container with multi prefix means that it can hold duplicate key s, or it can be considered that there are only two ordered containers.

How to correctly understand order?

That is, how does the container determine the "order" of two elements

Another fundamental difference between ordered containers and sequential containers is that the key comparison function must be specified when defining the container

However, this function is usually the default less function, which indicates the less than relationship. It doesn't need to be written specifically:

template<
    class T                          // Template parameters have only one element type
> class vector;                      // vector

template<
    class Key,                      // The template parameter is the key type, that is, the element type
    class Compare = std::less<Key>  // Comparison function
> class set;                        // aggregate

template<
    class Key,                      // The first template parameter is of type key
    class T,                        // The second template parameter is the element type
    class Compare = std::less<Key>  // Comparison function
> class map;                        // Associative array

In addition to the basic types such as int and string, there is no default comparison function for user-defined types. It is a little troublesome to be the key of the container.

There are two ways to solve this problem: one is to overload "<", and the other is to customize template parameters.

For example, we have a Point class, which has no concept of size, but as long as it is overloaded with the "<" operator, it can be put into an ordered container:

bool operator<(const Point& a, const Point& b)
{
    return a.x < b.x;            // Custom comparison operation
}

set<Point> s;                    // Now you can put the ordered container correctly
s.emplace(7);
s.emplace(3);

Another way is to write a special function object or lambda expression, and then specify it in the template parameters of the container. This method is more flexible and can realize arbitrary sorting criteria:

set<int> s = {7, 3, 9};           // Define the collection and initialize 3 elements

for(auto& x : s) {                // Range loop output element
    cout << x << ",";              // Sort from small to large, 3,7,9
}   

auto comp = [](auto a, auto b)  // Define a lambda to compare sizes
{   
    return a > b;                // Define greater than relationship
};  

set<int, decltype(comp)> gs(comp)  // Use decltype to get the type of lambda

std::copy(begin(s), end(s),          // Copy algorithm
          inserter(gs, gs.end()));  // Using insert iterators

for(auto& x : gs) {                // Range loop output element
    cout << x << ",";                // Sort from large to small, 9,7,3
}  

Rules for using ordered containers:

Set is used for set relations and map is used for associative arrays.

Note: because ordered containers are automatically sorted during insertion, there is an implicit insertion sorting cost. When the amount of data is large, the cost of internal location search and tree rotation may be relatively high.

If you need to insert sorting in real time, you can select set/map. If it is non real-time, it is better to use vector, and sort all the data at one time after insertion. The effect will be better.

unordered container

There are also four unordered containers. There are also set and map in the name, but they are prefixed with unordered, which are unordered respectively_ set/unordered_ multiset,unordered_map/unordered_multimap.

The usage is almost the same as that of an ordered container.

Difference: different internal data structures.

Internal data structure: it is not a red black tree, but a hash table (also known as hash table).

example:

using map_type =                    // Type alias
    unordered_map<int, string>;      // Use unordered associative arrays

map_type dict;                      // Define an unordered associative array

dict[1] = "one";                      // Add three elements
dict.emplace(2, "two");
dict[10] = "ten";

for(auto& x : dict) {                // Traversal output
    cout << x.first << "=>"           // Uncertain sequence
         << x.second << ",";          // Neither insertion order nor size order
} 

Unordered containers are more demanding on key s than ordered containers:

The first is because the hash table requires that only the hash value can be put into the hash table;

The second is that the hash values may conflict, so when the hash values are the same, the real key value should be compared.

give an example:

template<
    class Key,                          // The first template parameter is of type key
    class T,                            // The second template parameter is the element type
    class Hash = std::hash<Key>,        // Function object that evaluates the hash value
    class KeyEqual = std::equal_to<Key> // Equality comparison function
> class unordered_map; 

To put the key into the container, you must implement the following two functions:

The "= =" function is relatively simple and can be implemented by overloading the operator in a similar way to the "<" function:

bool operator==(const Point& a, const Point& b)
{
    return a.x == b.x;              // Custom equality comparison operation
}

Hash functions are a little more troublesome. You can use function objects or lambda expressions to implement them. It is better to call the standard std::hash function object internally instead of calculating directly, otherwise it is easy to cause hash conflicts:

auto hasher = [](const auto& p)    // Define a lambda expression
{
    return std::hash<int>()(p.x);  // Call the standard hash function object to calculate
};

When putting a custom type into an unordered container, the container must support equality functions and hash functions.

unordered_set<Point, decltype(hasher)> s(10, hasher);

s.emplace(7);
s.emplace(3);

Selection of ordered and unordered containers:

If you only want simple sets and dictionaries without sorting requirements, you should use unordered containers. Without comparing the cost of sorting, it will be very fast.

)/ / define a lambda expression
{
return std::hash()(p.x); / / call the standard hash function object to calculate
};

When putting a custom type into an unordered container, the container must support**Equality function and hash function.**

```c++
unordered_set<Point, decltype(hasher)> s(10, hasher);

s.emplace(7);
s.emplace(3);

Selection of ordered and unordered containers:

If you only want simple sets and dictionaries without sorting requirements, you should use unordered containers. Without comparing the cost of sorting, it will be very fast.

Practical tips: * * use more type aliases and don't write dead container definitions. * * because most interfaces are the same, you can change different containers as long as you change the alias definitions.

Keywords: C++ data structure linked list

Added by lihman on Sat, 11 Sep 2021 22:54:58 +0300