Learning notes of C++ Advanced Programming 7

Learn more about template parameters

In fact, there are three kinds of template parameters: type parameters, non-type parameters, and template template parameters (the repetition is not a typo; that really is the name). Chapter 12 showed examples of type parameters and non-type parameters, but not of template template parameters. This chapter also covers some thorny issues with type parameters and non-type parameters that Chapter 12 did not address. The three kinds of template parameters are discussed in depth below.

Learn more about template type parameters

Type parameters are the essence of templates, and you can declare any number of them. For example, you can add a second type parameter to the Grid template from Chapter 12 to indicate which templated container class the Grid is built on. The standard library defines several templated container classes, including vector and deque. The original Grid class uses a vector of vectors to store the elements of the grid. A user of Grid might instead want a vector of deques. An additional template type parameter lets the user specify whether the underlying container is a vector or a deque. Here is the class definition with the additional template parameter:

template <typename T, typename Container>
class Grid
{
public:
    explicit Grid(size_t width = kDefaultWidth,
                  size_t height = kDefaultHeight);
    virtual ~Grid() = default;

    // Explicitly default a copy constructor and assignment operator
    Grid(const Grid& src) = default;
    Grid<T, Container>& operator=(const Grid& rhs) = default;

    // Explicitly default a move constructor and assignment operator
    Grid(Grid&& src) = default;
    Grid<T, Container>& operator=(Grid&& rhs) = default;

    typename Container::value_type& at(size_t x, size_t y);
    const typename Container::value_type& at(size_t x, size_t y) const;

    size_t getHeight() const { return mHeight; }
    size_t getWidth() const { return mWidth; }

    static const size_t kDefaultWidth = 10;
    static const size_t kDefaultHeight = 10;

private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::vector<Container> mCells;
    size_t mWidth = 0, mHeight = 0;
};

The template now has two parameters: T and Container. Any place that refers to the full type must therefore use Grid<T, Container> to specify both template parameters. The only other change is that mCells is now a vector of Containers rather than a vector of vectors. Here is the definition of the constructor:

template <typename T, typename Container>
Grid<T, Container>::Grid(size_t width, size_t height)
    : mWidth(width), mHeight(height)
{
    mCells.resize(mWidth);
    for (auto& column : mCells)
    {
        column.resize(mHeight);
    }
}

This constructor assumes that the Container type has a resize() method. If you try to instantiate this template with a type that has no resize() method, the compiler generates an error. The return type of the at() methods is the element type stored in containers of the given type, which you can access as typename Container::value_type. Here are the implementations of the remaining methods:

template<typename T,typename Container>
void Grid<T,Container>::verifyCoordinate(size_t x,size_t y) const
{
	if(x >= mWidth || y >= mHeight)
	{
		throw std::out_of_range("");
	}
}
template<typename T,typename Container>
const typename Container::value_type& 
	Grid<T,Container>::at(size_t x,size_t y) const
{
	verifyCoordinate(x,y);
	return mCells[x][y];
}

template<typename T,typename Container>
typename Container::value_type&
Grid<T,Container>::at(size_t x,size_t y)
{
    return const_cast<typename Container::value_type&>(std::as_const(*this).at(x,y));
}

Grid objects can now be instantiated and used as follows:

Grid<int, vector<optional<int>>> myIntVectorGrid;
Grid<int, queue<optional<int>>> myIntDequeGrid; // WILL NOT COMPILE: std::queue has no resize()

myIntVectorGrid.at(3, 4) = 5;
cout << myIntVectorGrid.at(3, 4).value_or(0) << endl;

myIntDequeGrid.at(1, 2) = 3;
cout << myIntDequeGrid.at(1, 2).value_or(0) << endl;

Grid<int, vector<optional<int>>> grid2(myIntVectorGrid);
grid2 = myIntVectorGrid;

output

xz@xiaqiu:~/study/test/test$ g++ -o test test.cpp -std=c++17
test.cpp: In instantiation of 'Grid<T, Container>::Grid(size_t, size_t) [with T = int; Container = std::queue<std::optional<int> >; size_t = long unsigned int]':
test.cpp:77:36:   required from here
test.cpp:47:20: error: 'class std::queue<std::optional<int> >' has no member named 'resize'; did you mean 'size'?
   47 |             column.resize(mHeight);
      |             ~~~~~~~^~~~~~
      |             size
test.cpp: In instantiation of 'const typename Container::value_type& Grid<T, Container>::at(size_t, size_t) const [with T = int; Container = std::queue<std::optional<int> >; typename Container::value_type = std::optional<int>; size_t = long unsigned int]':
test.cpp:72:12:   required from 'typename Container::value_type& Grid<T, Container>::at(size_t, size_t) [with T = int; Container = std::queue<std::optional<int> >; typename Container::value_type = std::optional<int>; size_t = long unsigned int]'
test.cpp:82:26:   required from here
test.cpp:65:21: error: no match for 'operator[]' (operand types are 'const value_type' {aka 'const std::queue<std::optional<int> >'} and 'size_t' {aka 'long unsigned int'})
   65 |     return mCells[x][y];
      |            ~~~~~~~~~^
xz@xiaqiu:~/study/test/test$ 

code

int main()
{
    Grid<int, vector<optional<int>>> myIntVectorGrid;
    //Grid<int, queue<optional<int>>> myIntDequeGrid;

    myIntVectorGrid.at(3, 4) = 5;
    cout << myIntVectorGrid.at(3, 4).value_or(0) << endl;

    // myIntDequeGrid.at(1, 2) = 3;
    // cout << myIntVectorGrid.at(1, 2).value_or(0) << endl;

    Grid<int, vector<optional<int>>> grid2(myIntVectorGrid);
    grid2 = myIntVectorGrid;
    
    //Grid<int,int> test; // WILL NOT COMPILE
    return 0;
}

Using Container as the parameter name does not mean that the type used must actually be a container. Try instantiating the Grid class with int:

Grid<int,int> test; // WILL NOT COMPILE

This line of code does not compile, but the compiler may not give the error you expect. It does not say that the second type argument is an int rather than a container; instead it gives a rather cryptic error. For example, Microsoft Visual C++ reports "'Container': must be a class or namespace when followed by '::'". That is because the compiler attempts to generate a Grid class with int as the Container. Everything works until it tries to process this line of the class template definition:

xz@xiaqiu:~/study/test/test$ g++ -o test test.cpp -std=c++17
test.cpp: In instantiation of 'class Grid<int, int>':
test.cpp:90:19:   required from here
test.cpp:70:1: error: 'int' is not a class, struct, or union type
   70 | Grid<T, Container>::at(size_t x, size_t y)
      | ^~~~~~~~~~~~~~~~~~
test.cpp:62:1: error: 'int' is not a class, struct, or union type
   62 | Grid<T, Container>::at(size_t x, size_t y) const
      | ^~~~~~~~~~~~~~~~~~
test.cpp: In instantiation of 'Grid<T, Container>::Grid(size_t, size_t) [with T = int; Container = int; size_t = long unsigned int]':
test.cpp:90:19:   required from here
test.cpp:48:16: error: request for member 'resize' in 'column', which is of non-class type 'int'
   48 |         column.resize(mHeight);
      |         ~~~~~~~^~~~~~
xz@xiaqiu:~/study/test/test$ 
typename Container::value_type& at(size_t x,size_t y);

At this line, the compiler discovers that int has no nested value_type type alias (and, in the constructor, that column of type int has no resize() method). As with function parameters, you can specify default values for template parameters. For example, you might want to say that the default container for Grid is a vector. The template class is then defined as follows:

template <typename T,typename Container = std::vector<std::optional<T>>>
class Grid
{
    //Everything else is the same as before
};

The type T from the first template parameter can be used in the default value of the second template parameter, here as the argument to the optional template. C++ syntax requires that the default value not be repeated in the template header line of method definitions. With this default argument in place, clients can either specify the underlying container or omit it when instantiating a Grid:

Grid<int, deque<optional<int>>> myDequeGrid;
Grid<int, vector<optional<int>>> myVectorGrid;
Grid<int> myVectorGrid2(myVectorGrid);

Introduction to template template parameters

There is another problem with the Container parameter discussed in this section. When instantiating a class template, write the code as follows:

Grid<int,vector<optional<int>>> myIntGrid;

Note the repetition of the int type: it must be specified as the element type both for Grid and for the optional in the vector. What happens if you instead write the following?

Grid<int, vector<optional<SpreadsheetCell>>> myIntGrid;

This does not work well: the declared element type int no longer matches what the container actually stores. It would be nice if you could write the following code instead, so that such mismatches could not occur:

Grid<int, vector> myIntGrid;

The Grid class should then be able to figure out that it wants a vector of optional<int>. However, the compiler will not allow vector to be passed as an ordinary type parameter, because vector by itself is not a type but a template. If you want a template to accept another template as a parameter, you must use a special kind of parameter called a template template parameter. Specifying a template template parameter is a bit like specifying a function-pointer parameter in an ordinary function: a function-pointer type includes the return type and parameter types of the function, and similarly, the full specification of a template template parameter includes the parameters of that template.

For example, containers such as vector and deque have a template parameter list like the one shown below. The E parameter is the element type; the Allocator parameter is covered in Chapter 17.

template<typename E,typename Allocator = std::allocator<E>>
class vector
{
	//vector definition
};

To accept such a container as a template template parameter, you essentially copy the declaration of the class template (here, template <typename E, typename Allocator = std::allocator<E>> class vector), replace the class name (vector) with a parameter name (Container), and use that as a template parameter of another template declaration (here, Grid), instead of a simple type name. Given that specification, here is the class template definition of a Grid class that takes a container template as its second template parameter:

template <typename T,
          template <typename E, typename Allocator = std::allocator<E>> class Container = std::vector>
class Grid
{
public:
    // Omitted code that is the same as before
    std::optional<T>& at(size_t x, size_t y);
    const std::optional<T>& at(size_t x, size_t y) const;
    // Omitted code that is the same as before
private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::vector<Container<std::optional<T>>> mCells;
    size_t mWidth = 0, mHeight = 0;
};

What is going on here? The first template parameter is the same as before: the element type T. The second template parameter is now itself a template for a container, such as vector or deque. As shown earlier, such a container template must take two parameters: an element type E and an allocator type. Note the word class following the nested template parameter list. The name of this parameter in the Grid template is Container, and its default value is now std::vector rather than std::vector<std::optional<T>>, because Container is a template rather than an actual class type.

The general syntax for a template template parameter is:

template <..., template <TemplateTypeParams> class ParameterName, ...>

Starting with C++17, you can also use the typename keyword instead of class, as shown below:

template <..., template <TemplateTypeParams> typename ParameterName, ...>

Instead of using Container by itself in the code, you must specify Container<std::optional<T>> as the container type. For example, the declaration of mCells is now:

std::vector<Container<std::optional<T>>> mCells;

You do not need to change the method definitions themselves, but you must change their template header lines, for example:

template <typename T,
          template <typename E, typename Allocator = std::allocator<E>> class Container>
void Grid<T, Container>::verifyCoordinate(size_t x, size_t y) const
{
    if (x >= mWidth || y >= mHeight) 
    {
	    throw std::out_of_range("");
    }
}

You can use the Grid template as follows:

Grid<int, vector> myGrid;
myGrid.at(1, 2) = 3;
cout << myGrid.at(1, 2).value_or(0) << endl;
Grid<int, vector> myGrid2(myGrid);

code

#include <cstddef>
#include <stdexcept>
#include <vector>
#include <optional>
#include <utility>
#include <deque>
#include <iostream>
using namespace std;

template <typename T,
          template <typename E, typename Allocator = std::allocator<E>> class Container = std::vector>
class Grid
{
public:
    explicit Grid(size_t width = kDefaultWidth, size_t height = kDefaultHeight);
    virtual ~Grid() = default;

    // Explicitly default a copy constructor and assignment operator.
    Grid(const Grid &src) = default;
    Grid<T, Container> &operator=(const Grid &rhs) = default;

    // Explicitly default a move constructor and assignment operator.
    Grid(Grid &&src) = default;
    Grid<T, Container> &operator=(Grid &&rhs) = default;

    std::optional<T> &at(size_t x, size_t y);
    const std::optional<T> &at(size_t x, size_t y) const;

    size_t getHeight() const { return mHeight; }
    size_t getWidth() const { return mWidth; }

    static const size_t kDefaultWidth = 10;
    static const size_t kDefaultHeight = 10;

private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::vector<Container<std::optional<T>>> mCells;
    size_t mWidth = 0, mHeight = 0;
};

template <typename T, template <typename E, typename Allocator = std::allocator<E>> class Container>
Grid<T, Container>::Grid(size_t width, size_t height)
    : mWidth(width)
    , mHeight(height)
{
    mCells.resize(mWidth);
    for (auto &column : mCells)
    {
        column.resize(mHeight);
    }
}

template <typename T, template <typename E, typename Allocator = std::allocator<E>> class Container>
void Grid<T, Container>::verifyCoordinate(size_t x, size_t y) const
{
    if (x >= mWidth || y >= mHeight)
    {
        throw std::out_of_range("");
    }
}

template <typename T, template <typename E, typename Allocator = std::allocator<E>> class Container>
const std::optional<T> &Grid<T, Container>::at(size_t x, size_t y) const
{
    verifyCoordinate(x, y);
    return mCells[x][y];
}

template <typename T, template <typename E, typename Allocator = std::allocator<E>> class Container>
std::optional<T> &Grid<T, Container>::at(size_t x, size_t y)
{
    return const_cast<std::optional<T>&>(std::as_const(*this).at(x, y));
}

int main()
{
    Grid<int, vector> myGrid;
    myGrid.at(1, 2) = 3;
    cout << myGrid.at(1, 2).value_or(0) << endl;
    Grid<int, vector> myGrid2(myGrid);
    return 0;
}

output

xz@xiaqiu:~/study/test/test$ ./test
3
xz@xiaqiu:~/study/test/test$

note

The C++ syntax above is confusing because it aims for maximum flexibility. Try not to get bogged down in the syntax here; just remember the main concept: you can pass templates as parameters to other templates.

Learn more about non-type template parameters

Sometimes you may want users to specify a default element value used to initialize each cell in the grid. Here is a seemingly reasonable way to achieve this, using T() as the default value of the second template parameter:

template<typename T,const T DEFAULT = T()>
class Grid
{
public:
    explicit Grid(size_t width = kDefaultWidth, size_t height = kDefaultHeight);
    virtual ~Grid() = default;

    // Explicitly default a copy constructor and assignment operator.
    Grid(const Grid& src) = default;
    Grid<T, DEFAULT>& operator=(const Grid& rhs) = default;

    // Explicitly default a move constructor and assignment operator.
    Grid(Grid&& src) = default;
    Grid<T, DEFAULT>& operator=(Grid&& rhs) = default;

    std::optional<T>& at(size_t x, size_t y);
    const std::optional<T>& at(size_t x, size_t y) const;

    size_t getHeight() const { return mHeight; }
    size_t getWidth() const { return mWidth; }

    static const size_t kDefaultWidth = 10;
    static const size_t kDefaultHeight = 10;

private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::vector<std::vector<std::optional<T>>> mCells;
    size_t mWidth = 0, mHeight = 0;
};

This definition is legal. The type T from the first parameter can be used as the type of the second parameter, and a non-type parameter can be const, just like a function parameter. The initial value of type T can be used to initialize each cell in the grid:

template <typename T, const T DEFAULT>
Grid<T, DEFAULT>::Grid(size_t width, size_t height)
    : mWidth(width), mHeight(height)
{
    mCells.resize(mWidth);
    for (auto& column : mCells)
    {
        column.resize(mHeight);
        for (auto& element : column)
        {
            element = DEFAULT;
        }
    }
}

The other method definitions remain unchanged, except that the second template parameter must be added to each template header line, and all full Grid references must become Grid<T, DEFAULT>. After these modifications, you can instantiate an int Grid with an initial value for all elements:

Grid<int> myIntGrid; // Initial value is 0
Grid<int, 10> myIntGrid2; // Initial value is 10

The initial value can be any integer. However, suppose you try to create a SpreadsheetCell grid:

SpreadsheetCell defaultCell;
Grid<SpreadsheetCell, defaultCell> mySpreadsheet; // WILL NOT COMPILE

This causes a compilation error, because objects cannot be passed as arguments for non-type parameters.

int main()
{
    Grid<int> myIntGrid; // Initial value is 0
    Grid<int, 10> myIntGrid2; // Initial value is 10
    Grid<int, 10.0> myIntGrid2; // Initial value is 10
    // SpreadsheetCell defaultCell;
    // Grid<SpreadsheetCell, defaultCell> mySpreadsheet; // WILL NOT COMPILE
    return 0;
}

output

xz@xiaqiu:~/study/test/test$ g++ -o test test.cpp SpreadsheetCell.cpp -std=c++17
test.cpp: In function 'int main()':
test.cpp:60:18: error: conversion from 'double' to 'int' in a converted constant expression
   60 |     Grid<int,10.0>myIntGrid2; //Initial value is 10
      |                  ^
test.cpp:60:18: error: could not convert '1.0e+1' from 'double' to 'int'
test.cpp:60:19: error: conflicting declaration 'int myIntGrid2'
   60 |     Grid<int,10.0>myIntGrid2; //Initial value is 10
      |                   ^~~~~~~~~~
test.cpp:59:17: note: previous declaration as 'Grid<int, 10> myIntGrid2'
   59 |     Grid<int,10>myIntGrid2; //Initial value is 10
      |                 ^~~~~~~~~~
xz@xiaqiu:~/study/test/test$

warning

Non-type parameters cannot be objects, or even double or float values. Non-type parameters are limited to integral types, enumerations, pointers, and references. This example shows an odd property of class templates: a template can work correctly for one type yet fail to compile for another. A more flexible way to allow the user to specify an initial element value for the grid is to use a reference to T as the non-type template parameter. Here is the new class definition:

template <typename T, const T& DEFAULT>
class Grid
{
public:
    explicit Grid(size_t width = kDefaultWidth, size_t height = kDefaultHeight);
    virtual ~Grid() = default;

    // Explicitly default a copy constructor and assignment operator.
    Grid(const Grid& src) = default;
    Grid<T, DEFAULT>& operator=(const Grid& rhs) = default;

    // Explicitly default a move constructor and assignment operator.
    Grid(Grid&& src) = default;
    Grid<T, DEFAULT>& operator=(Grid&& rhs) = default;

    std::optional<T>& at(size_t x, size_t y);
    const std::optional<T>& at(size_t x, size_t y) const;

    size_t getHeight() const { return mHeight; }
    size_t getWidth() const { return mWidth; }

    static const size_t kDefaultWidth = 10;
    static const size_t kDefaultHeight = 10;

private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::vector<std::vector<std::optional<T>>> mCells;
    size_t mWidth = 0, mHeight = 0;
};

template <typename T, const T& DEFAULT>
Grid<T, DEFAULT>::Grid(size_t width, size_t height)
    : mWidth(width)
    , mHeight(height)
{
    mCells.resize(mWidth);
    for (auto& column : mCells)
    {
        column.resize(mHeight);
        for (auto& element : column)
        {
            element = DEFAULT;
        }
    }
}

template <typename T, const T& DEFAULT>
void Grid<T, DEFAULT>::verifyCoordinate(size_t x, size_t y) const
{
    if (x >= mWidth || y >= mHeight)
    {
        throw std::out_of_range("");
    }
}

template <typename T, const T& DEFAULT>
const std::optional<T>& Grid<T, DEFAULT>::at(size_t x, size_t y) const
{
    verifyCoordinate(x, y);
    return mCells[x][y];
}

template <typename T, const T& DEFAULT>
std::optional<T>& Grid<T, DEFAULT>::at(size_t x, size_t y)
{
    return const_cast<std::optional<T>&>(std::as_const(*this).at(x, y));
}

This template class can now be instantiated with any type. The C++17 standard requires the reference passed as the second template argument to be a converted constant expression of the template parameter type; it may not refer to a subobject, a temporary object, a string literal, the result of a typeid expression, or the predefined __func__ variable. The following example declares an int grid and a SpreadsheetCell grid with initial values.

int main()
{
    int defaultInt = 1;
    Grid<int, defaultInt> myIntGrid;

    SpreadsheetCell defaultCell(1.2);
    Grid<SpreadsheetCell, defaultCell> mySpreadsheet;
    return 0;
}

output

xz@xiaqiu:~/study/test/test$ g++ -o test test.cpp SpreadsheetCell.cpp -std=c++17
test.cpp: In function 'int main()':
test.cpp:81:24: error: '& defaultInt' is not a valid template argument of type 'const int&' because 'defaultInt' is not a variable
   81 |     Grid<int,defaultInt>myIntGrid;
      |                        ^
test.cpp:84:37: error: '& defaultCell' is not a valid template argument of type 'const SpreadsheetCell&' because 'defaultCell' is not a variable
   84 |     Grid<SpreadsheetCell,defaultCell> mySpreadsheet;
      |                                     ^
xz@xiaqiu:~/study/test/test$

If the variables are declared static, the code compiles:

int main()
{
    static int defaultInt = 1;
    Grid<int, defaultInt> myIntGrid;

    static SpreadsheetCell defaultCell(1.2);
    Grid<SpreadsheetCell, defaultCell> mySpreadsheet;
    return 0;
}

But these are the C++17 rules, and many compilers have not yet fully implemented them. Before C++17, an argument for a reference non-type template parameter could not be a temporary; it had to be a named lvalue with external or internal linkage. Therefore, the example below follows the pre-C++17 rules and defines the initial values with internal linkage:

namespace
{
    int defaultInt = 11;
    SpreadsheetCell defaultCell(1.2);
}

int main()
{
    Grid<int, defaultInt> myIntGrid;
    Grid<SpreadsheetCell, defaultCell> mySpreadsheet;
    return 0;
}

Partial specialization of class templates

In Chapter 12, the const char* specialization of the Grid class is called a full class template specialization, because it specializes every template parameter of the Grid template: no template parameters remain. That is not the only way to specialize a class. You can also write a partial specialization, which specializes some template parameters while leaving others unspecialized. For example, here is the basic version of the Grid template with non-type parameters for width and height:

Grid.h

template <typename T, size_t WIDTH, size_t HEIGHT>
class Grid
{
public:
    Grid() = default;
    virtual ~Grid() = default;

    // Explicitly default a copy constructor and assignment operator.
    Grid(const Grid& src) = default;
    Grid& operator=(const Grid& rhs) = default;

    std::optional<T>& at(size_t x, size_t y);
    const std::optional<T>& at(size_t x, size_t y) const;

    size_t getHeight() const { return HEIGHT; }
    size_t getWidth() const { return WIDTH; }

private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::optional<T> mCells[WIDTH][HEIGHT];
};

template <typename T, size_t WIDTH, size_t HEIGHT>
void Grid<T, WIDTH, HEIGHT>::verifyCoordinate(size_t x, size_t y) const
{
    if (x >= WIDTH || y >= HEIGHT)
    {
        throw std::out_of_range("");
    }
}

template <typename T, size_t WIDTH, size_t HEIGHT>
const std::optional<T>& Grid<T, WIDTH, HEIGHT>::at(size_t x, size_t y) const
{
    verifyCoordinate(x, y);
    return mCells[x][y];
}

template <typename T, size_t WIDTH, size_t HEIGHT>
std::optional<T>& Grid<T, WIDTH, HEIGHT>::at(size_t x, size_t y)
{
    return const_cast<std::optional<T>&>(std::as_const(*this).at(x, y));
}

You can specialize this template class for const char* C-style strings:

#include "Grid.h" // The file containing the Grid template definition

template <size_t WIDTH, size_t HEIGHT>
class Grid<const char*, WIDTH, HEIGHT>
{
public:
    Grid() = default;
    virtual ~Grid() = default;

    // Explicitly default a copy constructor and assignment operator.
    Grid(const Grid& src) = default;
    Grid& operator=(const Grid& rhs) = default;

    std::optional<std::string>& at(size_t x, size_t y);
    const std::optional<std::string>& at(size_t x, size_t y) const;

    size_t getHeight() const { return HEIGHT; }
    size_t getWidth() const { return WIDTH; }

private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::optional<std::string> mCells[WIDTH][HEIGHT];
};

template <size_t WIDTH, size_t HEIGHT>
void Grid<const char*, WIDTH, HEIGHT>::verifyCoordinate(size_t x, size_t y) const
{
    if (x >= WIDTH || y >= HEIGHT)
    {
        throw std::out_of_range("");
    }
}

template <size_t WIDTH, size_t HEIGHT>
const std::optional<std::string>&
    Grid<const char*, WIDTH, HEIGHT>::at(size_t x, size_t y) const
{
    verifyCoordinate(x, y);
    return mCells[x][y];
}

template <size_t WIDTH, size_t HEIGHT>
std::optional<std::string>&
    Grid<const char*, WIDTH, HEIGHT>::at(size_t x, size_t y)
{
    return const_cast<std::optional<std::string>&>(std::as_const(*this).at(x, y));
}

In this example, not all template parameters are specialized. Therefore, the template header line looks like this:

template <size_t WIDTH, size_t HEIGHT>
class Grid<const char*, WIDTH, HEIGHT>

Note that this template has only two parameters, WIDTH and HEIGHT, yet the Grid class it specializes takes three: T, WIDTH, and HEIGHT. The template parameter list therefore contains two parameters, while the explicit Grid<const char*, WIDTH, HEIGHT> contains three arguments. You must still specify three arguments when instantiating the template; you cannot instantiate it with only a height and a width:

Grid<int, 2, 2> myIntGrid; // Uses the original Grid
Grid<const char*, 2, 2> myStringGrid; // Uses the partial specialization
Grid<2, 3> test; // DOES NOT COMPILE! No type specified.

output

xz@xiaqiu:~/study/test/test$ g++ -o test test.cpp SpreadsheetCell.cpp -std=c++17
test.cpp: In function 'int main()':
test.cpp:56:14: error: wrong number of template arguments (2, should be 3)
   56 |     Grid<2, 3> test; // DOES NOT COMPILE! No type specified.
      |              ^
In file included from test.cpp:1:
Grid.h:10:7: note: provided for 'template<class T, long unsigned int WIDTH, long unsigned int HEIGHT> class Grid'
   10 | class Grid
      |       ^~~~
xz@xiaqiu:~/study/test/test$

The syntax above is admittedly messy. Worse, in a partial specialization, unlike in a full specialization, every method definition must be preceded by the template header line, as shown below:

template <size_t WIDTH, size_t HEIGHT>
const std::optional<std::string>&
Grid<const char*, WIDTH, HEIGHT>::at(size_t x, size_t y) const
{
    verifyCoordinate(x, y);
    return mCells[x][y];
}

This template line with two parameters is required to indicate that the method is parameterized on those two parameters. Note that Grid<const char*, WIDTH, HEIGHT> must be used wherever the complete class name is needed.

The previous example does not show the real power of partial specialization: you can write a specialized implementation for a whole subset of possible types without specializing each individual type. For example, you can write a specialized Grid class for all pointer types. The copy constructor and assignment operator of this specialization perform deep copies of the objects the pointers point to, rather than storing shallow copies of the pointers in the Grid. Here is the class definition, this time assuming the original Grid (the version with a single template parameter) is being specialized. In this implementation, the Grid takes ownership of the pointers it is given, so it automatically frees the memory when needed:

GridPtr.h

#pragma once

#include "Grid.h"
#include <memory>

template <typename T>
class Grid<T *>
{
public:
    explicit Grid(size_t width = kDefaultWidth, 
                  size_t height = kDefaultHeight);
    virtual ~Grid() = default;

    // Copy constructor and copy assignment operator.
    Grid(const Grid &src);
    Grid<T *> &operator=(const Grid &rhs);

    // Explicitly default a move constructor and assignment operator.
    Grid(Grid &&src) = default;
    Grid<T *> &operator=(Grid &&rhs) = default;

    void swap(Grid &other) noexcept;

    std::unique_ptr<T> &at(size_t x, size_t y);
    const std::unique_ptr<T> &at(size_t x, size_t y) const;

    size_t getHeight() const { return mHeight; }
    size_t getWidth() const { return mWidth; }

    static const size_t kDefaultWidth = 10;
    static const size_t kDefaultHeight = 10;

private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::vector<std::vector<std::unique_ptr<T>>> mCells;
    size_t mWidth = 0, mHeight = 0;
};

template <typename T>
Grid<T *>::Grid(size_t width, size_t height)
    : mWidth(width)
    , mHeight(height)
{
    mCells.resize(mWidth);
    for (auto &column : mCells)
    {
        column.resize(mHeight);
    }
}

template <typename T>
void Grid<T *>::swap(Grid &other) noexcept
{
    using std::swap;

    swap(mWidth, other.mWidth);
    swap(mHeight, other.mHeight);
    swap(mCells, other.mCells);
}

template <typename T>
Grid<T *>::Grid(const Grid &src)
    : Grid(src.mWidth, src.mHeight)
{
    // The ctor-initializer of this constructor delegates first to the
    // non-copy constructor to allocate the proper amount of memory.

    // The next step is to copy the data.
    for (size_t i = 0; i < mWidth; i++)
    {
        for (size_t j = 0; j < mHeight; j++)
        {
            // Make a deep copy of the element by using its copy constructor.
            if (src.mCells[i][j])
            {
                mCells[i][j].reset(new T(*(src.mCells[i][j])));
            }
        }
    }
}

template <typename T>
Grid<T *> &Grid<T *>::operator=(const Grid &rhs)
{
    // Check for self-assignment.
    if (this == &rhs)
    {
        return *this;
    }

    // Use copy-and-swap idiom.
    auto copy = rhs;    // Do all the work in a temporary instance
    swap(copy);         // Commit the work with only non-throwing operations
    return *this;
}

template <typename T>
void Grid<T *>::verifyCoordinate(size_t x, size_t y) const
{
    if (x >= mWidth || y >= mHeight)
    {
        throw std::out_of_range("");
    }
}

template <typename T>
const std::unique_ptr<T> &Grid<T *>::at(size_t x, size_t y) const
{
    verifyCoordinate(x, y);
    return mCells[x][y];
}

template <typename T>
std::unique_ptr<T> &Grid<T *>::at(size_t x, size_t y)
{
    return const_cast<std::unique_ptr<T>&>(std::as_const(*this).at(x, y));
}

Grid.h

#pragma once

#include <cstddef>
#include <stdexcept>
#include <vector>
#include <optional>
#include <utility>

template <typename T>
class Grid
{
public:
    explicit Grid(size_t width = kDefaultWidth, size_t height = kDefaultHeight);
    virtual ~Grid() = default;

    // Explicitly default a copy constructor and assignment operator.
    Grid(const Grid& src) = default;
    Grid<T>& operator=(const Grid& rhs) = default;

    // Explicitly default a move constructor and assignment operator.
    Grid(Grid&& src) = default;
    Grid<T>& operator=(Grid&& rhs) = default;

    std::optional<T>& at(size_t x, size_t y);
    const std::optional<T>& at(size_t x, size_t y) const;

    size_t getHeight() const { return mHeight; }
    size_t getWidth() const { return mWidth; }

    static const size_t kDefaultWidth = 10;
    static const size_t kDefaultHeight = 10;

private:
    void verifyCoordinate(size_t x, size_t y) const;

    std::vector<std::vector<std::optional<T>>> mCells;
    size_t mWidth = 0, mHeight = 0;
};

template <typename T>
Grid<T>::Grid(size_t width, size_t height)
    : mWidth(width)
    , mHeight(height)
{
    mCells.resize(mWidth);
    for (auto& column : mCells)
    {
        column.resize(mHeight);
    }
}

template <typename T>
void Grid<T>::verifyCoordinate(size_t x, size_t y) const
{
    if (x >= mWidth || y >= mHeight)
    {
        throw std::out_of_range("");
    }
}

template <typename T>
const std::optional<T>& Grid<T>::at(size_t x, size_t y) const
{
    verifyCoordinate(x, y);
    return mCells[x][y];
}

template <typename T>
std::optional<T>& Grid<T>::at(size_t x, size_t y)
{
    return const_cast<std::optional<T>&>(std::as_const(*this).at(x, y));
}

As usual, these two lines of code are the key:

template <typename T>
class Grid<T*>

The above syntax says that this class is a partial specialization of the Grid template for all pointer types; the implementation is used only when the element type is a pointer. Note that if you instantiate the grid as Grid<int*> myIntGrid, T is actually int rather than int*. This is not very intuitive, but unfortunately that is how the syntax works. Here is an example:

Grid<int> myIntGrid;     // Uses the non-specialized grid
Grid<int*> psGrid(2, 2); // Uses the partial specialization for pointer types

psGrid.at(0, 0) = make_unique<int>(1);
psGrid.at(0, 1) = make_unique<int>(2);
psGrid.at(1, 0) = make_unique<int>(3);

Grid<int*> psGrid2(psGrid);
Grid<int*> psGrid3;
psGrid3 = psGrid2;

auto& element = psGrid2.at(1, 0);
if (element) {
    cout << *element << endl;
    *element = 6;
}
cout << *psGrid.at(1, 0) << endl;  // psGrid is not modified
cout << *psGrid2.at(1, 0) << endl; // psGrid2 is modified

Code output

xz@xiaqiu:~/study/test/test$ ./test
3
3
6
xz@xiaqiu:~/study/test/test$

The implementations of the methods are quite simple, except for the copy constructor, which uses each element's copy constructor to make a deep copy:

template <typename T>
Grid<T *>::Grid(const Grid &src)
    : Grid(src.mWidth, src.mHeight)
{
    // The ctor-initializer of this constructor delegates first to the
    // non-copy constructor to allocate the proper amount of memory.
    // The next step is to copy the data.
    for (size_t i = 0; i < mWidth; i++)
    {
        for (size_t j = 0; j < mHeight; j++)
        {
            // Make a deep copy of the element by using its copy constructor.
            if (src.mCells[i][j])
            {
                mCells[i][j].reset(new T(*(src.mCells[i][j])));
            }
        }
    }
}

Emulating partial specialization of functions with overloading

The C++ standard does not allow partial specialization of function templates. Instead, you can overload the function with another template. The difference is subtle. Suppose you want to write a specialized Find() function template (see Chapter 12) that dereferences the pointers and calls operator== on the pointed-to values directly. Following the syntax for partial specialization of class templates, you might write the following code:

template <typename T>
size_t Find<T*>(T* const& value, T* const* arr, size_t size)
{
    for (size_t i = 0; i < size; i++) 
    {
        if (*arr[i] == *value) 
        {
        	return i; // Found it; return the index
    	}
    }
    return NOT_FOUND; // failed to find it; return NOT_FOUND
}

output

xz@xiaqiu:~/study/test/test$ g++ -o test test.cpp  -std=c++17
test.cpp:10:12: error: expected initializer before '<' token
   10 | size_t Find<T*>(T* const& value, T* const* arr, size_t size)
      |            ^
xz@xiaqiu:~/study/test/test$ 

However, this syntax for declaring a partial specialization of a function template is not allowed by the C++ standard. The correct way to implement the required behavior is to write a new overloaded template for Find(). The difference may seem trivial, but without it the code does not compile:

template <typename T>
size_t Find(T* const& value, T* const* arr, size_t size)
{
    for (size_t i = 0; i < size; i++) 
    {
        if (*arr[i] == *value) 
        {
        	return i; // Found it; return the index
        }
    }
    return NOT_FOUND; // failed to find it; return NOT_FOUND
}

The first parameter of this Find() version is T* const&, which is consistent with the original Find() function template (it takes const T& as its first parameter), but it is also feasible to use plain T* (instead of T* const&) as the first parameter of the pointer overload of Find(). You can define the original Find() template, the overloaded version for pointer types, the full specialization for const char*, and the non-template overload for const char* all in one program. The compiler selects the appropriate version to call according to its deduction and overload-resolution rules.

be careful:

Among all overloaded versions, function template specializations, and function template instantiations, the compiler always selects the "most specific" version. If a non-template version matches as well as a function template instantiation, the compiler prefers the non-template version.

Several specialized versions

static const size_t NOT_FOUND = static_cast<size_t>(-1);

template <typename T>
size_t Find(const T &value, const T *arr, size_t size)
{
    cout << "original" << endl;
    for (size_t i = 0; i < size; i++)
    {
        if (arr[i] == value)
        {
            return i; // found it; return the index
        }
    }
    return NOT_FOUND; // failed to find it; return NOT_FOUND
}

template <typename T>
size_t Find(T *const &value, T *const *arr, size_t size)
{
    cout << "ptr special" << endl;
    for (size_t i = 0; i < size; i++)
    {
        if (*arr[i] == *value)
        {
            return i; // found it; return the index
        }
    }
    return NOT_FOUND; // failed to find it; return NOT_FOUND
}

/* // This does not work.
template <typename T>
size_t Find<T*>(T* const& value, T* const* arr, size_t size)
{
    cout << "ptr special" << endl;
    for (size_t i = 0; i < size; i++) {
        if (*arr[i] == *value) {
            return i; // found it; return the index
        }
    }
    return NOT_FOUND; // failed to find it; return NOT_FOUND
}
*/

template <>
size_t Find<const char *>(const char *const &value, const char *const *arr, size_t size)
{
    cout << "Specialization" << endl;
    for (size_t i = 0; i < size; i++)
    {
        if (strcmp(arr[i], value) == 0)
        {
            return i; // found it; return the index
        }
    }
    return NOT_FOUND; // failed to find it; return NOT_FOUND
}

size_t Find(const char *const &value, const char *const *arr, size_t size)
{
    cout << "overload" << endl;
    for (size_t i = 0; i < size; i++)
    {
        if (strcmp(arr[i], value) == 0)
        {
            return i; // found it; return the index
        }
    }
    return NOT_FOUND; // failed to find it; return NOT_FOUND
}

The following code calls Find() several times, and the comments inside explain which version of Find() is called:

int main()
{
    size_t res = NOT_FOUND;

    int myInt = 3, intArray[] = { 1, 2, 3, 4 };
    size_t sizeArray = std::size(intArray);
    res = Find(myInt, intArray, sizeArray);      // calls Find<int> by deduction
    res = Find<int>(myInt, intArray, sizeArray); // calls Find<int> explicitly

    double myDouble = 5.6, doubleArray[] = { 1.2, 3.4, 5.7, 7.5 };
    sizeArray = std::size(doubleArray);
    res = Find(myDouble, doubleArray, sizeArray);         // calls Find<double> by deduction
    res = Find<double>(myDouble, doubleArray, sizeArray); // calls Find<double> explicitly

    const char *word = "two";
    const char *words[] = { "one", "two", "three", "four" };
    sizeArray = std::size(words);
    res = Find<const char *>(word, words, sizeArray); // calls template specialization for const char*s
    res = Find(word, words, sizeArray);               // calls overloaded Find for const char*s

    int *intPointer = &myInt, *pointerArray[] = { &myInt, &myInt };
    sizeArray = std::size(pointerArray);
    res = Find(intPointer, pointerArray, sizeArray);    // calls the overloaded Find for pointers

    SpreadsheetCell cell1(10), cellArray[] = { SpreadsheetCell(4), SpreadsheetCell(10) };
    sizeArray = std::size(cellArray);
    res = Find(cell1, cellArray, sizeArray);                  // calls Find<SpreadsheetCell> by deduction
    res = Find<SpreadsheetCell>(cell1, cellArray, sizeArray); // calls Find<SpreadsheetCell> explicitly

    SpreadsheetCell *cellPointer = &cell1;
    SpreadsheetCell *cellPointerArray[] = { &cell1, &cell1 };
    sizeArray = std::size(cellPointerArray);
    res = Find(cellPointer, cellPointerArray, sizeArray); // Calls the overloaded Find for pointers

    return 0;
}

output

xz@xiaqiu:~/study/test/test$ ./test
original
original
original
original
Specialization
overload
ptr special
original
original
ptr special
xz@xiaqiu:~/study/test/test$

Template recursion

The facilities provided by C++ templates are much more powerful than the simple classes and functions described earlier in this chapter and in Chapter 12. One of these facilities is template recursion. This section first explains the motivation for template recursion and then shows how to implement it. It relies on the operator overloading discussed in Chapter 15; if you skipped that chapter or are unfamiliar with the syntax for overloading operator[], refer to Chapter 15 before continuing.

N-dimensional grids: a first attempt

The Grid template so far supports only two dimensions, which limits its usefulness. What if you want to write a program for three-dimensional tic-tac-toe, or a mathematical program using four-dimensional matrices? You could write a template or non-template class for each dimension, but that would repeat a lot of code. Another approach is to write only a one-dimensional grid. You can then create a grid of any dimension by instantiating the grid with another grid as its element type; that element grid can itself be instantiated with a grid as its element type, and so on. The following is the implementation of the OneDGrid class template. It is just a one-dimensional version of the Grid template from the earlier example, with a resize() method added and at() replaced by operator[]. Like standard library containers such as vector, the operator[] implementation performs no bounds checking. Also, in this example, mElements stores instances of T directly rather than instances of std::optional<T>.

template <typename T>
class OneDGrid
{
public:
    explicit OneDGrid(size_t size = kDefaultSize);
    virtual ~OneDGrid() = default;

    T &operator[](size_t x);
    const T &operator[](size_t x) const;

    void resize(size_t newSize);
    size_t getSize() const { return mElements.size(); }

    static const size_t kDefaultSize = 10;

private:
    std::vector<T> mElements;
};

template <typename T>
OneDGrid<T>::OneDGrid(size_t size)
{
    resize(size);
}

template <typename T>
void OneDGrid<T>::resize(size_t newSize)
{
    mElements.resize(newSize);
}

template <typename T>
T &OneDGrid<T>::operator[](size_t x)
{
    return mElements[x];
}

template <typename T>
const T &OneDGrid<T>::operator[](size_t x) const
{
    return mElements[x];
}

With this implementation of OneDGrid, you can create multi-dimensional grids as follows:

int main()
{
    OneDGrid<int> singleDGrid;
    OneDGrid<OneDGrid<int>> twoDGrid;
    OneDGrid<OneDGrid<OneDGrid<int>>> threeDGrid;

    singleDGrid[3] = 5;
    twoDGrid[3][3] = 5;
    threeDGrid[3][3][3] = 5;

    return 0;
}

This code works, but the declaration code looks a little messy. It is improved below.

Real N-dimensional grid

Templates can be used to write "real" N-dimensional grids recursively, because the dimensionality of a grid is recursive in nature. This can be seen in the earlier declaration: OneDGrid<OneDGrid<OneDGrid<int>>> threeDGrid;

You can think of each nesting level of OneDGrid as a recursive step, with the OneDGrid of int as the base case of the recursion. In other words, a three-dimensional grid is a one-dimensional grid of one-dimensional grids of one-dimensional grids of ints. Users do not need to write this recursion themselves; you can write a class template that recurses automatically, so that they can create N-dimensional grids like this:

NDGrid<int, 1> singleDGrid;
NDGrid<int, 2> twoDGrid;
NDGrid<int, 3> threeDGrid;

The NDGrid class template takes an element type and an integer specifying the dimensionality as parameters. The key insight is that the element type of NDGrid is not the element type specified in the template parameter list, but another NDGrid of one dimension lower. In other words, a three-dimensional grid is a vector of two-dimensional grids, and each two-dimensional grid is a vector of one-dimensional grids. With recursion, you must handle the base case. You can write a partial specialization of NDGrid for dimension 1, in which the element type is not another NDGrid but the element type specified by the template parameter. The following is the general NDGrid template definition, highlighting the differences from the earlier OneDGrid:

template <typename T, size_t N>
class NDGrid
{
public:
    explicit NDGrid(size_t size = kDefaultSize);
    virtual ~NDGrid() = default;

    NDGrid<T, N - 1>& operator[](size_t x);
    const NDGrid<T, N - 1>& operator[](size_t x) const;

    void resize(size_t newSize);
    size_t getSize() const { return mElements.size(); }

    static const size_t kDefaultSize = 10;

private:
    std::vector<NDGrid<T, N - 1>> mElements;
};

Note that mElements is a vector of NDGrid<T, N - 1>: this is the recursive step. In addition, operator[] returns a reference to the element type, which is NDGrid<T, N - 1> rather than T. The template definition of the base case is a partial specialization for dimension 1:

template <typename T>
class NDGrid<T, 1>
{
public:
    explicit NDGrid(size_t size = kDefaultSize);
    virtual ~NDGrid() = default;

    T &operator[](size_t x);
    const T &operator[](size_t x) const;

    void resize(size_t newSize);
    size_t getSize() const { return mElements.size(); }

    static const size_t kDefaultSize = 10;

private:
    std::vector<T> mElements;
};

The recursion ends here: the element type is T, not another template instance. The most difficult part of implementing the template recursion is not the recursion itself, but sizing each dimension of the grid correctly. This implementation creates an N-dimensional grid in which every dimension has the same size; specifying a different size for each dimension is much harder. Even with this simplification, there is still a problem: users should be able to create a grid with a specified size, such as 20 or 50, so the constructor receives an integer size parameter. However, when you resize the vector of subgrids, you cannot pass this size parameter on to the subgrid elements, because vector creates the objects with their default constructor. Therefore, you must explicitly call resize() on each grid element of the vector. The base case does not need to resize its elements, because its elements are of type T, not grids.

The following is the implementation of NDGrid master template, which highlights the differences between NDGrid and OneDGrid:

template <typename T, size_t N>
NDGrid<T, N>::NDGrid(size_t size)
{
    resize(size);
}

template <typename T, size_t N>
void NDGrid<T, N>::resize(size_t newSize)
{
    mElements.resize(newSize);

    // Resizing the vector calls the 0-argument constructor for
    // the NDGrid<T, N-1> elements, which constructs
    // them with the default size. Thus, we must explicitly call
    // resize() on each of the elements to recursively resize all
    // nested Grid elements.
    for (auto &element : mElements)
    {
        element.resize(newSize);
    }
}

template <typename T, size_t N>
NDGrid<T, N - 1>& NDGrid<T, N>::operator[](size_t x)
{
    return mElements[x];
}

template <typename T, size_t N>
const NDGrid<T, N - 1>& NDGrid<T, N>::operator[](size_t x) const
{
    return mElements[x];
}

The following is the implementation of the partial specialization (the base case). Note that much of the code must be rewritten, because a specialization does not inherit any implementation from the primary template. The differences from the non-specialized NDGrid are highlighted here:

template <typename T>
NDGrid<T, 1>::NDGrid(size_t size)
{
    resize(size);
}

template <typename T>
void NDGrid<T, 1>::resize(size_t newSize)
{
    mElements.resize(newSize);
}

template <typename T>
T &NDGrid<T, 1>::operator[](size_t x)
{
    return mElements[x];
}

template <typename T>
const T &NDGrid<T, 1>::operator[](size_t x) const
{
    return mElements[x];
}

Now, you can write the following code:

int main()
{
    NDGrid<int, 3> my3DGrid;
    my3DGrid[2][1][2] = 5;
    my3DGrid[1][1][1] = 5;
    cout << my3DGrid[2][1][2] << endl;
    return 0;
}

output

xz@xiaqiu:~/study/test/test$ ./test
5
xz@xiaqiu:~/study/test/test$

Variadic templates

Ordinary templates can take only a fixed number of template parameters. Variadic templates can take a variable number of template parameters. For example, the following code defines a template that can take any number of template parameters, using a parameter pack called Types:

template<typename... Types>
class MyVariadicTemplate { };

be careful:

The three dots after typename are not an error. This is the syntax for defining a parameter pack for a variadic template. A parameter pack can accept a variable number of arguments. Whitespace is allowed before and after the three dots.

You can instantiate MyVariadicTemplate with any number of types, for example:

MyVariadicTemplate<int> instance1;
MyVariadicTemplate<string, double, list<int>> instance2;

You can even instantiate MyVariadicTemplate with zero template parameters:

MyVariadicTemplate<> instance3;

To prevent a variadic template from being instantiated with zero template parameters, you can write the template as follows:

template<typename T1, typename... Types>
class MyVariadicTemplate { };

With this definition, attempting to instantiate MyVariadicTemplate with zero template parameters causes a compilation error. For example, Microsoft Visual C++ gives the following error:

error C2976: 'MyVariadicTemplate' : too few template arguments

The arguments passed to a variadic template cannot be iterated over directly. One way to process them is with the help of template recursion (another is the C++17 fold expressions discussed later in this chapter). The following two examples illustrate how to use variadic templates.

Type-safe variable-length argument lists

Variadic templates allow you to create type-safe variable-length argument lists. The following example defines a variadic template processValues(), which accepts a variable number of arguments of different types in a type-safe manner. processValues() processes each value in the variable-length argument list by calling the handleValue() function on it. This means a handleValue() function must be written for each type to be handled, such as int, double, and string in this example:

void handleValue(int value) { cout << "Integer: " << value << endl; }
void handleValue(double value) { cout << "Double: " << value << endl; }
void handleValue(string_view value) { cout << "String: " << value << endl; }

void processValues() { /* Nothing to do in this base case. */ }

template<typename T1, typename... Tn>
void processValues(T1 arg1, Tn... args)
{
    handleValue(arg1);
    processValues(args...);
}

The previous example uses the ellipsis operator "..." in three places, with two different meanings. First, it is used after typename in the template parameter list and after the type Tn in the function parameter list. In both cases, it denotes a parameter pack, which can accept a variable number of arguments. The second use of "..." is after the parameter name args in the function body. In this case, it denotes a parameter pack expansion: the operator unpacks/expands the pack into its individual arguments. It essentially takes what is on its left and repeats it for each template parameter in the pack, separated by commas. Take the following line from the previous example:

processValues(args...);

This line unpacks (expands) the args parameter pack into its individual arguments, separated by commas, and then calls processValues() with those expanded arguments. The template always requires at least one template parameter, T1. The effect of recursively calling processValues() with args... is that each call has one fewer argument. Because the implementation of processValues() is recursive, you need a way to stop the recursion. To do this, implement a processValues() function that takes zero arguments. The processValues() variadic template can be tested with the following code:

processValues(1, 2, 3.56, "test", 1.1f);

The recursive call generated by this example is:

processValues(1, 2, 3.56, "test", 1.1f);
  handleValue(1);
  processValues(2, 3.56, "test", 1.1f);
    handleValue(2);
    processValues(3.56, "test", 1.1f);
      handleValue(3.56);
      processValues("test", 1.1f);
        handleValue("test");
        processValues(1.1f);
          handleValue(1.1f);
          processValues();

It is important to remember that this variable-length argument list is completely type safe. The processValues() function automatically calls the correct overload of handleValue() based on the actual type of each argument. As usual, C++ also performs automatic type conversions. For example, 1.1f in the previous example has type float; processValues() calls handleValue(double value) because the conversion from float to double is lossless. However, if processValues() is called with an argument of a type for which there is no matching handleValue() overload, the compiler generates an error. There is one small problem with this implementation: because it is recursive, the arguments are copied on every recursive call to processValues(), which can be costly depending on their types. You might think the copying could be avoided by passing the arguments by reference instead of by value. Unfortunately, processValues() could then not be called with literal values, because a non-const reference cannot bind to a literal. To use non-const references while still accepting literal values, you can use forwarding references (T&&) together with std::forward() to perfectly forward all arguments: if an rvalue is passed to processValues(), it is forwarded as an rvalue reference; if an lvalue or lvalue reference is passed, it is forwarded as an lvalue reference. The following implementation does this:

void processValues() { /* Nothing to do in this base case. */ }

template<typename T1, typename... Tn>
void processValues(T1&& arg1, Tn&&... args)
{
    handleValue(std::forward<T1>(arg1));
    processValues(std::forward<Tn>(args)...);
}

One line of code needs further explanation:

processValues(std::forward<Tn>(args)...);

The "..." operator here unpacks the parameter pack by applying std::forward() to each argument in the pack, separating the results with commas. For example, suppose args is a parameter pack with three arguments (a1, a2, and a3) of three corresponding types (A1, A2, and A3). The expanded call looks like this:

processValues(std::forward<A1>(a1),
              std::forward<A2>(a2),
              std::forward<A3>(a3));

In the body of a function that uses a parameter pack, the number of arguments in the pack can be obtained as follows:

int numOfArgs = sizeof...(args);

A practical example of using variadic templates is writing a secure, type-safe function template similar to printf(). This is a good exercise for practicing variadic templates.

#include <iostream>
#include <string>
#include <string_view>

using namespace std;

void handleValue(int value)
{
    cout << "Integer: " << value << endl;
}

void handleValue(double value)
{
    cout << "Double: " << value << endl;
}

void handleValue(string_view value)
{
    cout << "String: " << value << endl;
}

// First version using pass-by-value
void processValues()    // Base case
{
    // Nothing to do in this base case.
}

template<typename T1, typename... Tn>
void processValues(T1 arg1, Tn... args)
{
    handleValue(arg1);
    processValues(args...);
}

// Second version using pass-by-rvalue-reference
void processValuesRValueRefs()  // Base case
{
    // Nothing to do in this base case.
}

template<typename T1, typename... Tn>
void processValuesRValueRefs(T1 &&arg1, Tn &&... args)
{
    handleValue(std::forward<T1>(arg1));
    processValuesRValueRefs(std::forward<Tn>(args)...);
}

int main()
{
    processValues(1, 2, 3.56, "test", 1.1f);
    cout << endl;
    processValuesRValueRefs(1, 2, 3.56, "test", 1.1f);
    return 0;
}

output

xz@xiaqiu:~/study/test/test$ ./test
Integer: 1
Integer: 2
Double: 3.56
String: test
Double: 1.1

Integer: 1
Integer: 2
Double: 3.56
String: test
Double: 1.1
xz@xiaqiu:~/study/test/test$

Variadic mixin classes

Parameter packs can be used almost anywhere. For example, the following code uses a parameter pack to define a variadic mixin class MyClass. Chapter 5 discusses the concept of mixin classes.

class Mixin1
{
public:
    Mixin1(int i) : mValue(i) {}
    virtual void Mixin1Func() { cout << "Mixin1: " << mValue << endl; }
private:
	int mValue;
};
class Mixin2
{
public:
    Mixin2(int i) : mValue(i) {}
    virtual void Mixin2Func() { cout << "Mixin2: " << mValue << endl; }
private:
	int mValue;
};
template<typename... Mixins>
class MyClass : public Mixins...
{
public:
    MyClass(const Mixins&... mixins) : Mixins(mixins)... {}
    virtual ~MyClass() = default;
};

The above code first defines two mixin classes, Mixin1 and Mixin2. Their definitions in this example are very simple: their constructors accept an integer and store it, and each class has a function that prints information about a specific instance. The MyClass variadic template uses the parameter pack typename... Mixins to accept a variable number of mixin classes. MyClass then inherits from all of those mixin classes, and its constructor accepts the same number of arguments to initialize each inherited mixin. Recall that the "..." expansion operator essentially takes what is on its left and repeats it for each template parameter in the pack, separated by commas. MyClass can be used as follows:

MyClass<Mixin1, Mixin2> a(Mixin1(11), Mixin2(22));
a.Mixin1Func();
a.Mixin2Func();

MyClass<Mixin1> b(Mixin1(33));
b.Mixin1Func();
//b.Mixin2Func(); // Error: does not compile.

MyClass<> c;
//c.Mixin1Func(); // Error: does not compile.
//c.Mixin2Func(); // Error: does not compile.

Attempting to call Mixin2Func() on b results in a compilation error, because b does not inherit from the Mixin2 class:

output

xz@xiaqiu:~/study/test/test$ g++ -o test test.cpp -std=c++17
test.cpp: In function 'int main()':
test.cpp:35:7: error: 'class MyClass<Mixin1>' has no member named 'Mixin2Func'; did you mean 'Mixin1Func'?
   35 |     b.Mixin2Func(); // Error: does not compile.
      |       ^~~~~~~~~~
      |       Mixin1Func
xz@xiaqiu:~/study/test/test$

output

xz@xiaqiu:~/study/test/test$ ./test
Mixin1: 11
Mixin2: 22
Mixin1: 33
xz@xiaqiu:~/study/test/test$

Fold expressions

C++17 adds fold expressions, which make it easier to process parameter packs in variadic templates. Table 22-1 lists the four supported fold types. In this table, ⊕ can be any of the following operators: +, -, *, /, %, ^, &, |, <<, >>, +=, -=, *=, /=, %=, ^=, &=, |=, <<=, >>=, =, ==, !=, <, >, <=, >=, &&, ||, the comma operator, .*, and ->*.

Table 22-1

Fold name           Expression             Expansion
Unary right fold    (pack ⊕ ...)           p1 ⊕ (... ⊕ (pN-1 ⊕ pN))
Unary left fold     (... ⊕ pack)           ((p1 ⊕ p2) ⊕ ...) ⊕ pN
Binary right fold   (pack ⊕ ... ⊕ init)    p1 ⊕ (... ⊕ (pN ⊕ init))
Binary left fold    (init ⊕ ... ⊕ pack)    ((init ⊕ p1) ⊕ ...) ⊕ pN

Some examples are analyzed below. The previous processValues() function template is defined recursively, as follows:

void processValues() { /* Nothing to do in this base case. */ }

template<typename T1, typename... Tn>
void processValues(T1 arg1, Tn... args)
{
    handleValue(arg1);
    processValues(args...);
}

Because it is defined recursively, a base case is required to stop the recursion. Using a fold expression (a unary right fold), this can be implemented with a single function template, and no base case is needed:

template<typename... Tn>
void processValues(const Tn&... args)
{
    (handleValue(args), ...);
}

Essentially, the three dots in the function body trigger the fold. The line expands so that handleValue() is called for each argument in the parameter pack, with the calls separated by commas. For example, suppose args is a parameter pack containing three arguments (a1, a2, and a3). The unary right fold expands as follows:

(handleValue(a1), (handleValue(a2), handleValue(a3)));

Here is another example. The printValues() function template writes all arguments to the console, separated by line breaks.

template<typename... Values>
void printValues(const Values&... values)
{
	((cout << values << endl), ...);
}

Suppose values is a parameter pack containing three arguments (v1, v2, and v3). The expanded form of the unary right fold is as follows:

((cout << v1 << endl), ((cout << v2 << endl), (cout << v3 << endl)));

Any number of arguments can be used when calling printValues(), as follows:

printValues(1, "test", 2.34);

In these examples the fold is combined with the comma operator, but a fold can in fact be combined with any of the listed operators. For example, the following code defines a variadic function template that uses a binary left fold to compute the sum of all values passed to it. A binary left fold always requires an init value (see Table 22-1). Therefore, sumValues() has two template type parameters: one ordinary parameter specifying the type of init, and a parameter pack that can accept zero or more arguments.

template<typename T, typename... Values>
double sumValues(const T& init, const Values&... values)
{
    return (init + ... + values);
}

Suppose values is a parameter pack containing three arguments (v1, v2, and v3). The expanded form of the binary left fold is as follows:

return (((init + v1) + v2) + v3);

The usage of sumValues() function template is as follows:

cout << sumValues(1, 2, 3.3) << endl;
cout << sumValues(1) << endl;

The function template requires at least one parameter, so the following code cannot be compiled:

cout << sumValues() << endl;

Template metaprogramming

This section explains template metaprogramming, a very complex topic; entire books have been written explaining its details. This book does not have the space to cover all of them, so this section explains the most important concepts through a few examples. The goal of template metaprogramming is to perform some computation at compile time rather than at run time. It is basically a small programming language on top of C++. The first example computes the factorial of a number at compile time, making the result available at run time as a simple constant.

Compile time factorial

The following code demonstrates how to compute the factorial of a number at compile time. It uses the template recursion described earlier in this chapter, which requires a recursive template and a base-case template to stop the recursion. By mathematical definition, the factorial of 0 is 1, so that serves as the base case:

template<unsigned char f>
class Factorial
{
public:
    static const unsigned long long val = (f * Factorial<f - 1>::val);
};

template<>
class Factorial<0>
{
public:
    static const unsigned long long val = 1;
};

int main()
{
    cout << Factorial<6>::val << endl;
    return 0;
}

This computes the factorial of 6, written mathematically as 6!, which is 1×2×3×4×5×6, or 720.

be careful

Remember that the factorial is computed at compile time. At run time, the compile-time result is accessed through ::val, which is simply a static constant value. This specific example computes a factorial at compile time, but template metaprogramming is not strictly necessary for it. Thanks to constexpr, you can write it as follows without any templates. Still, the template implementation remains an excellent example of a recursive template.

constexpr unsigned long long factorial(unsigned char f)
{
    if (f == 0) 
    {
        return 1;
    } 
    else 
    {
        return f * factorial(f - 1);
    }
}

If the following version is called, the value is calculated at compile time:

constexpr auto f1 = factorial(6);

However, don't forget the constexpr in this statement. If you write the following code, the calculation is performed at run time!

auto f1 = factorial(6);

With the template metaprogramming version, such a mistake is impossible: the calculation is always performed at compile time.

Loop unrolling

The second example of template metaprogramming unrolls a loop at compile time instead of executing it at run time. Note that loop unrolling should be done only when really needed, because compilers are usually smart enough to unroll loops automatically. This example uses template recursion again, because something has to happen in the loop at compile time. On each recursion, the Loop template instantiates itself with i - 1; the recursion stops when 0 is reached.

template<int i>
class Loop
{
public:
    template <typename FuncType>
    static inline void Do(FuncType func) 
    {
        Loop<i - 1>::Do(func);
        func(i);
    }
};
template<>
class Loop<0>
{
public:
    template <typename FuncType>
    static inline void Do(FuncType /* func */) { }
};

You can use the Loop template as follows:

void DoWork(int i) { cout << "DoWork(" << i << ")" << endl; }

int main()
{
    Loop<3>::Do(DoWork);
}

This code will cause the compiler to expand the loop and call the DoWork() function three times in a row. The output of this program is as follows:

DoWork(1)
DoWork(2)
DoWork(3)

Using a lambda expression, you can call a version, DoWork2(), that takes additional parameters:

void DoWork2(string str, int i)
{
    cout << "DoWork2(" << str << ", " << i << ")" << endl;
}

int main()
{
    Loop<2>::Do([](int i) { DoWork2("TestStr", i); });
}

This code first implements a function that takes a string and an int value. The main() function uses a lambda expression to call DoWork2() in each iteration with a fixed string, "TestStr", as the first argument. Compiling and running this code produces the following output:

DoWork2(TestStr, 1)
DoWork2(TestStr, 2)

Printing tuples

This example uses template metaprogramming to print the elements of an std::tuple. Tuples, explained in Chapter 20, store any number of values, each with its own specific type. A tuple has a fixed size and fixed value types, determined at compile time. However, tuples do not provide a built-in mechanism for iterating over their elements. The following example shows how template metaprogramming can be used to iterate over the elements of a tuple at compile time. As is common in template metaprogramming, it uses template recursion. The tuple_print class template takes two template parameters: the tuple type and an integer initialized to the size of the tuple. It recursively instantiates itself in its constructor, decrementing the size on each call. A partial specialization of tuple_print stops the recursion when the size reaches 0. The main() function demonstrates how to use this tuple_print class template.

template<typename TupleType, int n>
class tuple_print
{
public:
    tuple_print(const TupleType& t) 
    {
        tuple_print<TupleType, n - 1> tp(t);
        cout << get<n - 1>(t) << endl;
    }
};
template<typename TupleType>
class tuple_print<TupleType, 0>
{
public:
	tuple_print(const TupleType&) { }
};
int main()
{
    using MyTuple = tuple<int, string, bool>;
    MyTuple t1(16, "Test", true);
    tuple_print<MyTuple, tuple_size<MyTuple>::value> tp(t1);
}

Looking at the main() function, the line that uses the tuple_print class template looks a little complicated, because both the exact tuple type and its size must be supplied as template parameters. This can be simplified by introducing a helper function template that deduces the template parameters automatically. The simplified implementation is as follows:

template<typename TupleType, int n>
class tuple_print_helper
{
    public:
    tuple_print_helper(const TupleType& t) 
    {
        tuple_print_helper<TupleType, n - 1> tp(t);
        cout << get<n - 1>(t) << endl;
	}
};
template<typename TupleType>
class tuple_print_helper<TupleType, 0>
{
    public:
    tuple_print_helper(const TupleType&) { }
};
template<typename T>
void tuple_print(const T& t)
{
	tuple_print_helper<T, tuple_size<T>::value> tph(t);
}
int main()
{
    auto t1 = make_tuple(167, "Testing", false, 2.3);
    tuple_print(t1);
}

The first change is that the original tuple_print class template is renamed tuple_print_helper. The code then implements a small function template called tuple_print(), which takes the tuple type as a template type parameter and a reference to the tuple itself as a function parameter, and instantiates the tuple_print_helper class template in its body. The main() function shows how to use this simplified version. Since the exact tuple type is no longer needed, make_tuple() can be used in combination with the auto keyword. Calling the tuple_print() function template is now very simple:

tuple_print(t1);

You do not need to specify the function template parameters, because the compiler deduces them automatically from the supplied argument.

  1. constexpr if

C++17 introduces constexpr if: if statements that are evaluated at compile time rather than at run time. If a branch of a constexpr if statement is never taken, it is never compiled. This can be used to simplify a lot of template metaprogramming techniques, as well as the SFINAE technique discussed later in this chapter. For example, the earlier code for printing tuple elements can be simplified with constexpr if as follows. Note that the base case of the template recursion is no longer needed, because the recursion is stopped by the constexpr if statement.

template<typename TupleType, int n>
class tuple_print_helper
{
public:
    tuple_print_helper(const TupleType& t) 
    {
        if constexpr(n > 1) 
        {
            tuple_print_helper<TupleType, n - 1> tp(t);
        }
        cout << get<n - 1>(t) << endl;
    }
};

template<typename T>
void tuple_print(const T& t)
{
    tuple_print_helper<T, tuple_size<T>::value> tph(t);
}

Now, you can even discard the class template itself and replace it with a simple function template tuple_print_helper:

template<typename TupleType, int n>
void tuple_print_helper(const TupleType& t) 
{
    if constexpr(n > 1) 
    {
        tuple_print_helper<TupleType, n - 1>(t);
    }
    cout << get<n - 1>(t) << endl;
}

template<typename T>
void tuple_print(const T& t)
{
    tuple_print_helper<T, tuple_size<T>::value>(t);
}

It can be simplified even further by combining the two function templates into one, as follows:

template<typename TupleType, int n = tuple_size<TupleType>::value>
void tuple_print(const TupleType& t) 
{
    if constexpr(n > 1) 
    {
        tuple_print<TupleType, n - 1>(t);
    }
    cout << get<n - 1>(t) << endl;
}

It is still called exactly as before:

auto t1 = make_tuple(167, "Testing", false, 2.3);
tuple_print(t1);
  2. Using compile-time integer sequences and folds

C++ supports compile-time integer sequences with std::integer_sequence, defined in <utility>. A common use case in template metaprogramming is generating a compile-time index sequence, that is, a sequence of integers of type size_t. For that, the helper std::index_sequence can be used, and std::index_sequence_for generates an index sequence of the same length as a given parameter pack. The following implements the tuple printer using a variadic template, a compile-time index sequence, and a C++17 fold expression:

template<typename Tuple, size_t... Indices>
void tuple_print_helper(const Tuple& t, index_sequence<Indices...>)
{
	((cout << get<Indices>(t) << endl), ...);
}
template<typename... Args>
void tuple_print(const tuple<Args...>& t)
{
	tuple_print_helper(t, index_sequence_for<Args...>());
}

When called with the four-element tuple from the earlier example, the unary right fold expression in the tuple_print_helper() function template expands to the following form:

((cout << get<0>(t) << endl),
 ((cout << get<1>(t) << endl),
  ((cout << get<2>(t) << endl),
   (cout << get<3>(t) << endl))));

Type traits

Type traits allow you to make decisions based on types at compile time. For example, you can write a template that requires a type derived from a specific type, a type convertible to a specific type, an integral type, and so on. The C++ standard defines a number of helper classes for this purpose, all of them in the <type_traits> header file. Type traits are divided into several categories; some examples of the available traits in each category are listed below. For a complete list, consult a standard library reference (see Appendix B).

➤ Primary type categories
o is_void
o is_integral
o is_floating_point
o is_pointer
o ...
➤ Type properties
o is_const
o is_literal_type
o is_polymorphic
o is_unsigned
o is_constructible
o is_copy_constructible
o is_move_constructible
o is_assignable
o is_trivially_copyable
o is_swappable*
o is_nothrow_swappable*
o has_virtual_destructor
o has_unique_object_representations*
o ...
➤ Reference modifications
o remove_reference
o add_lvalue_reference
o add_rvalue_reference
➤ Pointer modifications
o remove_pointer
o add_pointer
➤ Composite type categories
o is_reference
o is_object
o is_scalar
o ...
➤ Type relationships
o is_same
o is_base_of
o is_convertible
o is_invocable*
o is_nothrow_invocable*
o ...
➤ const-volatile modifications
o remove_const
o add_const
o ...
➤ Sign modifications
o make_signed
o make_unsigned
➤ Array modifications
o remove_extent
o remove_all_extents
➤ Logical operator traits
o conjunction*
o disjunction*
o negation*
➤ Other transformations
o enable_if
o conditional
o invoke_result*
o ...

Type traits marked with an asterisk are only available in C++17 and later. Type traits are a very advanced C++ feature; the list above shows only a subset of those in the C++ standard, and this book cannot explain all their details. The following sections show just a few use cases.

  1. Using type categories

Before looking at a template that uses type traits, it helps to understand how classes such as is_integral work. The C++ standard defines the integral_constant class as follows:

template <class T, T v>
struct integral_constant 
{
    static constexpr T value = v;
    using value_type = T;
    using type = integral_constant<T, v>;
    constexpr operator value_type() const noexcept { return value; }
    constexpr value_type operator()() const noexcept { return value; }
};

The standard also defines the bool_constant, true_type, and false_type aliases:

template <bool B>
using bool_constant = integral_constant<bool, B>;
using true_type = bool_constant<true>;
using false_type = bool_constant<false>;

This defines two types, true_type and false_type. Accessing true_type::value yields the value true, while accessing false_type::value yields false. You can also access true_type::type, which is the type true_type itself; the same applies to false_type. Traits such as is_integral and is_class derive from either true_type or false_type. For example, is_integral is specialized for bool as follows:

template<> struct is_integral<bool> : public true_type { };

This allows you to write is_integral<bool>::value, which returns true. Note that you never need to write such specializations yourself; they are part of the standard library. The following code demonstrates the simplest way to use type categories:

if (is_integral<int>::value) 
{
    cout << "int is integral" << endl;
} 
else 
{
    cout << "int is not integral" << endl;
}

if (is_class<string>::value) 
{
    cout << "string is a class" << endl;
} 
else 
{
    cout << "string is not a class" << endl;
}

This example uses is_integral to check whether int is an integral type, and is_class to check whether string is a class. The output is as follows:

int is integral
string is a class

For each trait that has a value member, C++17 adds a variable template with the same name as the trait suffixed with _v. Instead of writing some_trait<T>::value, you write some_trait_v<T>, for example is_integral_v<T> and is_const_v<T>. Here is the previous example rewritten with the variable templates:

if (is_integral_v<int>) 
{
    cout << "int is integral" << endl;
} 
else 
{
    cout << "int is not integral" << endl;
}

if (is_class_v<string>) 
{
    cout << "string is a class" << endl;
} 
else 
{
    cout << "string is not a class" << endl;
}

Of course, you will probably never use type traits in this way. They become really useful only in combination with templates, to generate code based on certain properties of a type. The following template example demonstrates this. The code defines two overloads of a function template process_helper(), which takes a type as a template parameter; its first parameter is a value, and its second parameter is an instance of either true_type or false_type. The process() function template takes a single parameter and calls process_helper():

template<typename T>
void process_helper(const T& t, true_type)
{
    cout << t << " is an integral type." << endl;
}

template<typename T>
void process_helper(const T& t, false_type)
{
    cout << t << " is a non-integral type." << endl;
}

template<typename T>
void process(const T& t)
{
    process_helper(t, typename is_integral<T>::type());
}

The second argument in the call to process_helper() is the following:

typename is_integral<T>::type()

This argument uses is_integral to determine whether T is an integral type. The ::type member accesses the resulting integral_constant type, which is either true_type or false_type. The process_helper() function expects an instance of true_type or false_type as its second argument, which is why there are two empty parentheses after ::type. Note that the two overloads of process_helper() take an unnamed parameter of type true_type or false_type. These parameters are unnamed because they are not used inside the function bodies; they are used only for overload resolution.

This code can be tested as follows:

process(123);
process(2.2);
process("Test"s);

output

123 is an integral type.
2.2 is a non-integral type.
Test is a non-integral type.

The previous example selects between overloads based on a type trait. With C++17 constexpr if, the same behavior can be implemented with a single function template, without the helper overloads:

template<typename T>
void process(const T& t)
{
    if constexpr (is_integral_v<T>) 
    {
        cout << t << " is an integral type." << endl;
    } 
    else 
    {
        cout << t << " is a non-integral type." << endl;
    }
}
  2. Using type relationships

There are three type relationship traits: is_same, is_base_of, and is_convertible. The following example shows how to use is_same; the other type relationships work similarly. The same() function template uses the is_same type trait to determine whether its two arguments have the same type, and prints a corresponding message.

template<typename T1, typename T2>
void same(const T1& t1, const T2& t2)
{
    bool areTypesTheSame = is_same_v<T1, T2>;
    cout << "'" << t1 << "' and '" << t2 << "' are ";
    cout << (areTypesTheSame ? "the same types." : "different types.") << endl;
}

int main()
{
    same(1, 32);
    same(1, 3.01);
    same(3.01, "Test"s);
}

output

'1' and '32' are the same types.
'1' and '3.01' are different types.
'3.01' and 'Test' are different types.
  3. Using enable_if

Using enable_if requires an understanding of "substitution failure is not an error" (SFINAE), a complex and obscure feature of C++. Only the basics of SFINAE are explained here. Given a set of overloaded functions, enable_if can be used to selectively disable certain overloads based on some type traits. It is typically used on the return types of the overload set. enable_if takes two template type parameters: a Boolean and a type that defaults to void. If the Boolean is true, the enable_if class has a nested type, accessible with ::type, given by the second template type parameter. If the Boolean is false, there is no nested type. For traits with a type member, such as enable_if, the C++ standard defines alias templates with the same name as the trait suffixed with _t. Instead of writing the following:

typename enable_if<..., bool>::type

And write the following shorter version:

enable_if_t<..., bool>

Using enable_if, the earlier same() example can be rewritten as an overloaded check_type() function template. In this version, check_type() returns true or false depending on whether the types of the given values are the same. If you don't want check_type() to return anything, you can remove the return statements and either drop the second template type parameter of enable_if or replace it with void.

template<typename T1, typename T2>
enable_if_t<is_same_v<T1, T2>, bool>
check_type(const T1& t1, const T2& t2)
{
    cout << "'" << t1 << "' and '" << t2 << "' ";
    cout << "are the same types." << endl;
    return true;
}
template<typename T1, typename T2>
enable_if_t<!is_same_v<T1, T2>, bool>
check_type(const T1& t1, const T2& t2)
{
    cout << "'" << t1 << "' and '" << t2 << "' ";
    cout << "are different types." << endl;
    return false;
}

int main()
{
    check_type(1, 32);
    check_type(1, 3.01);
    check_type(3.01, "Test"s);
}

output

'1' and '32' are the same types.
'1' and '3.01' are different types.
'3.01' and 'Test' are different types.

This code defines two overloads of check_type() whose return type is the nested bool type of enable_if. First, is_same_v checks whether the two types are the same, and enable_if_t is applied to the result. When the first template argument of enable_if_t is true, its type is bool; when it is false, there is no return type at all. This is where SFINAE comes in. When the compiler compiles the first line of main(), it tries to find a check_type() function that accepts two integer values. It finds the first check_type() overload in the source code and deduces that it can use an instance of this template by setting T1 and T2 to int. It then tries to determine the return type. Since both arguments are integers, and thus the same type, is_same_v<T1, T2> is true, which makes enable_if_t<true, bool> the type bool. Everything works out, and the compiler uses that version of check_type(). However, when the compiler compiles the second line of main(), it again tries to find a suitable check_type() function. It starts with the first check_type(), deduces that T1 can be int and T2 can be double, and then tries to determine the return type. This time T1 and T2 are different types, so is_same_v<T1, T2> is false. Consequently, enable_if_t<false, bool> does not name a type, and the function has no return type. The compiler notices this error, but because of SFINAE it does not yet generate a real compilation error; instead it gracefully backtracks and looks for another check_type() function. In this case the second check_type() works, because !is_same_v<T1, T2> is true, making enable_if_t<true, bool> the type bool.

If you want to use enable_if on a set of constructors, you cannot use it on the return type, because constructors have no return type. In that case, you can use enable_if on an extra constructor parameter with a default value. It is recommended to use enable_if with restraint: use it only when you need to resolve overload ambiguity that cannot be resolved with other techniques such as specialization, partial specialization, and so on. For example, if you just want compilation to fail when a template is used with a wrong type, use static_assert(), introduced in Chapter 27, instead of SFINAE. Of course, there are legitimate use cases for enable_if. One example is a copy function for a custom vector-like class that uses enable_if with the is_trivially_copyable type trait to perform bitwise copies for trivially copyable types (for example, using the C function memcpy()).

Warning:

Relying on SFINAE is tricky and complex. If you selectively disable the wrong overloads in an overload set with SFINAE and enable_if, you will get strange compilation errors that are hard to track down.

4. Using constexpr if to simplify enable_if constructs

As the previous examples show, using enable_if can get quite complex. In some cases, the constexpr if feature introduced by C++17 can dramatically simplify enable_if constructs. For example, suppose you have the following two classes:

class IsDoable
{
public:
    void doit() const { cout << "IsDoable::doit()" << endl; }
};

class Derived : public IsDoable { };

You can write a function template, call_doit(), that calls the doit() method if it is available, and otherwise prints an error message on the console. To do this with enable_if, check whether the given type derives from IsDoable:

template<typename T>
enable_if_t<is_base_of_v<IsDoable, T>, void>
call_doit(const T& t)
{
    t.doit();
}

template<typename T>
enable_if_t<!is_base_of_v<IsDoable, T>, void>
call_doit(const T&)
{
    cout << "Cannot call doit()!" << endl;
}

The following code tests the implementation:

Derived d;
call_doit(d);
call_doit(123);

output

IsDoable::doit()
Cannot call doit()!

Using C++17 constexpr if, the enable_if implementation can be simplified enormously:

template<typename T>
void call_doit([[maybe_unused]] const T& t)
{
	if constexpr(is_base_of_v<IsDoable, T>) 
	{
		t.doit();
	} 
	else 
	{	
		cout << "Cannot call doit()!" << endl;
	}
}	

You cannot do this with a normal if statement! With a normal if statement, both branches need to compile, and that fails if you supply a type T that does not derive from IsDoable: the line t.doit() then fails to compile. With a constexpr if statement, if a type not derived from IsDoable is supplied, the t.doit() line is not even compiled. Note the use of the [[maybe_unused]] attribute, introduced in C++17. If the given type T is not derived from IsDoable, the t.doit() line is not compiled, so the parameter t is not used at all in that instantiation of call_doit(). Most compilers warn, or even error, on unused parameters; this attribute suppresses such warnings and errors for t. Instead of the is_base_of type trait, you can also use the is_invocable trait, introduced in C++17, which determines whether a given function can be called with a given set of arguments. Here is the call_doit() implementation using the is_invocable trait:

template<typename T>
void call_doit([[maybe_unused]] const T& t)
{
    if constexpr(is_invocable_v<decltype(&IsDoable::doit), T>) 
    {
        t.doit();
    } 
    else 
    {
        cout << "Cannot call doit()!" << endl;
    }
}
  5. Logical operator traits

There are three logical operator traits: conjunction, disjunction, and negation. Variable templates ending in _v are available as well. These traits accept a variable number of template type parameters and can be used to perform logical operations on type traits, as follows:

cout << conjunction_v<is_integral<int>, is_integral<short>> << " ";
cout << conjunction_v<is_integral<int>, is_integral<double>> << " ";
cout << disjunction_v<is_integral<int>, is_integral<double>, is_integral<short>> << " ";
cout << negation_v<is_integral<int>> << " ";

output

1 0 1 0

C++ multithreaded programming

Multithreaded programming is important on multiprocessor computer systems: it allows you to write programs that use all the processors in parallel. A system can obtain multiple processor units in several ways. It can have multiple discrete processor chips, each an independent CPU (central processing unit). It can also have a single processor chip that internally consists of multiple independent CPUs, called cores; such chips are called multi-core processors. A system can also combine both approaches. Although systems with multiple processor units have existed for a long time, they were rarely used in consumer systems. Today, all major CPU vendors sell multi-core processors, which are used in everything from servers to PCs and even smartphones. Because of this ubiquity, writing multithreaded applications is becoming more and more important, and a professional C++ programmer needs to know how to write correct multithreaded code that takes full advantage of all available processor units. Writing multithreaded applications used to depend on platform- and operating-system-specific APIs, which made cross-platform multithreaded programming difficult. C++11 solved this problem by introducing a standard threading library. Multithreaded programming is a complex topic. This chapter explains it using the standard threading library, but due to space constraints it cannot cover every detail; entire books are dedicated to the subject. If you are interested in more detail, consult the references listed in the "Multithreading" section of Appendix B. Other third-party C++ libraries, such as the pthreads library and the boost::thread library, can also be used to write platform-independent multithreaded programs; however, these libraries are not part of the C++ standard and so are not discussed in this book.

Overview of multithreaded programming

Multithreaded programming allows multiple calculations to be executed in parallel, taking advantage of the multiple processor units present in most systems today. Decades ago, the CPU market raced for the highest clock frequency, which is what matters most for single-threaded applications. Around 2005, that race stopped because of power-consumption and heat-dissipation problems. Today, the CPU market races for the largest number of cores on a single processor chip. At the time of this writing, dual-core and quad-core CPUs are common, and there are announcements of processors with 12, 16, 18, or more cores. Similarly, if you look at the processors on graphics cards, called GPUs, you will see that they are massively parallel: today, high-end graphics cards already have more than 4000 cores, and that number will rise quickly. These graphics cards are used not only for games but also for computationally intensive tasks, such as image and video processing, protein folding (useful for discovering new drugs), and signal processing in the SETI (Search for Extraterrestrial Intelligence) project. C++98/03 did not support multithreaded programming, so you had to resort to third-party libraries or the multithreading APIs of the target operating system. Since C++11, C++ has had a standard multithreading library, which makes it much easier to write cross-platform multithreaded applications. The current C++ standard targets only CPUs, not GPUs; this might change in the future. There are two reasons to write multithreaded code. First, if a computational problem can be decomposed into small blocks that can run independently of one another, running them on multiple processor units can yield a huge performance gain. Second, computations can be modularized along orthogonal axes; for example, long computations can be performed in a worker thread instead of blocking the UI thread, so the user interface remains responsive while a long computation runs in the background.

Of course, it is not always possible to decompose a problem into parts that can execute independently in parallel, but often at least part of it can be parallelized for a performance gain. One difficulty in multithreaded programming is parallelizing your algorithm, which is highly dependent on the type of algorithm. Other difficulties are preventing race conditions, deadlocks, tearing, and false sharing. All of these can be addressed with atomics or explicit synchronization mechanisms, discussed later in this chapter.

Warning;

To avoid these multithreading problems, design programs so that multiple threads do not need to read and write shared memory locations. Alternatively, use the atomic operations described in section 23.3, "Atomic operations library," later in this chapter, or the synchronization methods described in section 23.4, "Mutual exclusion."

Race conditions

A race condition can occur when multiple threads want to access any kind of shared resource. A race condition in the context of shared memory is called a data race. A data race occurs when multiple threads access shared memory and at least one of them writes to it. For example, suppose there is a shared variable, and one thread increments its value while another thread decrements it. Incrementing or decrementing the value means the current value has to be fetched from memory, incremented or decremented, and stored back to memory. On older architectures, such as the PDP-11 and the VAX, this was done with an atomic INC processor instruction. On modern x86 processors, the INC instruction is no longer atomic, which means other instructions can execute in the middle of this operation, possibly causing the code to obtain a wrong value. Table 23-1 shows the result when the incrementing thread finishes before the decrementing thread starts, assuming an initial value of 1.

Thread 1 (increment)         Thread 2 (decrement)
load value (value = 1)
increment value (value = 2)
store value (value = 2)
                             load value (value = 2)
                             decrement value (value = 1)
                             store value (value = 1)

The final value stored in memory is 1. When the decrementing thread finishes before the incrementing thread starts, the final value is also 1, as shown in table 23-2.

Thread 1 (increment)         Thread 2 (decrement)
                             load value (value = 1)
                             decrement value (value = 0)
                             store value (value = 0)
load value (value = 0)
increment value (value = 1)
store value (value = 1)

However, when the instructions are interleaved, the result is different, as shown in table 23-3.

Thread 1 (increment)         Thread 2 (decrement)
load value (value = 1)
increment value (value = 2)
                             load value (value = 1)
                             decrement value (value = 0)
store value (value = 2)
                             store value (value = 0)

In this case, the final result is 0. In other words, the effect of the increment operation is lost. This is a race condition.

Tearing

Tearing is a specific case, or consequence, of a data race. There are two kinds of tearing: torn reads and torn writes. If a thread has written part of its data to memory while another part has not yet been written, any other thread reading that data at that moment sees inconsistent data: a torn read. If two threads write to the data at the same time, one thread might write part of the data while the other thread writes another part; the combined result is inconsistent: a torn write.

Deadlocks

If you choose to solve race conditions with a synchronization method such as mutual exclusion, you may run into another common problem of multithreaded programming: deadlock. A deadlock occurs when two threads block forever because each is waiting for access to a resource that the other blocked thread has locked; this extends to more than two threads as well. For example, suppose two threads want to access a shared resource, for which they must request permission. If one of the threads currently holds permission to the resource but is blocked indefinitely for some other reason, another thread trying to acquire permission to the same resource blocks indefinitely as well. One mechanism for acquiring permission to a shared resource is the mutex object, discussed later. For example, suppose there are two threads and two resources, protected by two mutex objects A and B. Both threads acquire permissions on both resources, but in different orders. Table 23-4 shows this situation in pseudocode.

To avoid this kind of deadlock, it is best to always acquire permissions in the same order. You can also build a mechanism into the program for breaking such deadlocks. One possibility is to wait only a certain amount of time when trying to acquire permission to a resource. If the permission cannot be obtained within that interval, the thread stops waiting and releases the other locks it currently holds; it may then sleep for a short while and try again to acquire all the resources it needs. This gives other threads the opportunity to acquire the locks they need and continue execution. Whether this approach works depends heavily on the specific deadlock situation.

Instead of using the workaround described in the previous paragraph, it is better to avoid any possible deadlock situation altogether. If you need to acquire permissions for multiple resources protected by multiple mutexes, instead of acquiring permission for each resource separately, it is recommended to use the standard std::lock() or std::try_lock() functions described in section 23.4. These functions acquire, or try to acquire, permission for multiple resources in a single call.

False sharing

Most caches work with so-called cache lines. For modern CPUs, a cache line is usually 64 bytes. If something needs to be written to a cache line, the entire line has to be locked. If your data structures are not designed properly, this can cause serious performance problems for multithreaded code. For example, suppose two threads are using two different pieces of data, but those pieces share a cache line. If either thread writes something, the other thread is blocked because the entire cache line is locked. You can optimize your data structures with explicit memory alignment to ensure that data worked on by multiple threads does not share any cache lines. To do this in a portable way, C++17 introduces the hardware_destructive_interference_size constant, defined in <new>, which returns the recommended offset between two concurrently accessed objects to avoid cache-line sharing. This value can be used with the alignas keyword to properly align your data.

thread

With the C++ thread library, defined in the <thread> header file, starting a new thread is very easy. There are several ways to specify what should be executed in the new thread. A new thread can execute a global function, the operator() of a function object, a lambda expression, or even a member function of a class instance.

Create a thread from a function pointer

Functions such as CreateThread() and _beginthread() on Windows, and pthread_create() in the pthreads library, require that the thread function has only one parameter. The std::thread class of standard C++, on the other hand, can use thread functions with any number of parameters. Suppose the counter() function accepts two integers: the first represents an ID and the second the number of iterations the function should loop. The body of the function is a loop that performs the given number of iterations. On each iteration, a message is printed to standard output:

void counter(int id, int numIterations)
{
    for (int i = 0; i < numIterations; ++i)
    {
        cout << "Counter " << id << " has value " << i << endl;
    }
}

Multiple threads executing this function can be started through std::thread. Thread t1 can be created to execute counter() with parameters 1 and 6:

thread t1(counter, 1, 6);

The constructor of the thread class is a variadic template, that is, it accepts any number of parameters; Chapter 22 discusses variadic templates in detail. The first parameter is the function to be executed by the new thread. The subsequent variadic parameters are passed to this function when the thread starts executing.

A thread object is said to be joinable if it represents an active thread of the system, either current or past. The thread object remains joinable even after the thread has finished executing. A default-constructed thread object is not joinable. Before a joinable thread object is destroyed, you must call its join() or detach() method. A call to join() is a blocking call that waits until the thread has finished its work. A call to detach() detaches the thread object from its underlying OS thread, and the OS thread continues to run independently. Both methods cause the thread object to become unjoinable. If a still-joinable thread object is destroyed, its destructor calls std::terminate(), which abruptly terminates all threads and the application itself. The following code starts two threads that execute the counter() function. After starting the threads, main() calls the join() method on both.

#include <iostream>
#include <thread>

using namespace std;

void counter(int id, int numIterations)
{
    for (int i = 0; i < numIterations; ++i)
    {
        cout << "Counter " << id << " has value " << i << endl;
    }
}

int main()
{
    thread t1(counter, 1, 6);
    thread t2(counter, 2, 4);
    t1.join();
    t2.join();
    return 0;
}

output

xz@xiaqiu:~/study/test/test$ ./test
Counter Counter 1 has value 20 has value 0
Counter 1 has value Counter 2 has value 1
Counter 2 has value 2
Counter 2 has value 31
Counter 1 has value 2
Counter 1 has value 3
Counter 1 has value 4
Counter 1 has value 5
xz@xiaqiu:~/study/test/test$

The output will differ between systems, and it will most likely differ between runs. This is because two threads execute the counter() function at the same time, so the output depends on the number of processor cores in the system and on the thread scheduling of the operating system. By default, accessing cout from different threads is thread safe and free of data races, unless sync_with_stdio(false) has been called before the first input or output operation. However, even without data races, output from different threads can still be interleaved, which means the output of the previous example could be mixed together:

xz@xiaqiu:~/study/test/test$ ./test
Counter 1 has value 0
Counter Counter 1 has value 1
Counter 1 has value 2
Counter 1 has value 3
Counter 1 has value 4
Counter 1 has value 5
2 has value 0
Counter 2 has value 1
Counter 2 has value 2
Counter 2 has value 3

be careful:

Thread function parameters are always copied into some internal storage of the thread. To pass a parameter by reference, use std::ref() or cref() from the <functional> header.

Creating threads from function objects

Instead of a function pointer, you can also use a function object to execute in a thread. With the function-pointer technique of section 23.2.1, the only way to pass information to the thread is through function parameters. With a function object, you can add member variables to the function object class and initialize and use them however you like. The following example first defines the Counter class. It has two member variables: one for the ID and one for the number of loop iterations, both initialized by the constructor. To make the Counter class a function object, operator() is implemented, as discussed in Chapter 18. Its implementation is the same as that of the earlier counter() function:

class Counter
{
public:
    Counter(int id, int numIterations)
    : mId(id), mNumIterations(numIterations)
    {
    }
    void operator()() const
    {
        for (int i = 0; i < mNumIterations; ++i)
        {
            cout << "Counter " << mId << " has value " << i << endl;
        }
    }
private:
    int mId;
    int mNumIterations;
};

The following code snippet demonstrates three ways to initialize a thread with a function object. The first method uses uniform initialization syntax: an instance of the Counter class is created with constructor arguments in curly braces and passed to the constructor of the thread class. The second method defines a named instance of the Counter class and passes it to the constructor of the thread class. The third method is similar to the first: an instance of the Counter class is created and passed to the constructor of the thread class, but using parentheses instead of curly braces.

// Using uniform initialization syntax
thread t1{ Counter{ 1, 20 } };
// Using named variable
Counter c(2, 12);
thread t2(c);
// Using temporary
thread t3(Counter(3, 10));
// Wait for threads to finish
t1.join();
t2.join();
t3.join();

Comparing these creation methods, the only difference seems to be that the first method uses curly braces while the third uses parentheses. However, when the function object constructor does not require any parameters, the third method does not work properly. For example:

class Counter
{
public:
    Counter() {}
    void operator()() const { /* Omitted for brevity */ }
};
int main()
{
    thread t1(Counter());
    t1.join();
}

output

xz@xiaqiu:~/study/test/test$ g++ -o test test.cpp -lpthread
test.cpp: In function 'int main()':
test.cpp:13:8: error: request for member 'join' in 't1', which is of non-class type 'std::thread(Counter (*)())'
   13 |     t1.join();
      |        ^~~~
xz@xiaqiu:~/study/test/test$ 

This causes a compilation error, because C++ interprets the first line in main() as a function declaration: a function named t1 that returns a thread object and takes as its parameter a pointer to a function with no parameters returning a Counter object. Therefore, it is recommended to use the uniform initialization syntax:

thread t1{ Counter{} }; // OK

be careful:

Function objects are always copied into some internal storage of the thread. If you want to execute operator() on a specific instance of the function object instead of on a copy, pass the instance by reference using std::ref() or cref() from the <functional> header.

Counter c(2, 12);
thread t2(ref(c));

Creating threads through lambda

Lambda expressions work well with the standard C++ thread library. The following example starts a thread that executes the given lambda expression:

#include <thread>
#include <iostream>

using namespace std;

int main()
{
    int id = 1;
    int numIterations = 5;
    thread t1([id, numIterations] {
        for (int i = 0; i < numIterations; ++i) {
            cout << "Counter " << id << " has value " << i << endl;
        }
    });
    t1.join();
    return 0;
}

Creating threads through member functions

You can also specify a member function of a class to be executed in a thread. The following example defines a Request class with a process() method. The main() function creates an instance of the Request class and starts a new thread that executes the process() member function of the Request instance req:

class Request
{
public:
    Request(int id) : mId(id) {}
    void process()
    {
        cout << "Processing request " << mId << endl;
    }
private:
    int mId;
};

int main()
{
    Request req(100);
    thread t{ &Request::process, &req };
    t.join();
}

With this technique, a method of a given object can be executed in a different thread. If other threads access the same object, you need to make sure the accesses are thread safe to avoid race conditions. The mutexes discussed later in this chapter can be used as a synchronization mechanism to make accesses thread safe.

Thread local storage

The C++ standard supports the concept of thread-local storage. With the keyword thread_local, any variable can be marked as thread-local, which means each thread has its own copy of the variable, and that copy persists for the lifetime of the thread. The variable is initialized exactly once for each thread. For example, the following code defines two global variables; every thread shares the single copy of k, while each thread has its own copy of n:

int k;
thread_local int n;

Note that if a thread_local variable is declared in function scope, it behaves like a variable declared static, except that every thread has its own copy, and each thread initializes the variable exactly once, no matter how many times the function is called in that thread.

Cancel thread

The C++ standard does not include any mechanism for one thread to cancel another running thread. The best alternative is to provide a communication mechanism that both threads agree on. The simplest mechanism is a shared variable that the target thread checks periodically to determine whether it should terminate. Other threads can set this shared variable to indirectly instruct the thread to shut down. Care is required here, because multiple threads access the shared variable and at least one of them writes to it. The atomics or condition variables discussed later in this chapter are recommended.

Get results from thread

As the previous examples show, starting a new thread is easy. In most cases, however, you are also interested in the results the thread produces. For example, if a thread performs some mathematical calculations, you want to get the results out of the thread once it finishes executing. One way is to pass a pointer or reference to a result variable to the thread, in which the thread stores the results. Another method is to store the results in class member variables of a function object, which you can retrieve after the thread has finished executing; this works only if you pass the function object by reference to the thread constructor using std::ref(). However, there is a simpler way to obtain a result from a thread: futures. Futures also make it easier to handle errors that occur in a thread.

Copy and re throw exceptions

The whole exception mechanism of C++ works well, as long as you stay within a single thread. Every thread can throw its own exceptions, but they must be caught within that thread. If a thread throws an exception that is not caught within the thread, the C++ runtime calls std::terminate(), which terminates the entire application. Exceptions thrown in one thread cannot be caught in another thread. This introduces quite a few problems when you want to combine exception handling with multithreaded programming. Without the standard thread library, it is difficult, if not impossible, to handle exceptions gracefully across threads. The standard thread library solves this problem with the following exception-related functions. These functions work not only with std::exception but with all types of exceptions: ints, strings, custom exceptions, and so on.

➤➤exception_ptr current_exception() noexcept;

This function, called from inside a catch block, returns an exception_ptr object referencing the exception currently being handled, or a copy of it. If no exception is being handled, an empty exception_ptr object is returned. The referenced exception object remains available as long as there is an exception_ptr object referencing it. An exception_ptr object is of type NullablePointer, which means it can easily be checked with a simple if statement, as the following example shows.

➤➤[[noreturn]] void rethrow_exception(exception_ptr p);

This function rethrows the exception referenced by the exception_ptr parameter. The referenced exception does not have to be rethrown in the thread that originally generated it, which makes this feature perfectly suited for handling exceptions across different threads. The [[noreturn]] attribute makes it clear that this function never returns normally. Chapter 11 describes this attribute.

➤➤template <typename E> exception_ptr make_exception_ptr(E e) noexcept;

This function creates an exception_ptr object that references a copy of the given exception object. It is effectively shorthand for the following code:

try {
    throw e;
} catch (...) {
    return current_exception();
}

Let's take a look at how to implement exception handling between different threads through these functions. The following code defines a function that does something and throws an exception. This function will eventually run in a separate thread:

void doSomeWork()
{
    for (int i = 0; i < 5; ++i) 
    {
        cout << i << endl;
    }
    cout << "Thread throwing a runtime_error exception..." << endl;
    throw runtime_error("Exception from thread");
}

The following threadFunc() function wraps the call to the above function in a try/catch block, catching any exception that doSomeWork() might throw. threadFunc() takes a single parameter of type exception_ptr&. Once an exception is caught, the current_exception() function is used to obtain a reference to the exception being handled, which is assigned to the exception_ptr parameter. After that, the thread exits normally:

void threadFunc(exception_ptr& err)
{
    try
    {
        doSomeWork();
    }
    catch (...)
    {
        cout << "Thread caught exception, returning exception..." << endl;
        err = current_exception();
    }
}

The following doWorkInThread() function is called from the main thread. Its responsibility is to create a new thread that executes threadFunc(). A reference to an object of type exception_ptr is passed to threadFunc() as a parameter. Once the thread is created, doWorkInThread() waits for it to finish with the join() method, after which the error object is examined. Because exception_ptr is of type NullablePointer, it is easy to check with an if statement. If it is non-null, the exception is rethrown in the current thread, which in this example is the main thread. By rethrowing the exception in the main thread, the exception has been transferred from one thread to another.

void doWorkInThread()
{
    exception_ptr error;
    // Launch thread
    thread t{ threadFunc, ref(error) };
    // Wait for thread to finish
    t.join();
    // See if thread has thrown any exception
    if (error) 
    {
        cout << "Main thread received exception, rethrowing it..." << endl;
        rethrow_exception(error);
    } 
    else 
    {
    	cout << "Main thread did not receive any exception." << endl;
    }
}

The main() function is fairly simple. It calls doWorkInThread(), wrapping the call in a try/catch block to catch exceptions thrown by any thread created by doWorkInThread():

int main()
{
    try {
        doWorkInThread();
    } catch (const exception& e) {
        cout << "Main function caught: '" << e.what() << "'" << endl;
    }
}

code

#include <thread>
#include <iostream>
#include <exception>
#include <stdexcept>

using namespace std;

void doSomeWork()
{
    for (int i = 0; i < 5; ++i)
    {
        cout << i << endl;
    }
    cout << "Thread throwing a runtime_error exception..." << endl;
    throw runtime_error("Exception from thread");
}

void threadFunc(exception_ptr& err)
{
    try
    {
        doSomeWork();
    }
    catch (...)
    {
        cout << "Thread caught exception, returning exception..." << endl;
        err = current_exception();
    }
}

void doWorkInThread()
{
    exception_ptr error;
    // Launch thread
    thread t{ threadFunc, ref(error) };
    // Wait for thread to finish
    t.join();
    // See if thread has thrown any exception
    if (error)
    {
        cout << "Main thread received exception, rethrowing it..." << endl;
        rethrow_exception(error);
    }
    else
    {
        cout << "Main thread did not receive any exception." << endl;
    }
}

int main()
{
    try
    {
        doWorkInThread();
    }
    catch (const exception& e)
    {
        cout << "Main function caught: '" << e.what() << "'" << endl;
    }
    return 0;
}

output

xz@xiaqiu:~/study/test/test$ ./test
0
1
2
3
4
Thread throwing a runtime_error exception...
Thread caught exception, returning exception...
Main thread received exception, rethrowing it...
Main function caught: 'Exception from thread'
xz@xiaqiu:~/study/test/test$

To keep this example compact and easy to understand, main() blocks with join() until the thread has finished. Of course, in a real application you do not want to block the main thread. For example, in a GUI application, blocking the main thread means the UI becomes unresponsive. In that case, you can use a message-passing paradigm to communicate between threads; for example, the earlier threadFunc() function could send a message to the UI thread with a copy of the current_exception() result as its argument. But even then, as mentioned earlier, you need to make sure join() or detach() is called on every thread you create.

Atomic operation Library

Atomic types allow atomic access, which means concurrent reads and writes are possible without additional synchronization. Without atomic operations, incrementing a variable is not thread safe, because the compiler first loads the value from memory into a register, increments it, and then stores the result back to memory. Another thread may touch the same memory while this increment is in progress, which is a data race. For example, the following code is not thread safe and contains a race condition of the kind discussed at the beginning of this chapter:

int counter = 0; // Global variable
++counter; // Executed in multiple threads

To make this thread safe without explicitly using any synchronization mechanism (such as the mutexes discussed later in this chapter), use the std::atomic type. Here is the same code using an atomic integer:

atomic<int> counter(0); // Global variable
++counter; // Executed in multiple threads

To use these atomic types, include the <atomic> header. The C++ standard defines named atomic types for all primitive types, as shown in table 23-5.

Atomic types can be used without any explicit synchronization mechanism. Under the hood, however, certain kinds of atomic operations may themselves use a synchronization mechanism (such as a mutex). This can happen when the target hardware lacks the instructions to perform the operation atomically. The is_lock_free() method can be called on an atomic type to query whether it supports lock-free operations, that is, operations that run without any explicit synchronization mechanism underneath. The std::atomic class template can be used with all kinds of types, not just integral ones. For example, you can create an atomic<double> or an atomic<MyType>, provided MyType is trivially copyable. Depending on the size of the specified type, the underlying implementation may require an explicit synchronization mechanism. In the following example, Foo and Bar are both trivially copyable, that is, std::is_trivially_copyable_v<Foo> and std::is_trivially_copyable_v<Bar> both equal true. However, atomic<Foo> is not lock-free, while atomic<Bar> is.

class Foo { private: int mArray[123]; };
class Bar { private: int mInt; };

int main()
{
    atomic<Foo> f;
    // Outputs: 1 0
    cout << is_trivially_copyable_v<Foo> << " " << f.is_lock_free() << endl;
    atomic<Bar> b;
    // Outputs: 1 1
    cout << is_trivially_copyable_v<Bar> << " " << b.is_lock_free() << endl;
}

When accessing a piece of data from multiple threads, atomics also solve problems such as memory ordering, compiler optimizations, and so on. Basically, it is close to impossible to safely read and write the same piece of data from multiple threads without using atomics or an explicit synchronization mechanism.

Atomic type example

This section explains why you should use atomic types. Suppose you have a function increment() that increments an integer value, passed in through a reference parameter, inside a loop. The code uses std::this_thread::sleep_for() to introduce a small delay in each loop iteration. The parameter of sleep_for() is a std::chrono::duration; see Chapter 20.

void increment(int& counter)
{
    for (int i = 0; i < 100; ++i)
    {
        ++counter;
        this_thread::sleep_for(1ms);
    }
}

Now you want to run several threads in parallel, each executing the increment() function on the shared variable counter. If you implement this naively, without atomic types or any thread synchronization, you introduce a race condition. The following code launches 10 threads, then calls join() on each thread to wait for all of them to finish:

int main()
{
    int counter = 0;
    vector<thread> threads;
    for (int i = 0; i < 10; ++i)
    {
        threads.push_back(thread{ increment, ref(counter) });
    }
    for (auto& t : threads)
    {
        t.join();
    }
    cout << "Result = " << counter << endl;
}

Since increment() increments the integer 100 times, 10 threads are launched, and every thread executes increment() on the same shared counter, the expected result is 1000. If you run this program several times, you may get output like the following, but with different values:

xz@xiaqiu:~/study/test/test$ ./test
Result = 943
xz@xiaqiu:~/study/test/test$ ./test
Result = 875
xz@xiaqiu:~/study/test/test$ ./test
Result = 868
xz@xiaqiu:~/study/test/test$ ./test
Result = 826
xz@xiaqiu:~/study/test/test$ ./test
Result = 881
xz@xiaqiu:~/study/test/test$ ./test
Result = 842
xz@xiaqiu:~/study/test/test$ ./test
Result = 828
xz@xiaqiu:~/study/test/test$ ./test
Result = 850
xz@xiaqiu:~/study/test/test$ 

This code clearly exhibits a data race. In this example, atomic types can be used to fix the problem. The following code highlights the changes:

#include <atomic>
void increment(atomic<int>& counter)
{
	for(int i = 0;i < 100; ++i)
	{
		++counter;
		this_thread::sleep_for(1ms);
	}
}
int main()
{
	atomic<int> counter(0);
    vector<thread> threads;
    for(int i = 0; i < 10; ++i)
    {
        threads.push_back(thread{increment,ref(counter)});
    }
    for(auto& t : threads)
    {
        t.join();
    }
    cout<<"Result = "<<counter<<endl;
}

This code adds the <atomic> header and changes the type of the shared counter from int to std::atomic<int>. Running this improved version always yields 1000:

xz@xiaqiu:~/study/test/test$ ./test
Result = 1000
xz@xiaqiu:~/study/test/test$ ./test
Result = 1000
xz@xiaqiu:~/study/test/test$ ./test
Result = 1000
xz@xiaqiu:~/study/test/test$ ./test
Result = 1000
xz@xiaqiu:~/study/test/test$ ./test
Result = 1000
xz@xiaqiu:~/study/test/test$ ./test
Result = 1000
xz@xiaqiu:~/study/test/test$ ./test
Result = 1000
xz@xiaqiu:~/study/test/test$ ./test
Result = 1000
xz@xiaqiu:~/study/test/test$ 

Without explicitly adding any synchronization mechanism, this code is thread safe and free of race conditions, because the ++counter operation on an atomic type loads the value, increments it, and stores it back as one atomic transaction that cannot be interrupted. However, the modified code introduces a new problem: performance. You should try to minimize the amount of synchronization, whether atomic operations or explicit synchronization, because it hurts performance. For this simple example, the recommended best solution is to have increment() calculate its result in a local variable and add it to the counter reference only once, after the loop. Note that the atomic type is still required, because counter is still written from multiple threads:

void increment(atomic<int>& counter)
{
	int result = 0;
	for(int i = 0;i<100;++i)
	{
		++result;
		this_thread::sleep_for(1ms);
	}
	counter += result;
}

Atomic operation

The C++ standard defines a number of atomic operations. This section describes a few of them. For a complete list, consult a standard library reference (see Appendix B).

Here is an example of an atomic operation:

bool atomic<T>::compare_exchange_strong(T& expected,T desired);

This operation implements the following logic in an atomic manner, and the pseudo code is as follows:

if(*this == expected)
{
	*this = desired;
	return true;
}
else
{
    expected = *this;
    return false;
}

This logic may seem strange at first, but it is a key component in writing lock free concurrent data structures. Lockless concurrent data structures allow data to be manipulated without any synchronization mechanism. But implementing such data structures is a high-level topic that is beyond the scope of this book.

Another example is fetch_add() for integral atomic types. This operation fetches the current value of the atomic type, adds the given increment to the atomic value, and returns the original, non-incremented value. For example:

atomic<int> value(10);
cout << "Value = " << value << endl;
int fetched = value.fetch_add(4);
cout << "Fetched = " << fetched << endl;
cout << "value = " << value << endl;

If no other thread operates on the contents of fetched and value variables, the output is as follows:

xz@xiaqiu:~/study/test/test$ ./test
Value = 10
Fetched = 10
value = 14
xz@xiaqiu:~/study/test/test$

Integral atomic types support the following atomic operations: fetch_add(), fetch_sub(), fetch_and(), fetch_or(), fetch_xor(), ++, --, +=, -=, &=, |=, and ^=. Atomic pointer types support fetch_add(), fetch_sub(), ++, --, +=, and -=. Most atomic operations accept an extra parameter specifying the desired memory ordering. For example:

T atomic<T>::fetch_add(T value,memory_order = memory_order_seq_cst);

You can change the default memory order. The C++ standard provides memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_release, memory_order_acq_rel, and memory_order_seq_cst, all defined in the std namespace. However, you will rarely want to use anything other than the default. Although another memory order may perform better than the default by some measure, using it even slightly incorrectly reintroduces race conditions or other hard-to-track thread-related problems. If you need more information about memory ordering, consult the multithreading references in Appendix B.

mutex

If you are writing multithreaded applications, you have to be extra sensitive to the ordering of operations. If your threads read and write shared data, problems can occur. There are many ways to avoid them, such as never actually sharing data between threads. However, when sharing cannot be avoided, you must provide a synchronization mechanism so that only one thread at a time can change the data. Scalars such as Booleans and integers can often be synchronized with the atomic operations described above, but when your data is more complex and must be used from multiple threads, you have to provide explicit synchronization. The standard library supports mutual exclusion in the form of mutex classes and lock classes. These can be used to implement synchronization between threads and are discussed next.

Mutex class

The basic mechanism for using a mutex (short for mutual exclusion) is as follows:

A thread that wants to read from or write to memory shared with other threads attempts to lock the mutex object. If another thread currently holds the lock, the thread that wants access blocks until the lock is released or until a timeout interval expires.

Once a thread has obtained the lock, it is free to use the shared memory. Of course, this assumes that all threads wanting to use the shared data correctly acquire the lock on the mutex object.

After the thread has finished reading and writing the shared memory, it releases the lock, giving another thread the opportunity to acquire the lock and access the shared memory. If two or more threads are waiting for the lock, there is no guarantee which thread will acquire it first and be allowed to proceed.

The C++ standard provides non-timed mutex classes and timed mutex classes.

  1. Non-timed mutex classes

The standard library has three non-timed mutex classes: std::mutex, recursive_mutex, and shared_mutex (available since C++17). The first two are defined in <mutex>, the last in <shared_mutex>. Each class supports the following methods. lock(): the calling thread attempts to acquire the lock and blocks until the lock has been acquired. This method blocks indefinitely; if you want to bound how long a thread can block, use a timed mutex class instead.

try_lock(): the calling thread attempts to acquire the lock. If the lock is currently held by another thread, the call returns immediately. try_lock() returns true if the lock was successfully acquired, and false otherwise.

unlock(): releases the lock held by the calling thread so that another thread can acquire it. std::mutex is a standard mutex class with exclusive ownership semantics: only one thread can own the mutex at a time. If another thread wants ownership, it either blocks in lock() or fails in try_lock(). A thread that already owns a std::mutex must not call lock() or try_lock() again on that mutex, otherwise it risks a deadlock! std::recursive_mutex behaves almost identically to std::mutex, except that a thread that already owns a recursive mutex is allowed to call lock() or try_lock() on the same mutex again. The calling thread should call unlock() as many times as it locked the recursive mutex.

shared_mutex supports the concept of shared lock ownership, also known as readers-writers locking. A thread can acquire either exclusive ownership or shared ownership of the lock. Exclusive ownership, also called a write lock, can be acquired only when no other thread has exclusive or shared ownership. Shared ownership, also called a read lock, can be acquired when no other thread has exclusive ownership; other threads are still allowed to acquire shared ownership at the same time. The shared_mutex class supports lock(), try_lock(), and unlock(), which acquire and release an exclusive lock. In addition, it has the shared-ownership counterparts lock_shared(), try_lock_shared(), and unlock_shared(), which work similarly to the other set of methods but acquire or release shared ownership. A thread that already holds a lock on a shared_mutex is not allowed to acquire a second lock on that mutex, otherwise a deadlock can occur!

  2. Timed mutex classes

The standard library provides three timed mutex classes: std::timed_mutex, recursive_timed_mutex, and shared_timed_mutex. The first two are defined in <mutex>, the last in <shared_mutex>. They all support the lock(), try_lock(), and unlock() methods; shared_timed_mutex additionally supports lock_shared(), try_lock_shared(), and unlock_shared(). All of these behave as described earlier. In addition, they support the following methods.

try_lock_for(rel_time): the calling thread attempts to acquire the lock within the given relative time. If the lock cannot be acquired, the call fails and returns false. If the lock is acquired before the timeout, the call succeeds and returns true. The timeout is specified as an std::chrono::duration, discussed in Chapter 20.

try_lock_until(abs_time): the calling thread attempts to acquire the lock until the system time equals or exceeds the specified absolute time. If the lock can be acquired before the timeout, the call returns true. If the system time passes the given absolute time, the function stops trying to acquire the lock and returns false. The absolute time is specified as an std::chrono::time_point, discussed in Chapter 20.

shared_timed_mutex also supports try_lock_shared_for() and try_lock_shared_until(). A thread that already owns a timed_mutex or shared_timed_mutex is not allowed to acquire a lock on that mutex again, otherwise a deadlock can occur!

recursive_timed_mutex behaves like recursive_mutex in that it allows a single thread to acquire the lock multiple times.

Warning

Do not manually call the locking and unlocking methods above on any of the mutex classes. Mutexes are resources and, like all resources, should almost always be acquired using the RAII (Resource Acquisition Is Initialization) paradigm; see Chapter 28. The C++ standard defines a number of RAII lock classes, which are important for avoiding deadlocks. A lock object automatically releases its mutex when it goes out of scope, so there is no need to call unlock() manually.

lock

A lock class is an RAII class that makes it easier to correctly acquire and release a lock on a mutex; the destructor of a lock class automatically releases the associated mutex. The C++ standard defines four lock classes: std::lock_guard, unique_lock, shared_lock, and scoped_lock. The last one was introduced in C++17.

1. lock_guard

lock_guard is defined in <mutex> and has two constructors.

 explicit lock_guard(mutex_type& m);

A constructor that accepts a reference to a mutex. It attempts to acquire the lock on the mutex and blocks until the lock is obtained. Chapter 9 discusses the explicit keyword on constructors.

lock_guard(mutex_type& m, adopt_lock_t);

A constructor that accepts a reference to a mutex and an instance of std::adopt_lock_t. C++ provides a predefined adopt_lock_t instance called std::adopt_lock. The lock assumes that the calling thread has already acquired the lock on the referenced mutex; it takes over management of the lock and automatically releases the mutex when the lock object is destroyed.

2. unique_lock

std::unique_lock, defined in <mutex>, is a more sophisticated kind of lock. It allows you to defer acquiring the lock until later in the computation, long after the point of declaration. The owns_lock() method lets you determine whether the lock has been acquired. unique_lock also has a bool conversion operator that can be used to check whether the lock has been acquired; examples using this conversion operator appear later in this chapter. unique_lock has the following constructors.

explicit unique_lock(mutex_type& m);

A constructor that accepts a reference to a mutex. It attempts to acquire the lock on the mutex and blocks until the lock is obtained.

unique_lock(mutex_type& m, defer_lock_t) noexcept;

A constructor that accepts a reference to a mutex and an instance of std::defer_lock_t. C++ provides a predefined defer_lock_t instance called std::defer_lock. The unique_lock stores a reference to the mutex but does not attempt to acquire the lock immediately; the lock can be acquired later.

unique_lock(mutex_type& m, try_to_lock_t);

A constructor that accepts a reference to a mutex and an instance of std::try_to_lock_t. C++ provides a predefined try_to_lock_t instance called std::try_to_lock. The lock attempts to acquire the lock on the referenced mutex, but does not block if the attempt fails; in that case, the lock can be acquired later.

unique_lock(mutex_type& m, adopt_lock_t);

A constructor that accepts a reference to a mutex and an instance of std::adopt_lock_t. C++ provides a predefined adopt_lock_t instance called std::adopt_lock. The lock assumes that the calling thread has already acquired the lock on the referenced mutex; it takes over management of the lock and releases the mutex when the lock object is destroyed.

unique_lock(mutex_type& m, const chrono::time_point<Clock, Duration>&abs_time);

A constructor that accepts a reference to a mutex and an absolute time. It attempts to acquire the lock until the system time exceeds the given absolute time.

unique_lock(mutex_type& m, const chrono::duration<Rep, Period>& rel_time);

A constructor that accepts a reference to a mutex and a relative time. It attempts to acquire the lock on the mutex until the given relative timeout is reached.

The unique_lock class also has the methods lock(), try_lock(), try_lock_for(), try_lock_until(), and unlock(), which behave like the corresponding methods of the timed mutex classes described earlier.

  3. shared_lock

The shared_lock class, defined in <shared_mutex>, has the same constructors and methods as unique_lock. The difference is that shared_lock calls the shared-ownership methods on the underlying shared mutex. That is, shared_lock's methods are still called lock(), try_lock(), and so on, but on the underlying shared mutex they call lock_shared(), try_lock_shared(), and so on. As a result, shared_lock has the same interface as unique_lock and can be used in its place, except that it acquires a shared lock instead of an exclusive one.

  4. Acquiring multiple locks at once

C++ has two generic lock functions that can be used to acquire locks on multiple mutex objects at once without risking deadlock. Both are defined in the std namespace, and both are variadic template functions; Chapter 22 discusses variadic template functions. The first function, lock(), locks all the given mutex objects in an unspecified order, without risk of deadlock. If one of the mutex lock calls throws an exception, unlock() is called on all the locks that have already been acquired. The prototype is as follows:

template <class L1, class L2, class... L3> void lock(L1&, L2&, L3&...);

The try_lock() function has a similar prototype, but it tries to acquire locks on all the given mutex objects by calling try_lock() on each of them in sequence. If all the try_lock() calls succeed, the function returns -1. If any try_lock() call fails, unlock() is called on all the locks already acquired, and the return value is the zero-based index of the parameter on which try_lock() failed. The following example demonstrates the generic lock() function. The process() function first creates two locks, one for each mutex, passing an std::defer_lock_t instance as the second argument to tell unique_lock not to acquire the lock during construction. It then calls std::lock() to acquire both locks without risking deadlock:

mutex mut1;
mutex mut2;
void process()
{
    unique_lock lock1(mut1, defer_lock); // C++17
    unique_lock lock2(mut2, defer_lock); // C++17
    //unique_lock<mutex> lock1(mut1, defer_lock);
    //unique_lock<mutex> lock2(mut2, defer_lock);
    lock(lock1, lock2);
    // Locks acquired
} // Locks automatically released

  5. scoped_lock

std::scoped_lock, defined in <mutex>, is similar to lock_guard, except that it accepts a variable number of mutexes. This makes it easy to acquire multiple locks at once. For example, the previous example with the process() function can be written using scoped_lock as follows:

mutex mut1;
mutex mut2;
void process()
{
    scoped_lock locks(mut1, mut2);
    // Locks acquired
} // Locks automatically released

This uses C++17 class template argument deduction for constructors. If your compiler does not yet support this feature, you must write the following instead:

scoped_lock<mutex, mutex> locks(mut1, mut2);

std::call_once

std::call_once() combined with std::once_flag ensures that a function or method is called exactly once, no matter how many threads invoke call_once() (on the same once_flag). Only one call_once() invocation actually calls the given function or method. If the given function does not throw any exception, this invocation is called the effective call_once() invocation. If the given function does throw an exception, the exception is propagated back to the caller, and another caller is selected to execute the function. The effective invocation on a given once_flag instance completes before all other call_once() invocations on the same once_flag instance. Other threads calling call_once() on the same once_flag instance block until the effective invocation has finished. Figure 23-3 illustrates this with three threads: thread 1 performs the effective call_once() invocation, thread 2 blocks until that effective invocation is complete, and thread 3 does not block because the effective invocation of thread 1 has already finished.

The following example demonstrates the use of call_once(). It launches three threads, each running processingFunction(), which uses some shared resources. These resources should be initialized by calling initializeSharedResources() exactly once. To accomplish this, each thread calls call_once() with a global once_flag; the result is that only one thread executes initializeSharedResources(), and only once. While that call_once() invocation is in progress, other threads block until initializeSharedResources() has returned:

once_flag gOnceFlag;
void initializeSharedResources()
{
    // ... Initialize shared resources to be used by multiple threads.
    cout << "Shared resources initialized." << endl;
}
void processingFunction()
{
    // Make sure the shared resources are initialized.
    call_once(gOnceFlag, initializeSharedResources);
    // ... Do some work, including using the shared resources
    cout << "Processing" << endl;
}
int main()
{
    // Launch 3 threads.
    vector<thread> threads(3);
    for (auto& t : threads) 
    {
    	t = thread{ processingFunction };
    }
    // Join on all threads
    for (auto& t : threads) 
    {
    	t.join();
    }
}

output

xz@xiaqiu:~/study/test/test$ ./test
Shared resources initialized.
Processing
Processing
Processing
xz@xiaqiu:~/study/test/test$ ./test
Shared resources initialized.
Processing
Processing
Processing
xz@xiaqiu:~/study/test/test$ 

Of course, in this example you could simply call initializeSharedResources() at the start of main(), before launching the threads, but that would not demonstrate the use of call_once().

Usage examples of mutex objects

Here are a few examples of how to use mutex objects to synchronize multiple threads.

  1. Writing to a stream in a thread-safe manner

The section on threads earlier in this chapter used a class called Counter. That example mentioned that C++ streams are race free, but output from different threads can still be interleaved. To solve this problem, you can use a mutex object to ensure that only one thread at a time reads from or writes to the stream object. The following example synchronizes all writes to cout in the Counter class. To accomplish this, a static mutex object is added to the class; it should be static because all instances of the class should use the same mutex instance. Before writing to cout, a lock_guard is used to acquire the lock on this mutex object. The code differs from the earlier version as follows:

class Counter
{
	public:
		Counter(int id,int numIterations)
			:mId(id),mNumIterations(numIterations)
			{
			
			}
		void operator()() const
		{
			for(int i = 0;i<mNumIterations;++i)
			{
				lock_guard lock(sMutex);
				cout<<"Counter "<<mId<<" has value "<<i<<endl;
			}
		}
	private:
		int mId;
		int mNumIterations;
		static mutex sMutex;
};
mutex Counter::sMutex;

This code creates a lock_guard instance on each iteration of the for loop. It is recommended to hold a lock for as short a time as possible, otherwise other threads are blocked for too long. For example, if the lock_guard instance were created once, before the for loop, the code would essentially lose all its multithreaded behavior: one thread would hold the lock for the entire duration of its for loop, and all other threads would wait for that lock to be released.

  2. Using a timed lock

The following example demonstrates how to use a timed mutex. It is the same Counter class as before, but this time it uses a timed_mutex in combination with unique_lock. A relative time of 200 milliseconds is passed to the unique_lock constructor, which tries to acquire the lock within 200 milliseconds. If the lock cannot be acquired within that interval, the constructor returns anyway. Afterward, you can check whether the lock was acquired; you can perform this check with an if statement on the lock variable, because unique_lock defines a bool conversion operator. The timeout is specified using the chrono library, discussed in Chapter 20.

class Counter
{
	public:
		Counter(int id,int numIterations)
			:mId(id),mNumIterations(numIterations)
			{
			
			}
		void operator()() const
		{
			for(int i = 0; i<mNumIterations;++i)
			{
				unique_lock lock(sTimedMutex, 200ms);
				if(lock)
				{
					cout<<"Counter "<<mId<<" has value "<<i<<endl;
				}
				else
				{
					//Lock not acquired in 200ms,skip output
				}
			}
		}
    private:
    	int mId;
    	int mNumIterations;
    	static timed_mutex sTimedMutex;
};
timed_mutex Counter::sTimedMutex;

  3. Double-checked locking

Double-checked locking is actually an anti-pattern and should be avoided! It is presented here because you may come across it in existing code bases. The double-checked locking pattern tries to avoid the use of mutex objects; it is a half-baked attempt to write more efficient code than a mutex-based solution. Things can really go wrong when you try to make it even faster than the example below, for instance by using relaxed atomics (not discussed in this chapter), replacing the atomic with a plain Boolean, and so on. The pattern is prone to race conditions and hard to get right. Ironically, using call_once() is actually faster, and using a magic static (where applicable) is faster still. A function-local static instance is called a magic static; C++ guarantees that such local static instances are initialized in a thread-safe manner, so no manual thread synchronization is needed. An example using a magic static appears with the discussion of the singleton pattern in Chapter 29.

Warning:

In new code, avoid the double-checked locking pattern; use other mechanisms instead, such as simple locks, atomic variables, call_once(), and magic statics.

As an example, double-checked locking could be used to ensure that resources are initialized exactly once. The following example shows how this can be implemented. The algorithm is called double-checked locking because it checks the value of the gInitialized variable twice, once before acquiring the lock and once after. The first gInitialized check prevents acquiring the lock when it is not needed. The second check is required to make sure that no other thread performed the initialization between the first gInitialized check and acquiring the lock.

void initializeSharedResources()
{
    // ... Initialize shared resources to be used by multiple threads.
    cout << "Shared resources initialized." << endl;
}
atomic<bool> gInitialized(false);
mutex gMutex;
void processingFunction()
{
    if (!gInitialized) 
    {
        unique_lock lock(gMutex);
        if (!gInitialized) 
        {
            initializeSharedResources();
            gInitialized = true;
        }
    }
	cout << "OK" << endl;
}	
int main()
{
    vector<thread> threads;
    for (int i = 0; i < 5; ++i) 
    {
    	threads.push_back(thread{ processingFunction });
    }
    for (auto& t : threads) 
    {
    	t.join();
    }
}

The output clearly shows that only one thread has initialized the shared resource:

xz@xiaqiu:~/study/test/test$ ./test
Shared resources initialized.
OK
OK
OK
OK
OK
xz@xiaqiu:~/study/test/test$ ./test
Shared resources initialized.
OK
OK
OK
OK
OK
xz@xiaqiu:~/study/test/test$ 

Note:

For this example, the recommendation is to use call_once() instead of double-checked locking.

Condition variables

Condition variables allow a thread to block until a condition is set by another thread or until the system time reaches a specified time. They allow for explicit inter-thread communication. If you are familiar with multithreaded programming using the Win32 API, you can compare condition variables to event objects in Windows. You need to include the <condition_variable> header file to use condition variables. There are two kinds of condition variables. std::condition_variable: a condition variable that can wait only on a unique_lock<mutex>; according to the C++ standard, this allows for maximum efficiency on certain platforms. std::condition_variable_any: a condition variable that can wait on any kind of object, including custom lock types.

The condition_variable class supports the following methods.

➤➤ notify_one();
Wakes up one of the threads waiting on this condition variable. This is similar to an auto-reset event on Windows.
➤➤ notify_all();
Wakes up all threads waiting on this condition variable.
➤➤ wait(unique_lock<mutex>& lk);
The thread calling wait() should already have acquired the lock on lk. The effect of calling wait() is to atomically call lk.unlock() and block the thread, waiting for a notification. When the thread is unblocked by a notify_one() or notify_all() call in another thread, the function calls lk.lock() again, possibly blocking on that lock, and then returns.
➤➤ wait_for(unique_lock<mutex>& lk, const chrono::duration<Rep, Period>& rel_time);
Similar to the previous wait() method, except that the thread is unblocked by a notify_one() or notify_all() call, or when the given relative timeout expires.
➤➤ wait_until(unique_lock<mutex>& lk, const chrono::time_point<Clock, Duration>& abs_time);
Similar to the previous wait() method, except that the thread is unblocked by a notify_one() or notify_all() call, or when the system time passes the given absolute time.

There are also overloads of wait(), wait_for(), and wait_until() that accept an extra predicate parameter. For example, the wait() overload that accepts a predicate is equivalent to:

while (!predicate())
	wait(lk);

The condition_variable_any class supports the same methods as the condition_variable class, except that it accepts any kind of lock class rather than only unique_lock<mutex>. The lock class used should provide lock() and unlock() methods.

Spurious wakeups

Threads waiting on a condition variable can wake up when another thread calls notify_one() or notify_all(), when the system time passes a given time, or spuriously. This means a thread can wake up even if no other thread has called any notify method. Therefore, when a thread waits on a condition variable and wakes up, it needs to check whether it woke up because of a notification. One way to check is to use the wait() overloads that accept a predicate parameter.

Using condition variables

Condition variables can be used, for example, with a background thread that processes items from a queue. You define a queue into which items to process are inserted. The background thread waits until there is an item in the queue. When an item is inserted, the thread wakes up, processes the item, and then goes back to sleep, waiting for the next item. Suppose you have the following queue:

queue<string> mQueue;

You also need to ensure that only one thread modifies the queue at any given time. This can be achieved with a mutex:

mutex mMutex;

In order to notify the background thread when an item is added, a condition variable is required:

condition_variable mCondVar;

A thread that needs to add an item to the queue first acquires the lock on the mutex, then adds the item to the queue, and finally notifies the background thread. notify_one() or notify_all() can be called whether or not the caller currently owns the lock; both work correctly:

// Lock mutex and add entry to the queue.
unique_lock lock(mMutex);
mQueue.push(entry);
// Notify condition variable to wake up thread.
mCondVar.notify_all();

The background thread waits for notifications in an infinite loop. Note that the wait() overload with a predicate parameter is used here to correctly handle spurious wakeups. The predicate checks whether there is an item in the queue, so when the call to wait() returns, you can be sure there is an item in the queue.

unique_lock lock(mMutex);
while (true) 
{
    // Wait for a notification.
    mCondVar.wait(lock, [this]{ return !mQueue.empty(); });
    // Condition variable is notified, so something is in the queue.
    // Process queue item...
}

Section 23.7 gives a complete example showing how to send notifications to other threads with condition variables. The C++ standard also defines the helper function std::notify_all_at_thread_exit(cond, lk), where cond is a condition variable and lk is a unique_lock<mutex> instance. The calling thread should already have acquired the lock lk. When the thread exits, the following code is executed automatically:

lk.unlock();
cond.notify_all();

Note:

The lock lk stays locked until the thread exits. Therefore, make sure this does not cause any deadlocks in your code, for example deadlocks caused by a wrong lock ordering. Deadlocks were discussed earlier in this chapter.

future

As discussed earlier in this chapter, a thread can be started with std::thread to compute a result, but it is not easy to retrieve that result once the thread has finished executing. Another problem with std::thread is handling errors such as exceptions. If a thread throws an exception that is not handled by the thread itself, the C++ runtime calls std::terminate(), which usually terminates the whole application. A future makes it easier both to obtain the result of a thread and to transport an exception to another thread, which can then handle the exception however it wants. Of course, you should always try to handle exceptions in the thread itself and not let them leave the thread. A future stores a result in a promise. The result stored in a promise can be retrieved through the future: the promise is the input side of the result, the future is the output side. Once a function, running in the same thread or in another thread, has computed the value you want to return, it puts the value in a promise; the value can then be retrieved through the future. You can think of a future/promise pair as a communication channel for transmitting results between threads. C++ provides a standard future called std::future<T>. A result is retrieved from an std::future as follows, where T is the type of the computed result:

future<T> myFuture = ...; // Is discussed later
T result = myFuture.get();

Calling get() retrieves the result and stores it in the variable result. If another thread has not yet finished computing the result, the call to get() blocks until the value becomes available. get() may be called only once on a future; according to the standard, the behavior of a second call is undefined. You can avoid blocking by first asking the future whether the result is available:

if (myFuture.wait_for(0s) == future_status::ready)
{
    // Value is available
	T result = myFuture.get();
}
else
{
    // Value is not yet available
    ...
}

23.6.1 std::promise and std::future

C++ provides the std::promise class as a way to implement the concept of a promise. You can call set_value() on a promise to store the result, or call set_exception() to store an exception in the promise. Note that set_value() or set_exception() may be called only once on a given promise; calling it a second time throws an std::future_error exception. If thread A launches another thread B to perform a computation, thread A can create an std::promise and pass it to the launched thread. Note that a promise cannot be copied, but it can be moved into the thread! Thread B uses the promise to store the result. Before moving the promise into thread B, thread A calls get_future() on the created promise so that it can access the result once thread B has finished. Here is a simple example:

void DoWork(promise<int> thePromise)
{
    // ... Do some work ...
    // And ultimately store the result in the promise.
    thePromise.set_value(42);
}
int main()
{
    // Create a promise to pass to the thread.
    promise<int> myPromise;
    // Get the future of the promise.
    auto theFuture = myPromise.get_future();
    // Create a thread and move the promise into it.
    thread theThread{ DoWork, std::move(myPromise) };
    // Do some more work...
    // Get the result.
    int result = theFuture.get();
    cout << "Result: " << result << endl;
    // Make sure to join the thread.
    theThread.join();
}

Note

This code is for demonstration purposes only. It starts the computation in a new thread and then calls get() on the future, which blocks until the result has been computed. That amounts to an expensive function call. In a real application using the future model, you would periodically check whether the future has a result available (with wait_for(), described earlier), or use a synchronization mechanism such as a condition variable, and do other work while the result is not yet available instead of blocking.

23.6.2 std::packaged_task

With std::packaged_task, working with promises becomes easier than explicitly using std::promise as in section 23.6.1. The following code demonstrates this. It creates a packaged_task to execute CalculateSum() and retrieves the future from the packaged_task by calling get_future(). A thread is launched and the packaged_task is moved into it; a packaged_task cannot be copied! After launching the thread, get() is called on the retrieved future to obtain the result; this blocks until the result is available. CalculateSum() does not need to explicitly store anything in any kind of promise. A packaged_task automatically creates a promise, automatically stores the result of the called function (here CalculateSum()) in that promise, and automatically stores any exception thrown by the function in the promise.

int CalculateSum(int a, int b) { return a + b; }
int main()
{
    // Create a packaged task to run CalculateSum.
    packaged_task<int(int, int)> task(CalculateSum);
    // Get the future for the result of the packaged task.
    auto theFuture = task.get_future();
    // Create a thread, move the packaged task into it, and
    // execute the packaged task with the given arguments.
    thread theThread{ std::move(task), 39, 3 };
    // Do some more work...
    // Get the result.
    int result = theFuture.get();
    cout << result << endl;
    // Make sure to join the thread.
    theThread.join();
}

23.6.3 std::async

If you want to give the C++ runtime more control over whether or not a thread is created for a computation, you can use std::async(). It accepts a function to be executed and returns a future that can be used to retrieve the result. async() can run the function in two ways:

By creating a new thread and running the provided function asynchronously.

By running the function synchronously on the calling thread when the get() method is called on the returned future.

If async() is called without additional arguments, the C++ runtime automatically chooses between these two methods based on factors such as the number of processors in the system. You can also steer the C++ runtime's behavior by specifying a launch policy argument:

launch::async: forces the C + + runtime to execute functions asynchronously on a different thread.

launch::deferred: forces the C + + runtime to execute functions synchronously on the calling thread when calling get().

launch::async | launch::deferred: lets the C++ runtime choose (the default behavior).

The following example demonstrates the use of async():

int calculate()
{
	return 123;
}
int main()
{
    auto myFuture = async(calculate);
    //auto myFuture = async(launch::async, calculate);
    //auto myFuture = async(launch::deferred, calculate);
    // Do some more work...
    // Get the result.
    int result = myFuture.get();
    cout << result << endl;
}

As this example shows, std::async() is one of the easiest ways to perform a computation asynchronously (in a different thread) or synchronously (in the calling thread) and obtain the result afterward.

Warning:

The future returned by a call to async() blocks in its destructor until the result is available. This means that if you call async() without capturing the returned future, the async() call effectively becomes a blocking call! For example, the following statement calls calculate() synchronously:

async(calculate);

In this statement, async() creates and returns a future. Because this future is not captured, it is a temporary. Since it is a temporary, its destructor is called before the statement finishes, and that destructor blocks until the result is available.

exception handling

A great advantage of futures is that they automatically transport exceptions between threads. Calling get() on a future either returns the computed result or rethrows any exception stored in the promise associated with the future. When you use packaged_task or async(), any exception thrown from the launched function is automatically stored in the promise. If you use std::promise directly as the promise, you can call set_exception() to store an exception in it. Here is an example using async():

int calculate()
{
    throw runtime_error("Exception thrown from calculate().");
}

int main()
{
    // Use the launch::async policy to force asynchronous execution.
    auto myFuture = async(launch::async, calculate);
    // Do some more work...
    // Get the result.
    try
    {
        int result = myFuture.get();
        cout << result << endl;
    }
    catch (const exception& ex)
    {
        cout << "Caught exception: " << ex.what() << endl;
    }
}

std::shared_future

std::future<T> only requires T to be move constructible. When you call get() on a future, the result is moved out of the future and returned to you. This means you can call get() only once on a given future. If you want to call get() multiple times, possibly from multiple threads, you need to use std::shared_future<T>; in that case, T must be copy constructible. A shared_future can be created with std::future::share(), or by passing a future to the shared_future constructor. Note that a future is not copyable, so you have to move it into the shared_future constructor.

shared_future can be used to wake up multiple threads at once. For example, the following code fragment defines two lambda expressions that execute asynchronously on different threads. The first thing each lambda expression does is set a value on its respective promise to signal that it has started. It then calls get() on signalFuture, which blocks until a parameter is made available through the future, after which execution continues. Each lambda expression captures its own promise by reference and signalFuture by value, so both lambda expressions have a copy of signalFuture. The main thread uses async() to execute both lambda expressions on different threads, waits until both threads have started, and then sets the parameter in signalPromise to wake up both threads.

promise<void> thread1Started, thread2Started;
promise<int> signalPromise;
auto signalFuture = signalPromise.get_future().share();
//shared_future<int> signalFuture(signalPromise.get_future());
auto function1 = [&thread1Started, signalFuture] {
    thread1Started.set_value();
    // Wait until parameter is set.
    int parameter = signalFuture.get();
    // ...
};
auto function2 = [&thread2Started, signalFuture] {
    thread2Started.set_value();
    // Wait until parameter is set.
    int parameter = signalFuture.get();
    // ...
};
// Run both lambda expressions asynchronously.
// Remember to capture the future returned by async()!
auto result1 = async(launch::async, function1);
auto result2 = async(launch::async, function2);
// Wait until both threads have started.
thread1Started.get_future().wait();
thread2Started.get_future().wait();
// Both threads are now waiting for the parameter.
// Set the parameter to wake up both of them.
signalPromise.set_value(42);

Example: multithreaded Logger class

This section demonstrates how to use threads, mutex objects, locks, and condition variables to write a multithreaded Logger class. The class lets different threads add log messages to a queue. The Logger class itself processes this queue in another, background thread, which serially writes the log messages to a file. The class is designed in two iterations to illustrate some of the problems you can run into when writing multithreaded code. The C++ standard does not provide a thread-safe queue, so access to the queue obviously has to be protected with some synchronization mechanism to prevent multiple threads from reading from or writing to the queue at the same time. This example uses a mutex object and a condition variable to provide the synchronization. With those, the Logger class can be defined as follows:

class Logger
{
public:
    // Starts a background thread writing log entries to a file.
    Logger();
    // Prevent copy construction and assignment.
    Logger(const Logger& src) = delete;
    Logger& operator=(const Logger& rhs) = delete;
    // Add log entry to the queue.
    void log(std::string_view entry);
private:
    // The function running in the background thread.
    void processEntries();
    // Mutex and condition variable to protect access to the queue.
    std::mutex mMutex;
    std::condition_variable mCondVar;
    std::queue<std::string> mQueue;
    // The background thread.
    std::thread mThread;
};

The implementation follows. Note that this initial design has several problems: when you try to run the program, it may behave strangely or even crash. These problems are discussed and solved in the next iteration of the Logger class. Also note the inner while loop in the processEntries() method. It processes all messages in the queue, one at a time, acquiring and releasing the lock on each iteration. This is done to make sure the loop does not hold the lock for too long, which would block other threads from running.

Logger::Logger()
{
	//Start background thread
	mThread = thread(&Logger::processEntries, this);
}
void Logger::log(string_view entry)
{
    //Lock mutex and add entry to the queue
    unique_lock lock(mMutex);
    mQueue.push(string(entry));
    //Notify condition variable to wake up thread
    mCondVar.notify_all();
}
void Logger::processEntries()
{
    //Open log file
    ofstream logFile("log.txt");
    if(logFile.fail())
    {
        cerr<<"Failed to open logfile "<<endl;
        return ;
    }
    //Start processing loop
    unique_lock lock(mMutex);
    while(true)
    {
        //Wait for a notification
        mCondVar.wait(lock);
        
        //Condition variable is notified, something might be in the queue
        lock.unlock();
        while(true)
        {
            lock.lock();
            if(mQueue.empty())
            {
                break;
            }
            else
            {
                logFile<<mQueue.front()<<endl;
                mQueue.pop();
            }
            lock.unlock();
        }
    }
}

Warning:

As you can see from this fairly simple task, it is very difficult to write multithreaded code correctly. Unfortunately, at least up to and including C++17, the C++ standard only provides threads, atomics, mutex objects, condition variables, and futures; it does not provide any concurrent data structures. This may of course change in future versions. The Logger class is an example that demonstrates the basic building blocks. For production code, it is recommended to use a suitable third-party concurrent data structure instead of writing your own. For example, the open-source Boost C++ Libraries (see http://www.boost.org/) implement a lock-free queue, which allows concurrent use without any explicit synchronization.

You can test the Logger class with the following code. It starts a number of threads, all of which log some messages to the same Logger instance:

void logSomeMessages(int id, Logger& logger)
{
	for(int i = 0; i < 10; ++i)
	{
		stringstream ss;
		ss<<"Log entry "<<i<<" from thread "<<id;
		logger.log(ss.str());
	}
}
int main()
{
	Logger logger;
	vector<thread> threads;
	//Create a few threads all working with the same Logger instance
    for(int i = 0; i < 10;i++)
    {
        threads.emplace_back(logSomeMessages, i, ref(logger));
    }
    //Wait for all threads to finish
    for(auto& t : threads)
    {
        t.join();
    }
}

If you build and run this initial version, you will notice that the application terminates abruptly. The cause is that the application never calls join() or detach() on the background thread. Recall from earlier in this chapter that the destructor of a thread object that is still joinable calls std::terminate(), which abruptly terminates the running thread and the application itself. This means that messages still in the queue are not written to the file on disk. Some runtime libraries even report an error or generate a crash dump when the application terminates like this. You need to add a mechanism to shut down the background thread gracefully, and to wait until it has fully shut down before terminating the application itself. This is solved by adding a destructor and a Boolean member variable to the class. The new Logger class is defined as follows:

class Logger
{
	public:
		//Gracefully shut down background thread
		virtual ~Logger();
		//Other public members omitted for brevity
	private:
		//Boolean telling the background thread to terminate
		bool mExit = false;
		//Other members omitted for brevity
};

The destructor sets mExit to true, wakes up the background thread, and waits until the background thread has shut down. The destructor acquires a lock on mMutex before setting mExit to true and calling notify_all(). This is to prevent a race condition and deadlock with processEntries(): processEntries() can be at the start of its while loop, between checking mExit and calling wait(). If the main thread calls the Logger destructor at exactly that moment and the destructor did not acquire a lock on mMutex, the destructor could set mExit to true and call notify_all() after processEntries() has checked mExit but before it waits on the condition variable; processEntries() would then neither see the new value nor receive the notification. At that point the application would be deadlocked, because the destructor waits in the join() call while the background thread waits on the condition variable. Note that the destructor must release the lock on mMutex before calling join(), which explains the extra code block with curly braces.

Warning:

In general, when you set a variable that is part of a wait condition, you should hold a lock on the mutex associated with the condition variable.

Logger::~Logger()
{
	{
		unique_lock lock(mMutex);
		//Gracefully shut down the thread by setting mExit
		//to true and notifying the thread
		mExit = true;
		//Notify condition variable to wake up thread
		mCondVar.notify_all();
	}
	//Wait until thread is shut down. This should be outside the above code
	//block because the lock must be released before calling join()
	mThread.join();
}

The processEntries() method needs to check this boolean variable. When this boolean variable is true, terminate the processing loop:

void Logger::processEntries()
{
	//Open log file
	ofstream logFile("log.txt");
	if(logFile.fail())
	{
		cerr<<"Failed to open logfile "<<endl;
		return;
	}
	//Start processing loop
	unique_lock lock(mMutex);
	while(true)
	{
		if(!mExit) //Only wait for notifications if we don't have to exit
		{
			//Wait for a notification
			mCondVar.wait(lock);
		}
		//Condition variable is notified, so something might be in the queue
		//and/or we need to shut down this thread
		lock.unlock();
		while(true)
		{
			lock.lock();
			if(mQueue.empty())
			{
				break;
			}
			else
			{
				logFile<<mQueue.front()<<endl;
				mQueue.pop();
			}
			lock.unlock();
		}
		if(mExit)
		{
			break;
		}
	}
}

Note that you cannot check mExit only in the condition of the outer while loop, because even when mExit is true there might still be log entries in the queue that need to be written. You can add artificial delays at specific places in multithreaded code to trigger certain behavior. Note that such delays should be used only for testing and must be removed from final code. For example, to test that the race condition with the destructor is solved, remove all calls to log() from the main program, so that it calls the Logger destructor almost immediately, and add the following delay:

void Logger::processEntries()
{
	//Omitted for brevity
	//Start processing loop
	unique_lock lock(mMutex);
	while(true)
	{
		this_thread::sleep_for(1000ms); //Needs #include <chrono>
		if(!mExit) //Only wait for notifications if we don't have to exit
		{
			//Wait for a notification
			mCondVar.wait(lock);
		}
	}
}

Thread pool

Instead of dynamically creating and deleting threads throughout a program's lifetime, you can create a pool of threads that can be used as needed. This technique is commonly used in programs that want to handle some kind of event on a thread. In most environments, the ideal number of threads equals the number of processor cores. If there are more threads than cores, threads have to be suspended to allow other threads to run, which ultimately adds overhead. Note that while the ideal number of threads equals the number of cores, this applies only to compute-bound threads, which cannot block for other reasons such as I/O. When threads can block, it is often appropriate to run more threads than there are cores. Determining the optimal number of threads in such cases is difficult and may involve measuring the throughput of the system under a normal load.

Because not all computations are identical, threads in a thread pool commonly receive a function object or lambda expression as part of their input, representing the computation to be performed. Because the threads in a thread pool pre-exist, it is much more efficient for the operating system to schedule one of them to run than to create a new thread in response to an input. In addition, using a thread pool allows you to manage the number of threads that are created, so depending on the platform you may have as few as one thread or as many as thousands.

Several libraries implement thread pools, such as Intel Threading Building Blocks (TBB) and the Microsoft Parallel Patterns Library (PPL). It is recommended to use such a library for thread pools rather than writing your own implementation. If you do want to implement a thread pool yourself, it can be done in a way similar to an object pool. Chapter 25 gives an example implementation of an object pool.

Thread design and best practices

This section outlines several best practices for multithreaded programming.

- Use parallel Standard Library algorithms: the Standard Library contains a large number of algorithms, and starting with C++17, more than 60 of them support parallel execution. Use these parallel algorithms instead of writing your own multithreaded code whenever possible. See Chapter 18 for details on how to specify parallelism options for the algorithms.

- Before terminating the application, make sure no thread objects are still joinable: make sure join() or detach() has been called on every thread object. The destructor of a thread that is still joinable calls std::terminate(), which abruptly terminates all threads and the application.

- The best synchronization is no synchronization: multithreaded programming becomes much easier if you design your threads so that they only read from shared data without writing to it, or only write to parts that no other thread reads. In that case there is no need for any synchronization, and there can be no race conditions or deadlocks.

- Try to use the single-thread ownership pattern: this means that at most one thread owns a block of data at any time. Owning the data means that no other thread is allowed to read from or write to it. When the thread is finished with the data, the data can be passed off to another thread, which then has sole and complete responsibility/ownership of the data. No synchronization is necessary in this case.

- Use atomic types and operations when possible: atomic types and operations make it easier to write code free of race conditions and deadlocks, because they handle synchronization automatically. If atomic types and operations are not an option in your multithreaded design and you need shared data, you need to use a synchronization mechanism, such as mutual exclusion, to make sure the synchronization is correct.

- Use locks to protect mutable shared data: if you need mutable shared data to which multiple threads can write and you cannot use atomic types and operations, you must use a locking mechanism to make sure reads and writes between different threads are synchronized.

- Release locks as soon as possible: when you protect shared data with a lock, make sure you release the lock as soon as possible. While a thread is holding a lock, other threads may be blocked waiting for that lock, which can hurt performance.

- Don't manually acquire multiple locks; use std::lock() or std::try_lock() instead: if multiple threads need to acquire multiple locks, they must be acquired in the same order in all threads to prevent deadlocks. You can use the generic std::lock() or std::try_lock() functions to acquire multiple locks.

- Use RAII lock objects: use the lock_guard, unique_lock, shared_lock, or scoped_lock RAII classes to automatically release locks at the right time.

- Use a multithreading-aware profiler: use such a profiler to find performance bottlenecks in multithreaded applications and to analyze whether your multiple threads are really utilizing all available processing power in the system. An example of a multithreading-aware profiler is the profiler in certain editions of Visual Studio.

- Understand the multithreading support features of your debugger: most debuggers have at least basic support for debugging multithreaded applications. You should be able to get a list of all running threads in your application, and you should be able to switch to any thread and inspect its call stack. You can use these features, for example, to examine deadlocks, because you can see exactly what each thread is doing.

- Use thread pools instead of dynamically creating and destroying a large number of threads: dynamically creating and destroying a large number of threads degrades performance. In that case it is better to use a thread pool to reuse existing threads.

- Use higher-level multithreading libraries: currently, the C++ standard provides only basic building blocks for writing multithreaded code, and using them correctly is not easy. Where possible, use higher-level multithreading libraries, such as Intel Threading Building Blocks (TBB) or the Microsoft Parallel Patterns Library (PPL), rather than implementing everything yourself. Multithreaded programming is hard to master and error prone, and your own implementation will not necessarily work as expected.

Write efficient C + + programs

Is C + + an inefficient language

C programmers often resist using C++ for high-performance applications. They claim that the language is inherently less efficient than C or a similar procedural language because C++ includes high-level concepts such as exceptions and virtual functions. However, this claim is problematic. First, you cannot ignore the role of the compiler. When discussing the efficiency of a language, you must separate the performance of the language itself from the effectiveness of its compilers at optimizing it. Computers do not execute C or C++ code directly: a compiler first translates the code into machine language and applies optimizations in the process. This means you cannot simply run benchmarks of C and C++ programs and compare the results; you would really be comparing compiler optimizations, not the languages themselves. C++ compilers can optimize away many of the high-level constructs in the language to generate machine code similar to that produced from comparable C code. Moreover, more research and development investment currently goes into C++ compilers than into C compilers, so C++ code may actually be optimized better and run faster than comparable C code.

However, critics maintain that some C++ features cannot be optimized away. For example, as Chapter 10 explains, virtual functions require a vtable and an extra level of indirection at run time, making them slower than ordinary non-virtual function calls. On closer inspection, though, this argument remains unconvincing. A virtual function call is more than a function call: it also selects, at run time, which function to call. The comparable non-virtual call might need a conditional statement to make that selection. If you don't need these extra semantics, you can use a non-virtual function. A general design principle of the C++ language is that you don't pay for features you don't use: if you don't use virtual functions, you pay no performance penalty for the fact that virtual functions exist. So in C++, non-virtual function calls are identical in performance to function calls in C. However, because the overhead of a virtual call is so small, it is recommended to make all class methods, including destructors (but not constructors), virtual for all non-final classes.

More importantly, the high-level constructs of C++ let you write cleaner programs. Such programs are more efficient at the design level, more readable, and easier to maintain, and they avoid accumulating unnecessary and dead code. We believe that choosing C++ over a procedural language such as C pays off in development, performance, and maintenance. There are also other, higher-level object-oriented languages, such as C# and Java, both of which run on a virtual machine. C++ code is executed directly by the CPU; there is no virtual machine running the code. C++ is closer to the hardware, which means that in most cases it is faster than languages such as C# and Java.

Language level efficiency

Many books, articles, and programmers spend a lot of time trying to convince you to apply language-level optimizations to your code. These tips and tricks are important and can speed up programs in some cases. However, they are far less important than the overall design and the algorithms your program uses. You can pass by reference all you want, but it won't make your program faster if you write to disk twice as often as you need to. It's easy to get bogged down in references and pointers and forget about the big picture. Moreover, some language-level tricks can be performed automatically by a good optimizing compiler. You should not spend time optimizing a particular area yourself unless a profiler indicates that it is a bottleneck, as discussed later in this chapter.

Warning:

Use language-level optimizations with caution. It is recommended to first build a clear, well-structured design and implementation, then use a profiler, and invest time optimizing only those parts that the profiler flags as performance bottlenecks.

Manipulate objects efficiently

C++ does a lot of work for you behind the scenes, particularly work related to objects. You should always be aware of the performance impact of the code you write. If you follow some simple guidelines, your code will become more efficient. Note that these guidelines apply only to objects, not to primitive types such as bool, int, and float.

  1. Pass by reference

This rule has been discussed elsewhere in this book, but it is necessary to reiterate it here.

warning

You should try not to pass objects to functions or methods by value.

If the type of a function parameter is a base class and you pass an object of a derived class by value as the argument, the derived object is sliced to fit the base class type. This loses information; see Chapter 10 for details.

Passing by value incurs copying costs that passing by reference avoids. One reason this rule can be difficult to remember is that on the surface, passing by value doesn't appear to cause any problems. Consider the following Person class representing a person:

class Person
{
	public:
		Person() = default;
		Person(std::string_view firstName,std::string_view lastName,int age);
		virtual ~Person() = default;
		std::string_view getFirstName() const{ return mFirstName;}
		std::string_view getLastName() const{ return mLastName; }
		int getAge() const { return mAge; }
	private:
    	std::string mFirstName,mLastName;
    	int mAge = 0;
};

You can write a function to receive the Person object, as follows:

void processPerson(Person p)
{
	//Process the person
}

This function may be called as follows:

Person me("Marc","Gregoire",38);
processPerson(me);

It doesn't seem to require much more code to write the function like this instead:

void processPerson(const Person& p)
{
	//Process the person
}

The function call remains unchanged. However, consider what happens when you pass by value in the first version of the function. To initialize the p parameter of processPerson(), me must be copied with a call to its copy constructor. Even though you didn't write a copy constructor for the Person class, the compiler generates one that copies each of the data members. That still doesn't look so bad: there are only three data members. However, two of them are strings, which are themselves objects with copy constructors, so their copy constructors are called as well. The version of processPerson() that takes p by reference incurs no such copying costs. Thus, in this example, passing by reference avoids three constructor calls when the code enters the function.

But this example isn't finished yet. In the first version of processPerson(), p is a local variable of the processPerson() function, so it must be destroyed when the function exits. This requires a call to the Person destructor, which calls the destructors of all the data members. string has a destructor, so exiting this function (when passing by value) makes three destructor calls. None of those calls are needed when the Person object is passed by reference.

Note:

If the function must modify the object, you can pass the object by reference. If the function should not modify the object, pass it by const reference, as in the preceding example. For details on references and const, see Chapter 11.

Note:

Avoid passing by pointer, which is a relatively obsolete way of passing: it amounts to reverting to C and is rarely the right approach in C++ (unless passing nullptr has a special meaning in your design).

  2. Return by reference

Just as you should pass objects to functions by reference, you should also return them from functions by reference to avoid unnecessary copying. Sometimes, however, it is impossible to return objects by reference, such as when writing an overloaded operator+ or similar operators. You should never return a reference or a pointer to a local object, which will be destroyed when the function exits. Since C++11, the language supports move semantics, which allows you to efficiently return objects by value instead of using reference semantics.

  3. Catch exceptions by reference

As described in Chapter 14, you should catch exceptions by reference to avoid slicing and unnecessary copying. Throwing an exception is expensive in terms of performance, so anything you can do to make exceptions more efficient helps.

  4. Use move semantics

Classes should implement a move constructor and a move assignment operator to allow the C++ compiler to use move semantics for their objects. Following the rule of zero (see Chapter 9), design your classes so that the compiler-generated copy and move constructors and copy and move assignment operators suffice. If the compiler cannot implicitly define them, explicitly default them, if that works for your class. If that also doesn't work, implement them yourself. With move semantics for your objects, returning them by value from a function does not incur large copying costs and is therefore efficient. Move semantics is discussed in more detail in Chapter 9.

  5. Avoid creating temporary objects

The compiler creates temporary, unnamed objects in several circumstances. As explained in Chapter 9, after writing a global operator+ for a class, you can add objects of that class to objects of other types, as long as those other types can be converted to objects of the class. For example, part of the SpreadsheetCell class definition looks like this:

class SpreadsheetCell
{
	public:
		//Other constructors omitted for brevity
		SpreadsheetCell(double initialValue);
		//Remainder omitted for brevity
};
SpreadsheetCell operator+(const SpreadsheetCell& lhs,const SpreadsheetCell& rhs);


Added by antwonw on Tue, 05 Oct 2021 05:17:52 +0300