Item 31: Avoid default capture modes

Lambda expressions

  • A lambda expression is just that: an expression. It is part of the source code. In the call below, the lambda expression is the third argument:
std::find_if(container.begin(), container.end(),
            [](int val){ return 0 < val && val < 10; });
  • A closure is the runtime object created by a lambda. Depending on the capture mode, a closure holds copies of or references to the captured data. In the call above, the closure is the object passed at run time as the third argument to std::find_if.
  • A closure class is the class from which a closure is instantiated. Each lambda causes the compiler to generate a unique closure class. The statements inside a lambda become executable instructions in the member functions of its closure class.

A lambda is often used to create a closure that serves only as an argument to a function. However, closures can generally be copied, so a single lambda's closure type can have multiple closures.
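The point about copies can be sketched as follows. The function name copiesAgree and the values used are made up for illustration; the only claim is that copying a closure yields another object of the same closure class.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper for illustration: builds one closure, copies it twice,
// and reports whether all three closures give the same answer for y.
bool copiesAgree(int y)
{
    int x = 3;                                       // local captured by value
    auto c1 = [x](int v) { return x * v > 55; };     // c1 is a closure
    auto c2 = c1;                                    // c2 is a copy of c1
    auto c3 = c2;                                    // c3 is a copy of c2

    // All three closures are instances of the same compiler-generated class.
    static_assert(std::is_same<decltype(c1), decltype(c2)>::value,
                  "c1 and c2 share a closure type");
    static_assert(std::is_same<decltype(c2), decltype(c3)>::value,
                  "c2 and c3 share a closure type");

    return c1(y) == c2(y) && c2(y) == c3(y);
}
```

Each of c1, c2 and c3 holds its own copy of x, so the three closures behave identically but are distinct objects.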

Avoid default capture modes

There are two default capture modes in C++11: by reference and by value. Default by-reference capture can lead to dangling references. Default by-value capture lures you into thinking you're immune to that problem (you're not), and into thinking your closures are self-contained (they may not be).

A by-reference capture causes a closure to contain a reference to a local variable or to a parameter that is available in the scope where the lambda is defined. If the lifetime of a closure created from that lambda exceeds the lifetime of the local variable or parameter, the reference in the closure will dangle. For example, suppose we have a container of filter functions, each of which takes an int and returns a bool indicating whether the passed value satisfies the filter:

using FilterContainer = std::vector<std::function<bool(int)>>;			// alias declaration (using)

FilterContainer filters;			// container of filter functions

We can add a function that filters multiples of 5 as follows:

filters.emplace_back(
	[](int value){ return value % 5 == 0; }
);

However, we may need to compute the divisor at run time rather than hard-coding 5 into the lambda, so the code that adds the filter might look like this:

void addDivisorFilter()
{
    auto calc1 = computeSomeValue1();
    auto calc2 = computeSomeValue2();
    
    auto divisor = computeDivisor(calc1, calc2);
    
    filters.emplace_back(								// danger!
    [&](int value) { return value % divisor == 0; }		// the reference to
    );													// divisor will dangle

}

This code is a problem waiting to happen. The lambda refers to the local variable divisor, but that variable ceases to exist when addDivisorFilter returns. That is immediately after filters.emplace_back returns, so the function added to filters is essentially dead on arrival. Using that filter yields undefined behavior from virtually the moment it is created.

The same problem exists if divisor is captured by reference explicitly:

filters.emplace_back(
[&divisor](int value)
    { return value % divisor == 0; }		// danger! the reference to divisor will still dangle
);

With an explicit capture, however, it is easier to see that the viability of the lambda depends on divisor's lifetime. Writing out the name divisor also reminds us to make sure that divisor lives at least as long as the lambda's closures. That is a more specific reminder than the general "make sure nothing dangles" admonition that [&] conveys.

If you know that a closure will be used immediately (for example, by being passed to an STL algorithm) and won't be copied, there is no risk that its references will outlive the local variables and parameters of the scope where its lambda is created.

For example, our filtering lambda might be used only as an argument to C++11's std::all_of, which returns whether all elements in a range satisfy a condition:

template<typename C>
void workWithContainer(const C& container)
{
    auto calc1 = computeSomeValue1();
    auto calc2 = computeSomeValue2();
    
    auto divisor = computeDivisor(calc1, calc2);
    
    using ContElemT = typename C::value_type;
    
    using std::begin;
    using std::end;
    
    if(std::all_of(                                 // if all values
        begin(container), end(container),           // in container
        [&](const ContElemT& value)                 // are multiples
        { return value % divisor == 0; }))          // of divisor
    {
        ...
    }
    else
    {
        ...
    }
}

It's true that this is safe, but its safety is somewhat precarious. If the lambda were found to be useful in other contexts (for example, as a function to be added to the filters container) and were copied and pasted into a context where its closure could outlive divisor, you would be back to dangling references, and nothing in the capture clause would remind you to check divisor's lifetime.

In the long run, explicitly listing the local variables and parameters that a lambda depends on is simply better software engineering.

Incidentally, C++14 permits auto in lambda parameter declarations, so the code above can be simplified in C++14: the ContElemT alias can be removed, and the if condition can be rewritten as follows:

if(std::all_of(begin(container), end(container),
              [&](const auto& value)
               { return value % divisor == 0; }))
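A self-contained sketch of the C++14 form follows. The helper name allMultiplesOf is hypothetical, and divisor is passed in as a parameter here rather than computed from computeDivisor; the capture is also spelled out explicitly, in line with the advice above.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: true if every element of container is a multiple of divisor.
template<typename C>
bool allMultiplesOf(const C& container, int divisor)
{
    using std::begin;
    using std::end;

    // By-reference capture is safe here: the closure is used only inside
    // std::all_of and never outlives divisor.
    return std::all_of(begin(container), end(container),
                       [&divisor](const auto& value)      // C++14 generic lambda
                       { return value % divisor == 0; });
}
```

A call such as allMultiplesOf(std::vector<int>{10, 20, 35}, 5) needs no ContElemT alias, because the element type is deduced by const auto&.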

One way to solve the dangling problem with divisor is to use the default by-value capture mode. That is, we can add the lambda to filters like this:

filters.emplace_back(
	[=](int value) { return value % divisor == 0; }
);

For this example, that suffices. In general, though, default by-value capture isn't the anti-dangling elixir you might imagine. If you capture a pointer by value, you copy the pointer into the closures arising from the lambda, but you don't prevent code outside the lambda from deleting the pointer and leaving your copies dangling.
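The following sketch shows that a by-value capture of a pointer copies only the pointer. The names makeFilter and closureTracksPointee are made up for illustration, and the pointee is deliberately kept alive so that the example has defined behavior; if the pointee were destroyed first, the copy inside the closure would dangle.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: returns a closure that holds a copy of the pointer p,
// not a copy of the int that p points to.
std::function<bool(int)> makeFilter(const int* p)
{
    return [=](int value) { return value % *p == 0; };   // copies p only
}

// Returns true if the closure's result changes when the pointee changes.
bool closureTracksPointee()
{
    int divisor = 5;
    auto filter = makeFilter(&divisor);

    bool before = filter(10);    // 10 % 5 == 0, so true
    divisor = 3;                 // change the pointee, not the closure
    bool after = filter(10);     // 10 % 3 != 0, so false

    return before && !after;
}
```

The closure is not self-contained: its behavior follows whatever the pointer refers to at the time of the call.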

Suppose that one of the things a Widget can do is add an entry to the container of filters:

class Widget{
public:
    ...							//Constructor, etc
    void addFilter() const; 	//Add an entry to filters

private:
	int divisor;				// used in Widget's filter
};

Widget::addFilter could be defined like this:

void Widget::addFilter() const
{
    filters.emplace_back(
    [=](int value){ return value % divisor == 0; }
    );
}

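This code looks safe: the lambda depends on divisor, and the default by-value capture seems to copy divisor into the closure. That reading is wrong, completely wrong.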

Captures apply only to non-static local variables (including parameters) visible in the scope where the lambda is created. In the body of Widget::addFilter, divisor is not a local variable; it is a data member of the Widget class. It can't be captured.

Consequently, if the default capture mode is removed, the code won't compile:

void Widget::addFilter() const
{
    filters.emplace_back(
    [](int value){ return value % divisor == 0; }		// error! divisor not available
    );
}

Furthermore, an attempt to capture divisor explicitly (by value or by reference, it doesn't matter) won't compile either, because divisor is neither a local variable nor a parameter:

void Widget::addFilter() const
{
    filters.emplace_back(
    [divisor](int value){ return value % divisor == 0; }		// error! no local divisor to capture
    );
}

So why does the default by-value version compile? The explanation hinges on the implicit use of a raw pointer: this. Every non-static member function has a this pointer, and that pointer is used every time a data member of the class is mentioned. Inside any Widget member function, for example, compilers internally replace uses of divisor with this->divisor. Consider again the version of Widget::addFilter with a default by-value capture:

void Widget::addFilter() const
{
    filters.emplace_back(
    [=](int value){ return value % divisor == 0; }
    );
}

What is being captured is the Widget's this pointer, not divisor. Compilers treat the code as if it had been written as follows:

void Widget::addFilter() const
{
    auto currentObjectPtr = this;

    filters.emplace_back(
    [currentObjectPtr](int value)
        { return value % currentObjectPtr->divisor == 0; }
    );
}
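The effect of capturing this can be observed without invoking undefined behavior, as in the sketch below. The Widget here is a simplified stand-in with a hypothetical setDivisor member, and makeFilter returns the closure instead of adding it to filters. The capture is written as [this], which spells out what [=] captures implicitly in a member function (C++20 deprecates the implicit form).

```cpp
#include <cassert>
#include <functional>

class Widget {                        // simplified stand-in for the Widget above
public:
    explicit Widget(int d) : divisor(d) {}
    void setDivisor(int d) { divisor = d; }      // hypothetical, for the demo only

    // The closure stores a copy of the this pointer, not a copy of divisor.
    std::function<bool(int)> makeFilter() const
    {
        return [this](int value) { return value % divisor == 0; };
    }

private:
    int divisor;
};

// Returns true if the closure sees a later change to the Widget's divisor.
bool closureFollowsWidget()
{
    Widget w(5);
    auto filter = w.makeFilter();

    bool before = filter(10);         // reads w.divisor == 5, so true
    w.setDivisor(3);
    bool after = filter(10);          // reads w.divisor == 3, so false

    return before && !after;
}
```

Because the closure reads divisor through the pointer on every call, it is only valid while the Widget it points to is alive.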

Understanding this is tantamount to understanding that the viability of the closures arising from this lambda is tied to the lifetime of the Widget whose this pointer they hold a copy of. In particular, consider the following code, which, in keeping with Chapter 4, uses only smart pointers:

using FilterContainer = std::vector<std::function<bool(int)>>;

FilterContainer filters;

void doSomeWork()
{
    auto pw = std::make_unique<Widget>();
    
    pw->addFilter();		// add a filter that uses Widget::divisor
    
    ...						// Widget is destroyed; filters now holds a dangling pointer
}

When doSomeWork is called, a filter is created that depends on the Widget object produced by std::make_unique, that is, a filter that contains a copy of a pointer to that Widget (the Widget's this pointer). This filter is added to filters, but when doSomeWork finishes, the Widget is destroyed by the std::unique_ptr managing its lifetime. From that point on, filters contains an entry with a dangling pointer.

This particular problem can be solved by making a local copy of the data member you want to capture and then capturing the copy:

void Widget::addFilter() const
{
    auto divisorCopy = divisor;		//Copy member variables
    
    filters.emplace_back(									// capture the copy
    [divisorCopy](int value){ return value % divisorCopy == 0; }		// use the copy
    );
}

To be honest, if you take this approach, default by-value capture works too:

void Widget::addFilter() const
{
    auto divisorCopy = divisor;		//Copy member variables
    
    filters.emplace_back(									//Capture copy
    [=](int value){ return value % divisorCopy == 0; }		//Use copy
    );
}

In C++14, a better way to capture a data member is to use generalized lambda capture:

void Widget::addFilter() const
{
    filters.emplace_back(
    [divisor = divisor](int value){ return value % divisor == 0; }	//Copy the divisor into the closure
    );
}

There is no such thing as a default capture mode for a generalized lambda capture, so even in C++14 the advice of this Item, to avoid default capture modes, still stands.
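A minimal sketch of why the generalized capture is safe: the closure owns its own copy of divisor, so it remains usable after the Widget is gone. For testability, this version of addFilter takes the container as a parameter instead of using the global filters, and the helper makeFiltersFromTemporaryWidget is hypothetical. It requires C++14.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

using FilterContainer = std::vector<std::function<bool(int)>>;

class Widget {
public:
    explicit Widget(int d) : divisor(d) {}

    // Assumption for this sketch: the container is passed in explicitly.
    void addFilter(FilterContainer& filters) const
    {
        filters.emplace_back(
            [divisor = divisor](int value)        // copy divisor into the closure
            { return value % divisor == 0; });    // use the copy
    }

private:
    int divisor;
};

// Hypothetical helper: adds a filter from a Widget that is destroyed
// before the filter is ever called.
FilterContainer makeFiltersFromTemporaryWidget(int d)
{
    FilterContainer filters;
    {
        auto pw = std::make_unique<Widget>(d);
        pw->addFilter(filters);
    }                                  // Widget destroyed here
    return filters;                    // the closure still holds its own divisor
}
```

Calling the returned filter after the Widget's destruction is well defined, because nothing in the closure refers back to the Widget.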

An additional drawback of default by-value captures is that they can suggest that the corresponding closures are self-contained and insulated from changes to data outside the closures. In general, that's not true, because lambdas may depend not just on local variables and parameters (which can be captured) but also on objects with static storage duration. Such objects are defined at global or namespace scope, or are declared static inside classes, functions, or files. They can be used inside lambdas, but they can't be captured. Yet a default by-value capture mode can give the impression that they are. Consider this revised version of addDivisorFilter:

void addDivisorFilter()
{
    static auto calc1 = computeSomeValue1();		// now static
    static auto calc2 = computeSomeValue2();		// now static
    
    static auto divisor = computeDivisor(calc1, calc2);	// now static
    
    filters.emplace_back(								
    [=](int value) { return value % divisor == 0; }		// captures nothing; refers to the static divisor above
    );													
	
    ++divisor;				// modify divisor
}

A casual reader who sees [=] in this code could be forgiven for assuming that the lambda copies every object it uses and is therefore self-contained. It is not self-contained: the lambda uses no non-static local variables or parameters, so it captures nothing. Instead, its body refers to the static variable divisor. At the end of each call to addDivisorFilter, divisor is incremented, so every lambda added to filters through this function exhibits new behavior that corresponds to the new value of divisor.

In practical terms, this lambda captures divisor by reference, which directly contradicts what the default by-value capture clause seems to imply.
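The behavior described above can be observed directly. In this sketch the static divisor starts at an arbitrary value of 2 instead of coming from computeDivisor, and addDivisorFilter returns the value divisor had when the lambda was created; both are assumptions made so the effect can be checked.

```cpp
#include <cassert>
#include <functional>
#include <vector>

std::vector<std::function<bool(int)>> filters;

// Adds a filter and returns the value divisor had when the lambda was created.
int addDivisorFilter()
{
    static int divisor = 2;                 // hypothetical starting value
    int seenAtCreation = divisor;

    filters.emplace_back(
        [=](int value) { return value % divisor == 0; });   // captures nothing

    ++divisor;                              // every existing filter now sees the new value
    return seenAtCreation;
}
```

If d is the value returned by a call, the filter just added tests for multiples of d + 1, not d, and a later call changes the behavior of all earlier filters as well.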

Key points

  • Default by-reference capture can lead to dangling references.
  • Default by-value capture is susceptible to dangling pointers (especially this), and it misleadingly suggests that lambdas are self-contained.

Added by GamingWarrior on Wed, 09 Feb 2022 16:55:56 +0200