C++ lambda does not capture a variable in the second expansion inside a template?

Time: 2022-11-25 18:59:25

I have some tortuous code in a template that uses @R. Martinho Fernandes's trick to expand a parameter pack in a variadic template and invoke the same code on each argument in the argument list.

However, it seems that the lambdas are not being initialized properly: they appear to share variables across functor(?) instances, which seems wrong.

Given this code:

#include <iostream>
#include <functional>

template<typename... Args>
void foo(Args ... args) {
  int * bar = new int();
  *bar = 42;

  using expand_type = int[];
  expand_type{(
    args([bar]() {
        std::cerr<<std::hex;
        std::cerr<<"&bar="<<(void*)&bar<<std::endl;
        std::cerr<<"  bar="<<(void*)bar<<std::endl;
        std::cerr<<"  bar="<<*bar<<std::endl<<std::endl;
    }),
    0) ... 
  };
};

int main() {
  std::function<void(std::function<void()>)> clbk_func_invoker = [](std::function<void()> f) { f(); };
  foo(clbk_func_invoker, clbk_func_invoker);

  return 0;
}

I get the following output:

&bar=0x7ffd22a2b5b0
  bar=0x971c20
  bar=2a

&bar=0x7ffd22a2b5b0
  bar=0
Segmentation fault (core dumped)

So what I believe I'm seeing is that the two functor instances share the same address for the captured variable bar. After the first functor is invoked, bar is set to nullptr, and the second functor then segfaults when it tries to dereference the same bar variable (at the exact same address).

FYI, I realize that I can work around this issue by moving the [bar](){... functor into a std::function variable and then capturing that variable. However, I would like to understand why the second functor instance uses the exact same bar address and why it gets a nullptr value.
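For readers who want to see that workaround spelled out, here is a minimal sketch. The names invoke_each, on_call and run_workaround are hypothetical and not from the original code, and the callback body is reduced to a check and a counter so that the result can be verified. The only point is that the lambda is created once, before the pack expansion, so the expansion pattern refers to a named variable and contains no lambda-expression.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: calls every element of the pack with the same callback.
template <typename... Args>
void invoke_each(int* bar, int& calls, Args... args) {
  // The lambda is created once, outside the pack expansion ...
  std::function<void()> on_call = [bar, &calls]() {
    assert(*bar == 42);  // every invocation sees the same valid pointer
    ++calls;
  };

  // ... so the expansion pattern only names an existing variable.
  // The leading 0 keeps the array non-empty for an empty pack.
  using expand_type = int[];
  (void)expand_type{0, (args(on_call), 0)...};
}

// Hypothetical driver: returns how many times the callback ran.
inline int run_workaround() {
  int value = 42;
  int calls = 0;
  std::function<void(std::function<void()>)> invoker =
      [](std::function<void()> f) { f(); };
  invoke_each(&value, calls, invoker, invoker);
  return calls;
}
```

Calling run_workaround() invokes the callback once per pack element, and each invocation dereferences the same valid pointer.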

I ran this with GNU's g++, using their trunk version retrieved and compiled yesterday.

3 answers

#1


2  

Parameter packs with lambdas in them tend to give compilers fits. One way to avoid that is to separate the expansion part from the lambda part.

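#include <utility>  // std::forward

// Returns a callable that invokes f once on each of its arguments, left to right.
template<class F>
auto for_each_arg( F&& f ) {
  return [f=std::forward<F>(f)](auto&&...args){
    using expand_type = int[];
    // The leading 0 keeps the array non-empty when args is an empty pack.
    (void)expand_type{0,(void(
      f(decltype(args)(args))
    ),0)...};
  };
}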

This takes a callable f (here, a lambda) and returns a function object that invokes f on each of its arguments.
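To make the behaviour of for_each_arg concrete, here is a small self-contained sketch. The helper join_digits is hypothetical and exists only for illustration: it appends the decimal text of each argument to a string, which also shows that the arguments are visited left to right.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Same shape as the helper described above.
template <class F>
auto for_each_arg(F&& f) {
  return [f = std::forward<F>(f)](auto&&... args) {
    using expand_type = int[];
    (void)expand_type{0, (void(f(decltype(args)(args))), 0)...};
  };
}

// Hypothetical usage: append each argument's decimal text to one string.
template <class... Ts>
std::string join_digits(Ts... values) {
  std::string out;
  for_each_arg([&out](auto&& v) { out += std::to_string(v); })(values...);
  return out;
}
```

The per-argument lambda is written once and passed to for_each_arg as an ordinary argument, so no lambda-expression appears inside the pack expansion that for_each_arg performs.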

We can then rewrite foo to use it:

template<typename... Args>
void foo(Args ... args) {
  int * bar = new int();
  *bar = 42;

  for_each_arg( [bar](auto&& f){
    f( [bar]() {
      std::cerr<<std::hex;
      std::cerr<<"&bar="<<(void*)&bar<<std::endl;
      std::cerr<<"  bar="<<(void*)bar<<std::endl;
      std::cerr<<"  bar="<<*bar<<std::endl<<std::endl;
    } );
  } )
  ( std::forward<Args>(args)... );
}

live example.

I initially thought it had to do with the std::function constructor. It does not. Here is a simpler example without std::function that crashes the same way:

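#include <cstddef>   // std::size_t
#include <iostream>
#include <utility>   // std::index_sequence, std::make_index_sequence

template<std::size_t...Is>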
void foo(std::index_sequence<Is...>) {
  int * bar = new int();
  *bar = 42;

  using expand_type = int[];
  expand_type{(
    ([bar]() {
      std::cerr<<"bar="<<*bar<<'\n';
    })(),
    (int)Is) ... 
  };
}

int main() {
  foo(std::make_index_sequence<2>{});

  return 0;
}

We can trigger the segfault without the cerr, which gives us disassembly that is easier to read:

void foo<3, 0ul, 1ul>(std::integer_sequence<unsigned long, 0ul, 1ul>)::{lambda()#1}::operator()() const:
    pushq   %rbp
    movq    %rsp, %rbp
    movq    %rdi, -8(%rbp)
    movq    -8(%rbp), %rax
    movq    (%rax), %rax
    movl    $3, (%rax)
    nop
    popq    %rbp
    ret
void foo<3, 0ul, 1ul>(std::integer_sequence<unsigned long, 0ul, 1ul>):
    pushq   %rbp
    movq    %rsp, %rbp
    pushq   %rbx
    subq    $40, %rsp
    movl    $4, %edi
    call    operator new(unsigned long)
    movl    $0, (%rax)
    movq    %rax, -24(%rbp)
    movq    -24(%rbp), %rax
    movl    $42, (%rax)
    movq    -24(%rbp), %rax
    movq    %rax, -48(%rbp)
    leaq    -48(%rbp), %rax
    movq    %rax, %rdi
    call    void foo<3, 0ul, 1ul>(std::integer_sequence<unsigned long, 0ul, 1ul>)::{lambda()#1}::operator()() const
    movabsq $-4294967296, %rax
    andq    %rbx, %rax
    movq    %rax, %rbx
    movq    $0, -32(%rbp)
    leaq    -32(%rbp), %rax
    movq    %rax, %rdi
    call    void foo<3, 0ul, 1ul>(std::integer_sequence<unsigned long, 0ul, 1ul>)::{lambda()#1}::operator()() const
    movl    %ebx, %edx
    movabsq $4294967296, %rax
    orq     %rdx, %rax
    movq    %rax, %rbx
    nop
    addq    $40, %rsp
    popq    %rbx
    popq    %rbp
    ret

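I have yet to parse the disassembly fully, but it obviously trashes the state of the second lambda when playing with the first: the first closure, at -48(%rbp), is initialized from bar, while the second closure, at -32(%rbp), is initialized with 0 (movq $0, -32(%rbp)) instead of a copy of bar.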

#2


2  

First of all, I don't have a solution. I wanted to add this extra information as a comment, but unfortunately I can't comment yet.

I tried your previous code with the Intel 17 C++ compiler and it worked fine:

&bar=0x7fff29e40c50
  bar=0x616c20
  bar=2a

&bar=0x7fff29e40c50
  bar=0x616c20
  bar=2a

In some runs &bar (the address of the new variable used to store the captured value) differed between the first call and the second, but it still worked.

I also tried your code with GNU's g++ after changing the type of bar from int* to int. Even in this case, the captured value was wrong in the second and subsequent calls:

&bar=0x7fffeae12480
  bar=2a
&bar=0x7fffeae12480
  bar=0
&bar=0x7fffeae12480
  bar=0

Finally, I modified the code a bit to capture an object by value, so that the copy constructor must be called:

#include <iostream>
#include <functional>

struct  A {
    A(int x) : _x(x) { 
      std::cerr << "Constructor!" << n++ << std::endl;
    }
    A(const A& a) : _x(a._x) {
      std::cerr << "Copy Constructor!"  << n++ << std::endl;
    }
    static int n;
    int _x;  
};

int A::n = 0;

template<typename... Args>
void foo(Args ... args) {
  A a(42);

  std::cerr << "-------------------------------------------------" << std::endl;
  using expand_type = int[];
  expand_type  {
   (args( [a]() {
          std::cerr << "&a, "<< &a << ", a._x," << a._x << std::endl;
         }
       ),
    0) ... 
  };
std::cerr << "-------------------------------------------------" << std::endl;
}

int main() {
  std::function<void(std::function<void()>)> clbk_func_invoker = [](std::function<void()> f) { f(); };
  foo(clbk_func_invoker, clbk_func_invoker, clbk_func_invoker);
  return 0;
}

My current version of g++ (g++ (GCC) 6.1.0) is not able to compile this code. I also tried it with Intel and it worked, although I don't fully understand why the copy constructor is called so many times:

Constructor!0
-------------------------------------------------
Copy Constructor!1
Copy Constructor!2
Copy Constructor!3
&a, 0x617c20, a._x,42
Copy Constructor!4
Copy Constructor!5
Copy Constructor!6
&a, 0x617c20, a._x,42
Copy Constructor!7
Copy Constructor!8
Copy Constructor!9
&a, 0x617c20, a._x,42
-------------------------------------------------

That's all I have tested so far.
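On the copy-constructor count above, part of it can be accounted for with a small experiment. The sketch below uses a hypothetical probe type (CopyProbe, CopyCounts and count_copies are not from the original code): a by-value capture copies the object into the closure, and wrapping that closure in a std::function copies the closure, and with it the captured object, at least once more. How many additional copies the standard library makes internally is implementation-dependent, so the sketch only reports running totals.

```cpp
#include <cassert>
#include <functional>

// Hypothetical probe type that counts copy constructions.
struct CopyProbe {
  static int copies;
  CopyProbe() = default;
  CopyProbe(const CopyProbe&) { ++copies; }
};
int CopyProbe::copies = 0;

struct CopyCounts {
  int after_capture;   // copies made by the by-value capture
  int after_wrapping;  // running total once the closure is inside std::function
};

inline CopyCounts count_copies() {
  CopyProbe::copies = 0;
  CopyProbe probe;
  auto closure = [probe]() { (void)probe; };  // capture by copy
  const int after_capture = CopyProbe::copies;
  std::function<void()> wrapped = closure;    // copies the closure, and thus probe
  (void)wrapped;
  return {after_capture, CopyProbe::copies};
}
```

In the program above, each pack element passes a fresh closure to a std::function<void()> parameter, so several copies per element are plausible. The exact figure of three per element is what that particular compiler and library produced, not something this sketch predicts.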

#3


1  

After a few tests I found that this is all about the evaluation of the lambdas, not the pack expansion.

What you have is a set of lambdas that do not execute until the pack expansion is complete, so at the moment of execution all of them observe the same instance of the variables. Things are different if each lambda is executed in the order of the expansion: each expansion then gets its own copy of the variable, and the lambda is treated as a materialized prvalue whose lifetime has already ended:

template<typename... Args>
void foo(Args ... args) {
    int * bar = new int();
    *bar = 42;

    using expand_type = int[];
    expand_type{( args([bar]{
       std::cerr<<std::hex;
       std::cerr<<"&bar="<<(void*)&bar<<std::endl;
       std::cerr<<"  bar="<<(void*)bar<<std::endl;
       std::cerr<<"  bar="<<*bar<<std::endl<<std::endl;
       return 0;
    }()),0) ...
  };
};

int main() {
    std::function<void(int)> clbk_func_invoker = [](int) {  };    
    foo(clbk_func_invoker, clbk_func_invoker);    
    return 0;
}

live example.
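As background for the ordering argument, the language guarantees that the initializer-clauses of a braced-init-list are evaluated in the order in which they appear, and that includes clauses produced by a pack expansion. The sketch below uses a hypothetical helper, visit_order, that records the order in which the elements of the expansion are evaluated.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: records the order in which the pack elements are visited.
template <typename... Ts>
std::vector<int> visit_order(Ts... ids) {
  std::vector<int> order;
  using expand_type = int[];
  // Each element of the braced list is evaluated before the next one starts.
  (void)expand_type{0, (order.push_back(ids), 0)...};
  return order;
}
```

This only demonstrates the ordering of the expanded elements. It says nothing about how a particular compiler lays out or initializes the closure objects created inside each element.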

However, for trivial classes captured by copy, the compiler is able to do a small optimization even when the lambdas are expanded but not executed during the expansion.

Let's take a simpler example:

#include <iostream>

struct A{ };

template<class... T>
auto foo(T... args){
  A a;
  std::cout<< &a << std::endl;
  using expand = int[];

  expand{ 0,(args([a] {
      std::cout << &a << " " << std::endl; return 0; }),void(),0)...
  };
}

int main() {
  foo([](auto i){ i(); }, [](auto i){  i(); });
}

This outputs the same address of a for every expanded lambda, even though individual copies of a are expected. Capture by copy produces a const version of the copied variable, so these copies cannot be mutated. For trivial classes, sharing the same instance across all expanded lambdas is therefore a kind of performance optimization (because it is guaranteed that nothing changes).

But if the type is not trivial, that optimization no longer applies and a separate copy is required for each expanded lambda:

 struct A{ 
    A() = default;
    A(const A&){}
 };

With this change to A, different addresses for a appear in the output.
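One caveat when reading address output like this, offered as an assumption rather than a diagnosis: the closures created inside the expansion are temporaries, and if an earlier one has already been destroyed the next one may simply be placed at the same stack location. To compare captured copies reliably, the closures can be kept alive at the same time. The sketch below uses hypothetical names (Empty, copies_are_distinct) and a trivial type like the first A above.

```cpp
#include <cassert>

struct Empty {};  // trivial type, like the first A above

// Hypothetical check: two closures that are alive at the same time each hold
// their own copy of the captured object.
inline bool copies_are_distinct() {
  Empty a;
  auto first = [a]() { return static_cast<const void*>(&a); };
  auto second = [a]() { return static_cast<const void*>(&a); };
  // Inside each lambda body, &a names that closure's own captured copy.
  return first() != second() && first() != static_cast<const void*>(&a);
}
```

Because first and second are distinct objects whose lifetimes overlap, their captured copies cannot share an address, whether or not the type is trivial.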
