What does thread_local mean in C++11?

Time: 2022-06-17 21:05:26

I am confused by the description of thread_local in C++11. My understanding is that each thread has its own copy of a function's local variables, while global/static variables can be accessed by all threads (possibly with access synchronized using locks). Are thread_local variables visible to all threads but modifiable only by the thread for which they are defined? Is that correct?

3 Answers

#1


78  

Thread-local storage duration refers to data that appears to have global or static storage duration (from the viewpoint of the functions using it) but of which there is, in fact, one copy per thread.

It joins the existing storage durations: automatic (exists during a block/function), static (exists for the duration of the program) and dynamic (exists on the heap between allocation and deallocation).

Something that is thread-local is brought into existence at thread creation and disposed of when the thread stops.
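
To make the contrast with static storage duration concrete, here is a minimal sketch. The names shared_calls, local_calls, bump and bump_twice_on_new_thread are made up for illustration; the shared counter is atomic only so that concurrent increments would be safe.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> shared_calls{0};  // static storage duration: one object for the whole program
thread_local int local_calls = 0;  // thread storage duration: one object per thread

// Hypothetical helper: bumps both counters and returns the calling thread's own count.
int bump() {
    ++shared_calls;
    return ++local_calls;
}

// Calls bump() twice on a fresh thread and returns what that thread saw the second time.
int bump_twice_on_new_thread() {
    int seen = 0;
    std::thread t([&seen] { bump(); seen = bump(); });
    t.join();
    return seen;
}
```

Every fresh thread sees its own count start from zero, while shared_calls keeps accumulating across threads; the calling thread's local_calls is untouched by what the other threads did.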

Some examples follow.

Think of a random number generator where the seed must be maintained on a per-thread basis. Using a thread-local seed means that each thread gets its own random number sequence, independent of other threads.

If your seed was a local variable within the random function, it would be initialised every time you called it, giving you the same number each time. If it was a global, threads would interfere with each other's sequences.
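
A rough sketch of such a generator, assuming a simple linear congruential generator is good enough for the illustration. next_random and draw_on_new_thread are hypothetical names, and every thread deliberately starts from the same fixed seed so the behaviour is reproducible; real code would probably seed each thread differently.

```cpp
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical generator whose state lives in thread-local storage.
std::uint32_t next_random() {
    thread_local std::uint32_t seed = 12345u;  // initialised once per thread, kept between calls
    seed = seed * 1664525u + 1013904223u;      // LCG step (constants from Numerical Recipes)
    return seed;
}

// Draws n numbers on a fresh thread so the result depends only on that thread's seed.
std::vector<std::uint32_t> draw_on_new_thread(int n) {
    std::vector<std::uint32_t> out;
    std::thread t([&out, n] {
        for (int k = 0; k < n; ++k) out.push_back(next_random());
    });
    t.join();
    return out;
}
```

Two threads that each draw three numbers get the same three-number sequence, because neither one advances the other's seed. With a plain global seed, the second thread would continue where the first left off.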

Another example is something like strtok, where the tokenisation state is stored on a thread-specific basis. That way, a single thread can be sure that other threads won't disturb its tokenisation, while still being able to maintain state across multiple calls to strtok. This basically renders strtok_r (the thread-safe version) redundant.
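
As an illustration only, here is a strtok-like helper that keeps its position in thread-local variables. next_token and split_on_new_thread are hypothetical names, not the real strtok, and the caller is assumed to keep the input string alive while tokenising.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Pass a string to start tokenising it, then nullptr to continue (as with strtok).
// Returns false when no tokens remain. The state is per-thread.
bool next_token(const std::string* input, char delim, std::string& out) {
    thread_local const std::string* text = nullptr;
    thread_local std::size_t pos = 0;
    if (input) { text = input; pos = 0; }
    if (!text) return false;
    while (pos < text->size() && (*text)[pos] == delim) ++pos;  // skip delimiters
    if (pos >= text->size()) return false;
    std::size_t end = text->find(delim, pos);
    if (end == std::string::npos) end = text->size();
    out = text->substr(pos, end - pos);
    pos = end;
    return true;
}

// Tokenises text completely on a fresh thread.
std::vector<std::string> split_on_new_thread(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::thread worker([&] {
        std::string token;
        for (bool ok = next_token(&text, delim, token); ok;
             ok = next_token(nullptr, delim, token))
            parts.push_back(token);
    });
    worker.join();
    return parts;
}
```

A thread that is halfway through one string can let another thread tokenise a different string and then carry on where it left off, since each thread has its own text and pos.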

Both these examples allow for the thread local variable to exist within the function that uses it. In pre-threaded code, it would simply be a static storage duration variable within the function. For threads, that's modified to thread local storage duration.

Yet another example would be something like errno. You don't want other threads modifying errno after one of your calls fails but before you can check the variable, so each thread needs its own copy.
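
The same idea can be sketched with a home-made error code. last_error, checked_op and the value 22 are invented for this example and are not tied to any real library.

```cpp
#include <thread>

thread_local int last_error = 0;  // hypothetical errno-style code, one per thread

// Hypothetical operation: fails for negative input and records why.
bool checked_op(int value) {
    if (value < 0) {
        last_error = 22;  // arbitrary illustrative error code
        return false;
    }
    return true;
}

// Triggers a failure on a fresh thread and returns the error code that thread observed.
int error_after_failure_on_new_thread() {
    int seen = 0;
    std::thread t([&seen] { checked_op(-1); seen = last_error; });
    t.join();
    return seen;
}
```

The failing thread sees its own error code, and the calling thread's last_error stays at zero, so a failure elsewhere cannot overwrite a code you have not checked yet.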

This site has a reasonable description of the different storage duration specifiers.

#2


77  

When you declare a variable thread_local, each thread has its own copy. When you refer to it by name, the copy associated with the current thread is used. For example:

#include <iostream>
#include <thread>

thread_local int i=0;

void f(int newval){
    i=newval;
}

void g(){
    std::cout<<i;
}

void threadfunc(int id){
    f(id);
    ++i;
    g();
}

int main(){
    i=9;
    std::thread t1(threadfunc,1);
    std::thread t2(threadfunc,2);
    std::thread t3(threadfunc,3);

    t1.join();
    t2.join();
    t3.join();
    std::cout<<i<<std::endl;
}

This code will output "2349", "3249", "4239", "4329", "2439" or "3429", but never anything else. Each thread has its own copy of i, which is assigned to, incremented and then printed. The thread running main also has its own copy, which is assigned to at the beginning and then left unchanged. These copies are entirely independent, and each has a different address.
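
The claim about addresses can be checked with a small sketch. value, address_of_my_copy and address_on_new_thread are hypothetical names, and the address is converted to an integer inside the owning thread so that nothing refers to the object after that thread exits.

```cpp
#include <cstdint>
#include <thread>

thread_local int value = 0;

// Returns the address of the calling thread's copy of value, as an integer.
std::uintptr_t address_of_my_copy() {
    return reinterpret_cast<std::uintptr_t>(&value);
}

// Returns the address that a fresh thread sees for its own copy.
std::uintptr_t address_on_new_thread() {
    std::uintptr_t addr = 0;
    std::thread t([&addr] { addr = address_of_my_copy(); });
    t.join();
    return addr;
}
```

While both threads are alive, the two copies are distinct objects, so the address obtained on the new thread differs from the one obtained on the calling thread.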

Only the name is special in that respect: if you take the address of a thread_local variable, you just have a normal pointer to a normal object, which you can freely pass between threads. For example:

#include <iostream>
#include <thread>

thread_local int i=0;

void thread_func(int*p){
    *p=42;
}

int main(){
    i=9;
    std::thread t(thread_func,&i);
    t.join();
    std::cout<<i<<std::endl;
}

Since the address of i is passed to the thread function, the copy of i belonging to the main thread can be assigned to even though it is thread_local. This program will thus output "42". If you do this, take care that *p is not accessed after the thread it belongs to has exited; otherwise you have a dangling pointer and undefined behaviour, just as in any other case where the pointed-to object has been destroyed.

thread_local variables are initialized "before first use", so if a given thread never touches them, they are not necessarily ever initialized for that thread. This allows compilers to avoid constructing every thread_local variable in the program for a thread that is entirely self-contained and doesn't touch any of them. For example:

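#include <iostream>
#include <thread>

struct my_class{
    my_class(){
        std::cout<<"hello";
    }
    ~my_class(){
        std::cout<<"goodbye";
    }
};

void f(){
    thread_local my_class unused;  // the variable needs a name to be a valid declaration
}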

void do_nothing(){}

int main(){
    std::thread t1(do_nothing);
    t1.join();
}

In this program there are 2 threads: the main thread and the manually-created thread. Neither thread calls f, so the thread_local object is never used. It is therefore unspecified whether the compiler will construct 0, 1 or 2 instances of my_class, and the output may be "", "hellohellogoodbyegoodbye" or "hellogoodbye".

#3


15  

Thread-local storage is in every respect like static (= global) storage, except that each thread has a separate copy of the object. The object's lifetime starts either at thread start (for global variables) or at first initialization (for block-local statics), and ends when the thread exits. Calling join() does not end the thread; it only waits until the thread has finished.
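
The lifetime of a block-local thread_local object can be observed with a pair of counters. This is a sketch with invented names (Tracker, touch, run_thread); the counters are atomic because they are shared between threads.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> constructed{0};
std::atomic<int> destroyed{0};

struct Tracker {
    Tracker() { ++constructed; }
    ~Tracker() { ++destroyed; }
};

void touch() {
    thread_local Tracker tracker;  // constructed the first time this thread reaches this line
    (void)tracker;                 // silence unused-variable warnings
}

// Starts a thread that calls touch() the given number of times, then waits for it.
void run_thread(int calls) {
    std::thread t([calls] {
        for (int k = 0; k < calls; ++k) touch();
    });
    t.join();  // by the time join() returns, the thread and its Tracker are gone
}
```

A thread that calls touch() several times still constructs only one Tracker, and that Tracker is destroyed when the thread exits, which is why both counters have advanced by exactly one after run_thread returns.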

Consequently, only variables that could also be declared static may be declared as thread_local, i.e. global variables (more precisely: variables "at namespace scope"), static class members, and block-static variables (in which case static is implied).
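
The three permitted places can be shown side by side. All names here (at_namespace_scope, Widget, per_thread_count, next_id, touch_all_on_new_thread) are made up for the illustration.

```cpp
#include <thread>

thread_local int at_namespace_scope = 0;        // namespace scope

struct Widget {
    static thread_local int per_thread_count;   // static data member
};
thread_local int Widget::per_thread_count = 0;  // thread_local is repeated on the definition

int next_id() {
    thread_local int id = 0;                    // block scope; static is implied
    return ++id;
}

// Modifies all three on a fresh thread and returns their sum as seen by that thread.
int touch_all_on_new_thread() {
    int sum = 0;
    std::thread t([&sum] {
        at_namespace_scope = 10;
        Widget::per_thread_count = 20;
        sum = at_namespace_scope + Widget::per_thread_count + next_id();
    });
    t.join();
    return sum;
}
```

The writes made on the new thread affect only that thread's copies; the calling thread still sees the initial values of all three variables.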

As an example, suppose you have a thread pool and want to know how well your work load was being balanced:

#include <thread>

thread_local Counter c;   // Counter (defined below) must be declared before this line

void do_work()
{
    c.increment();
    // ...
}

int main()
{
    std::thread t(do_work);   // your thread-pool would go here
    t.join();
}

This would print thread usage statistics, e.g. with an implementation like this:

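#include <iostream>
#include <thread>

struct Counter
{
    unsigned int c = 0;
    void increment() { ++c; }
    ~Counter()
    {
        // std::this_thread::get_id() returns the id of the current thread
        std::cout << "Thread #" << std::this_thread::get_id() << " was called "
                  << c << " times" << std::endl;
    }
};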
