I want to create a C++11 thread that runs on my first core. I found that pthread_setaffinity_np and sched_setaffinity can change the CPU affinity of a thread and migrate it to the specified CPU. However, these calls change the affinity only after the thread has already started running.
How can I create a C++11 thread with a specific CPU affinity (a cpu_set_t object)?
If it is impossible to specify the affinity when initializing a C++11 thread, how can I do it with pthread_t in C?
My environment is G++ on Ubuntu. A piece of code would be appreciated.
3 solutions
#1
18
I am sorry to be the "myth buster" here, but setting thread affinity matters a great deal, and it matters more over time as the systems we all use become more and more NUMA (Non-Uniform Memory Access) by nature. Even a trivial dual-socket server these days has RAM connected separately to each socket, and the difference between a socket accessing its own RAM and accessing the RAM of the neighboring socket (remote RAM) is substantial. In the near future, processors will hit the market in which the internal set of cores is itself NUMA (separate memory controllers for separate groups of cores, etc.). There is no need for me to repeat the work of others here; just search for "NUMA and thread affinity" online and you can learn from the years of experience of other engineers.
Not setting thread affinity is effectively the same as "hoping" that the OS scheduler will handle thread placement correctly. Let me explain. You have a system with several NUMA nodes (processing and memory domains). You start a thread, and the thread does some work with memory, e.g. it mallocs some memory and then processes it. Modern OSes (at least Linux, probably others too) do a good job up to this point: by default, the memory is allocated (if available) from the same domain as the CPU on which the thread is running. In time, the time-sharing OS (every modern OS) will put the thread to sleep. When the thread becomes runnable again, it may be scheduled on any of the cores in the system (as you did not set an affinity mask for it), and the larger your system is, the higher the chance that it will be "woken up" on a CPU that is remote from the memory it previously allocated or used. From then on, all of its accesses to that memory are remote. (Not sure what this means for your application's performance? Read more about remote memory access on NUMA systems online.)
So, to summarize, affinity-setting interfaces are VERY important when running code on systems with a more-than-trivial architecture, which is rapidly becoming "any system" these days. Some thread runtime environments and libraries let you control this at run time without any specific programming (see OpenMP, for example the KMP_AFFINITY environment variable in Intel's implementation). It would be the right thing for C++11 implementers to include similar mechanisms in their runtime libraries and language options. Until then, if your code is intended for use on servers, I strongly recommend that you implement affinity control in your code.
#2
1
In C++11 you cannot set the thread affinity when the thread is created (unless the function being run in the thread does it on its own). Once the thread is created, however, you can set the affinity through whatever native interface you have by getting the thread's native handle (thread.native_handle()). On Linux you can get the pthread id via:
pthread_t my_thread_native = my_thread.native_handle();
Then you can use any of the pthread calls, passing in my_thread_native wherever a pthread thread id is expected.
Note that most thread facilities are implementation specific: pthreads, Windows threads, and the native threads of other OSes all have their own interfaces and types, so this portion of your code will not be very portable.
#3
-6
After searching for a while, it seems that we cannot set CPU affinity when we create a C++ thread.
The reason is that there is NO NEED to specify the affinity when creating a thread. So why bother making it possible in the language?
Say we want the workload f() to be bound to CPU0. We can just change the affinity to CPU0 right before the real workload by calling pthread_setaffinity_np.
However, we CAN specify the affinity when creating a thread with the C pthread API (thanks to the comment from Tony D). For example, the following code outputs "Hello pthread".
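#include <iostream>
#include <pthread.h>  // pthread_attr_setaffinity_np is a GNU extension (g++ defines _GNU_SOURCE)

void *f(void *p) {
    std::cout << "Hello pthread" << std::endl;
    return NULL;  // a function returning void* must return a value
}

int main() {
    // Build a CPU set that contains only CPU 0.
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(0, &cpuset);
    // Store the set in the thread attributes so it applies from creation.
    pthread_attr_t pta;
    pthread_attr_init(&pta);
    pthread_attr_setaffinity_np(&pta, sizeof(cpuset), &cpuset);
    pthread_t thread;
    if (pthread_create(&thread, &pta, f, NULL) != 0) {
        std::cerr << "Error in creating thread" << std::endl;
        pthread_attr_destroy(&pta);
        return 1;  // do not join a thread that was never created
    }
    pthread_join(thread, NULL);
    pthread_attr_destroy(&pta);
    return 0;
}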