I'm using CentOS 5.4 x86_64 and Boost 1.42.0 on a cluster that uses Open-MPI 1.3.3. I'm writing a shared library that uses shared memory to store large amounts of data for multiple processes to use. There's also a loader application that will read in the data from the files and load them into the shared memory.
When I run the loader application, it determines exactly how much memory it needs to store the data and then adds 25% for overhead. For just about every file, that comes to over 2 GB of data. When I make the memory request using Boost's Interprocess library, it says it has successfully reserved the requested amount of memory. But when I start to use it, I get a "Bus error". From what I can tell, the bus error is the result of accessing memory outside the range that is available to the memory segment.
So I started looking into how shared memory works on Linux and what to check to make sure my system is correctly configured to allow that much shared memory.
- I looked at the "files" at /proc/sys/kernel/shm*:
  - shmall - 4294967296 (4 Gb)
  - shmmax - 68719476736 (68 Gb)
  - shmmni - 4096
- I called the ipcs -lm command (the equivalent sysctl check is shown after this list):

  ------ Shared Memory Limits --------
  max number of segments = 4096
  max seg size (kbytes) = 67108864
  max total shared memory (kbytes) = 17179869184
  min seg size (bytes) = 1
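For reference, the same kernel settings can also be read in one go with sysctl (the numbers above came straight from the /proc files, but this should report the same values):

sysctl kernel.shmmax kernel.shmall kernel.shmmni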
From what I can tell, those settings indicate that I should be able to allocate enough shared memory for my purposes. So I created a stripped down program that created large amounts of data in shared memory:
#include <iostream>

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/containers/vector.hpp>

namespace bip = boost::interprocess;

typedef bip::managed_shared_memory::segment_manager segment_manager_t;
typedef bip::allocator<long, segment_manager_t> long_allocator;
typedef bip::vector<long, long_allocator> long_vector;

int main(int argc, char ** argv) {
    // Remove any stale segment on startup and clean up on exit.
    struct shm_remove {
        shm_remove() { bip::shared_memory_object::remove("ShmTest"); }
        ~shm_remove() { bip::shared_memory_object::remove("ShmTest"); }
    } remover;

    size_t szLength = 280000000;
    size_t szRequired = szLength * sizeof(long);
    size_t szRequested = (size_t) (szRequired * 1.05); // 5% extra for segment overhead
    bip::managed_shared_memory segment(bip::create_only, "ShmTest", szRequested);

    std::cout <<
        "Length: " << szLength << "\n" <<
        "sizeof(long): " << sizeof(long) << "\n" <<
        "Required: " << szRequired << "\n" <<
        "Requested: " << szRequested << "\n" <<
        "Allocated: " << segment.get_size() << "\n" <<
        "Overhead: " << segment.get_size() - segment.get_free_memory() << "\n" <<
        "Free: " << segment.get_free_memory() << "\n\n";

    long_allocator alloc(segment.get_segment_manager());
    long_vector vector(alloc);

    // Passing any command-line argument preallocates the full vector up front.
    if (argc > 1) {
        std::cout << "Reserving Length of " << szLength << "\n";
        vector.reserve(szLength);
        std::cout << "Vector Capacity: " << vector.capacity() << "\tFree: " << segment.get_free_memory() << "\n\n";
    }

    for (size_t i = 0; i < szLength; i++) {
        if ((i % (szLength / 100)) == 0) {
            std::cout << i << ": " << "\tVector Capacity: " << vector.capacity() << "\tFree: " << segment.get_free_memory() << "\n";
        }
        vector.push_back(i);
    }
    std::cout << "end: " << "\tVector Capacity: " << vector.capacity() << "\tFree: " << segment.get_free_memory() << "\n";

    return 0;
}
Compiled it with the line:
g++ ShmTest.cpp -lboost_system -lrt
Then ran it with the following output (edited to make it smaller):
Length: 280000000
sizeof(long): 8
Required: 2240000000
Requested: 2352000000
Allocated: 2352000000
Overhead: 224
Free: 2351999776

0: Vector Capacity: 0 Free: 2351999776
2800000: Vector Capacity: 3343205 Free: 2325254128
5600000: Vector Capacity: 8558607 Free: 2283530912
8400000: Vector Capacity: 8558607 Free: 2283530912
11200000: Vector Capacity: 13693771 Free: 2242449600
14000000: Vector Capacity: 21910035 Free: 2176719488
...
19600000: Vector Capacity: 21910035 Free: 2176719488
22400000: Vector Capacity: 35056057 Free: 2071551312
...
33600000: Vector Capacity: 35056057 Free: 2071551312
36400000: Vector Capacity: 56089691 Free: 1903282240
...
56000000: Vector Capacity: 56089691 Free: 1903282240
58800000: Vector Capacity: 89743507 Free: 1634051712
...
89600000: Vector Capacity: 89743507 Free: 1634051712
92400000: Vector Capacity: 143589611 Free: 1203282880
...
142800000: Vector Capacity: 143589611 Free: 1203282880
145600000: Vector Capacity: 215384417 Free: 628924432
...
212800000: Vector Capacity: 215384417 Free: 628924432
215600000: Vector Capacity: 293999969 Free: 16
...
260400000: Vector Capacity: 293999969 Free: 16
Bus error
If you run the program with a parameter (anything will work, it just needs to increase argc), it preallocates the vector but still results in a bus error at the same array index.
I checked the size of the "files" at /dev/shm using the ls -ash /dev/shm command:
total 2.0G
   0 .
   0 ..
2.0G ShmTest
And just like with my original application, the size of the allocated shared memory is capped at 2 GB. Given that it "successfully" allocated 2352000000 bytes of memory, it should be 2.19 GB (using 1024*1024*1024 bytes per GB).
When I run my actual program to load data using MPI, I get this error output:
Requested: 2808771120 Recieved: 2808771120
[c1-master:13894] *** Process received signal ***
[c1-master:13894] Signal: Bus error (7)
[c1-master:13894] Signal code: (2)
[c1-master:13894] Failing at address: 0x2b3190157000
[c1-master:13894] [ 0] /lib64/libpthread.so.0 [0x3a64e0e7c0]
[c1-master:13894] [ 1] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost12interprocess26uninitialized_copy_or_moveINS0_10offset_ptrIlEEPlEET0_T_S6_S5_PNS_10disable_ifINS0_11move_detail16is_move_iteratorIS6_EEvE4typeE+0x218) [0x2b310dcf3fb8]
[c1-master:13894] [ 2] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost9container6vectorIlNS_12interprocess9allocatorIlNS2_15segment_managerIcNS2_15rbtree_best_fitINS2_12mutex_familyENS2_10offset_ptrIvEELm0EEENS2_10iset_indexEEEEEE15priv_assign_auxINS7_IlEEEEvT_SG_St20forward_iterator_tag+0xa75) [0x2b310dd0a335]
[c1-master:13894] [ 3] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost9container17containers_detail25advanced_insert_aux_proxyINS0_6vectorIlNS_12interprocess9allocatorIlNS4_15segment_managerIcNS4_15rbtree_best_fitINS4_12mutex_familyENS4_10offset_ptrIvEELm0EEENS4_10iset_indexEEEEEEENS0_17constant_iteratorISF_lEEPSF_E25uninitialized_copy_all_toESI_+0x1d7) [0x2b310dd0b817]
[c1-master:13894] [ 4] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost9container6vectorINS1_IlNS_12interprocess9allocatorIlNS2_15segment_managerIcNS2_15rbtree_best_fitINS2_12mutex_familyENS2_10offset_ptrIvEELm0EEENS2_10iset_indexEEEEEEENS3_ISD_SB_EEE17priv_range_insertENS7_ISD_EEmRNS0_17containers_detail23advanced_insert_aux_intISD_PSD_EE+0x771) [0x2b310dd0d521]
[c1-master:13894] [ 5] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost12interprocess6detail8Ctor3ArgINS_9container6vectorINS4_IlNS0_9allocatorIlNS0_15segment_managerIcNS0_15rbtree_best_fitINS0_12mutex_familyENS0_10offset_ptrIvEELm0EEENS0_10iset_indexEEEEEEENS5_ISF_SD_EEEELb0EiSF_NS5_IvSD_EEE11construct_nEPvmRm+0x157) [0x2b310dd0d9a7]
[c1-master:13894] [ 6] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost12interprocess15segment_managerIcNS0_15rbtree_best_fitINS0_12mutex_familyENS0_10offset_ptrIvEELm0EEENS0_10iset_indexEE28priv_generic_named_constructIcEEPvmPKT_mbbRNS0_6detail18in_place_interfaceERNS7_INSE_12index_configISB_S6_EEEENSE_5bool_ILb1EEE+0x6fd) [0x2b310dd0c85d]
[c1-master:13894] [ 7] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost12interprocess15segment_managerIcNS0_15rbtree_best_fitINS0_12mutex_familyENS0_10offset_ptrIvEELm0EEENS0_10iset_indexEE22priv_generic_constructEPKcmbbRNS0_6detail18in_place_interfaceE+0xf8) [0x2b310dd0dd58]
[c1-master:13894] [ 8] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN7POP_LTL16ExportPopulation22InitializeSharedMemoryEPKc+0x1609) [0x2b310dceea99]
[c1-master:13894] [ 9] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN7POP_LTL10InitializeEPKc+0x349) [0x2b310dd0ebb9]
[c1-master:13894] [10] MPI_Release/LookupPopulation.MpiLoader(main+0x372) [0x4205d2]
[c1-master:13894] [11] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3a6461d994]
[c1-master:13894] [12] MPI_Release/LookupPopulation.MpiLoader(__gxx_personality_v0+0x239) [0x420009]
[c1-master:13894] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 13894 on node c1-master exited on signal 7 (Bus error).
--------------------------------------------------------------------------
I'm really not sure where to go with this. Does anyone have any suggestions of what to try?
Posted to the Boost bug trac at: https://svn.boost.org/trac/boost/ticket/4374
1 Answer
Well, if you keep looking for the answer long enough...
On Linux, the shared memory mechanism it uses (tmpfs) is limited by default to half the system RAM. So on my cluster it's 2 GB because we have 4 GB of system RAM. When it tried to allocate the shared memory segment, it allocated up to the maximum size left on /dev/shm.
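You can confirm the actual limit on a node by looking at the tmpfs mount itself (assuming the default mount point of /dev/shm); the "Size" column should show the cap:

df -h /dev/shm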
But the issue is that the Boost library didn't indicate an error, or even report the correct amount of free memory, when it couldn't allocate the requested amount of memory. It was apparently happy to chug along until it reached the end of the segment and then errored out.
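As a defensive check in the loader, one option might be to compare the requested size against the free space on /dev/shm before creating the segment, so the silent truncation becomes an explicit failure. This is only a sketch; shm_has_room is a made-up helper name and it assumes the shared memory objects are backed by the default /dev/shm tmpfs mount:

#include <sys/statvfs.h>

// Rough guard: compare the requested segment size against the free space
// on the tmpfs that backs the POSIX shared memory objects (/dev/shm).
bool shm_has_room(size_t requested) {
    struct statvfs vfs;
    if (statvfs("/dev/shm", &vfs) != 0)
        return false;  // can't tell, treat as failure
    unsigned long long free_bytes =
        (unsigned long long) vfs.f_bavail * vfs.f_frsize;
    return free_bytes >= requested;
}

Calling something like shm_has_room(szRequested) before constructing the managed_shared_memory would at least catch the undersized tmpfs up front instead of hitting the bus error mid-load.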
The long-term solution is to update the /etc/fstab file to make the change permanent, but a command-line call can be run to increase the size of the available shared memory on each node until the next reboot:
mount -o remount,size=XXX /dev/shm
where XXX is the amount of memory to make available (for example, size=4G).
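For the permanent fix, the corresponding /etc/fstab entry would look something like the line below (a sketch assuming a 4 GB limit; keep whatever other options your distribution already uses for /dev/shm):

tmpfs   /dev/shm   tmpfs   defaults,size=4G   0 0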
This was figured out/taken from http://www.cyberciti.biz/tips/what-is-devshm-and-its-practical-usage.html