What limits the size of a memory-mapped file? I know it can't be bigger than the largest contiguous chunk of unallocated address space, and that there should be enough free disk space. But are there other limits?
7 solutions
#1
You're being too conservative: a memory-mapped file can be larger than the address space. The view of the memory-mapped file is limited by OS memory constraints, but that's only the part of the file you're looking at at one time. (And I guess technically you could map multiple views of discontinuous parts of the file at once, so aside from overhead and page length constraints, it's only the total number of bytes you're looking at that poses a limit. You could look at bytes [0 to 1024] and bytes [2^40 to 2^40 + 1024] with two separate views.)
In MS Windows, look at the MapViewOfFile function. It effectively takes a 64-bit file offset (passed as separate high and low 32-bit halves) and a length of type SIZE_T, which is 32 bits wide in a 32-bit process.
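The idea of holding two separate views at once can be sketched with Python's standard `mmap` module, which wraps the platform mapping call. This is only an illustration under assumptions: the helper `map_view` and the small temporary demo file are made up for the example, and the second view sits a few granules into the file rather than at 2^40 so that the sketch runs anywhere.

```python
import mmap
import os
import tempfile

GRAN = mmap.ALLOCATIONGRANULARITY  # view offsets must be a multiple of this


def map_view(fileno, offset, length):
    """Map `length` bytes of the file starting at `offset`, read-only (hypothetical helper)."""
    if offset % GRAN:
        raise ValueError("offset must be a multiple of mmap.ALLOCATIONGRANULARITY")
    return mmap.mmap(fileno, length, access=mmap.ACCESS_READ, offset=offset)


# Build a small demo file: 4 granules, each filled with its own index byte.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(4):
        tmp.write(bytes([i]) * GRAN)
    demo_path = tmp.name

with open(demo_path, "rb") as f:
    first = map_view(f.fileno(), 0, 1024)       # bytes [0, 1024)
    far = map_view(f.fileno(), 3 * GRAN, 1024)  # 1024 bytes starting at 3 * GRAN
    first_byte, far_byte = first[0], far[0]     # both views are live at this point
    first.close()
    far.close()

os.remove(demo_path)
```

Only the two 1024-byte windows occupy address space here; the rest of the file is never mapped.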
#2
This has been my experience when using memory-mapped files under Win32:
If you map the entire file into one segment, it normally tops out at around 750 MB, because it can't find a bigger contiguous block of free address space. If you split it up into smaller segments, say 100 MB each, you can get around 1500 MB to 1800 MB depending on what else is running.
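Working through a file in smaller segments might look roughly like the sketch below, again using Python's standard `mmap` module. It is not the code this answer was based on: `iter_segments` is a hypothetical helper, and the CRC over a small random temporary file is just a stand-in workload. Each segment is unmapped before the next one is mapped, so only one segment's worth of address space is needed at a time.

```python
import mmap
import os
import tempfile
import zlib


def iter_segments(path, segment_size):
    """Yield (offset, view) pairs covering the file one segment at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    # Offsets must be multiples of the granularity, so round the segment size down.
    segment_size = max(gran, segment_size - segment_size % gran)
    file_size = os.path.getsize(path)
    with open(path, "rb") as f:
        for offset in range(0, file_size, segment_size):
            length = min(segment_size, file_size - offset)
            view = mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ, offset=offset)
            try:
                yield offset, view
            finally:
                view.close()  # free the address space before mapping the next segment


# Demo: checksum a file that spans a little more than two segments.
payload = os.urandom(2 * mmap.ALLOCATIONGRANULARITY + 123)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    demo_path = tmp.name

crc = 0
for _offset, view in iter_segments(demo_path, mmap.ALLOCATIONGRANULARITY):
    crc = zlib.crc32(view, crc)  # running CRC, fed one mapped segment at a time

os.remove(demo_path)
```

For a real multi-gigabyte file the segment size would be something like the 100 MB mentioned above rather than a single granule.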
If you use the /3GB switch you can get more than 2 GB, up to about 2700 MB, but OS performance is penalized.
I'm not sure about 64-bit; I've never tried it, but I presume the maximum file size is then limited only by the amount of physical memory you have.
#3
There should be no other limits. Aren't those enough? ;-)
#4
Under Windows: "The size of a file view is limited to the largest available contiguous block of unreserved virtual memory. This is at most 2 GB minus the virtual memory already reserved by the process."
From MSDN.
I'm not sure about Linux, OS X, or anything else, but it's probably also related to address space.
#5
With FUSE on Linux you could also make an in-memory filesystem that extends to disk on demand. I'm not sure that qualifies as memory-mapped, and the distinction gets kind of blurred.
#6
Yes, there are limits to memory-mapped files. The most shocking one is:
Memory-mapped files cannot be larger than 2GB on 32-bit systems.
When a memmap causes a file to be created or extended beyond its current size in the filesystem, the contents of the new part are unspecified. On systems with POSIX filesystem semantics, the extended part will be filled with zero bytes.
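The zero-fill behaviour can be checked with a small experiment. The sketch below does not use numpy's memmap; as an assumption for illustration it grows a temporary file explicitly with `truncate` and then maps it with the standard `mmap` module. On a POSIX system the extended region should read back as zero bytes; on other platforms the result may differ, which is the caveat quoted above.

```python
import mmap
import os
import tempfile

# Start with a 3-byte file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name

new_size = 4096
with open(demo_path, "r+b") as f:
    f.truncate(new_size)  # grow the file from 3 to 4096 bytes before mapping it
    with mmap.mmap(f.fileno(), new_size) as view:
        head = view[:3]                                  # the original contents
        tail_is_zero = view[3:] == bytes(new_size - 3)   # the newly added region

os.remove(demo_path)
```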
Even on my 64-bit, 32GB RAM system, I get the following error if I try to read in one big numpy memory-mapped file instead of taking portions of it using byte-offsets:
OverflowError: memory mapped size must be positive
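Taking a portion by byte offset, as mentioned above, can be sketched as follows. This uses the standard `mmap` and `struct` modules rather than numpy (numpy's `memmap` accepts `offset` and `shape` arguments for the same purpose). The helper `read_doubles` and the float64 demo file are assumptions made up for the example. The point it shows is that the requested byte offset has to be rounded down to the allocation granularity, with the remainder skipped inside the view.

```python
import array
import mmap
import os
import struct
import tempfile


def read_doubles(path, first_index, count):
    """Return `count` float64 values starting at element `first_index` (hypothetical helper)."""
    item = struct.calcsize("=d")                           # 8 bytes per float64
    start = first_index * item                             # byte offset of the first element
    aligned = start - start % mmap.ALLOCATIONGRANULARITY   # round down to a legal map offset
    skip = start - aligned                                 # bytes to skip inside the view
    length = skip + count * item
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ, offset=aligned) as view:
            return struct.unpack_from(f"={count}d", view, skip)


# Demo file: the values 0.0, 1.0, ..., 9999.0 stored as native-endian float64.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    array.array("d", range(10000)).tofile(tmp)
    demo_path = tmp.name

values = read_doubles(demo_path, 5000, 3)  # maps only the region around element 5000

os.remove(demo_path)
```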
Big datasets are really a pain to work with.
#7
Wikipedia entry on the subject: http://en.wikipedia.org/wiki/Memory-mapped_file