I have a file containing a lot of data. I know that the data I need begins at position (long)x and has size sizeof(y). How can I read this data?
3 solutions
#1
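Use the seek method:
std::ifstream strm;
strm.open ( ... );               // open with std::ios::binary for raw data
strm.seekg (x);                  // move the read position to offset x
strm.read (buffer, sizeof(y));   // buffer: a char array of at least sizeof(y) bytes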
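A slightly fuller sketch of the same seek-and-read idea, with error checking, might look like the following. The helper name `read_chunk` and its parameters are assumptions made for illustration; they do not come from the answer above.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read `size` bytes starting at byte `offset` of `path`.
// Returns an empty vector if the file cannot be opened or is too short.
std::vector<char> read_chunk(const std::string& path, long offset, std::size_t size)
{
    std::ifstream strm(path, std::ios::binary);  // binary mode: no newline translation
    if (!strm)
        return {};

    strm.seekg(offset);                          // absolute offset from the start of the file
    std::vector<char> buffer(size);
    strm.read(buffer.data(), static_cast<std::streamsize>(size));

    // gcount() reports how many bytes the last read actually delivered.
    if (strm.gcount() != static_cast<std::streamsize>(size))
        return {};
    return buffer;
}
```

For example, with a file containing "one two three", `read_chunk(path, 4, 3)` would return the three bytes of "two".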
#2
You should use fseek() to change your "current position" in the file to the desired offset. So, if "f" is your FILE* variable and "offset" is the offset, the call should look like this (modulo my leaky memory):
fseek(f, offset, SEEK_SET);
#3
Besides the usual seek-and-read techniques mentioned above, you can also map the file into your process space using something like mmap() and access the data directly.
For example, given the following data file "foo.dat":
one two three
The following code will print all text after the first four bytes using an mmap()-based approach:
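#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <iostream>

int main()
{
    int result = -1;
    int const fd = open("foo.dat", O_RDONLY);
    if (fd == -1)
        return result;

    struct stat s;
    // Require more than 4 bytes so that there is something to print.
    if (fstat(fd, &s) == 0 && s.st_size > 4)
    {
        void * const addr = mmap(0, s.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (addr != MAP_FAILED)
        {
            char const * const text = static_cast<char const *>(addr);
            // Print all text after the first 4 bytes. The mapping is not
            // guaranteed to be NUL-terminated, so write an explicit length.
            std::cout.write(text + 4, s.st_size - 4);
            std::cout << std::endl;
            munmap(addr, s.st_size);
            result = 0;
        }
    }
    close(fd);  // close the descriptor even if fstat() or mmap() failed
    return result;
}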
You can even use this approach to write directly to a file (remember to msync() if necessary).
Libraries like Boost and ACE provide nice C++ encapsulations for mmap() (and the equivalent Windows function).
This approach is probably overkill for small files, but it can be a huge win for large files. As usual, profile your code to determine which approach is best.