I want to save a vector of objects to a file. What is the most efficient way to do it? Should I load the whole vector when the program starts, operate on it in memory, and save it when the program exits, or should I access the file every time I need to change something in the vector?
Also, is it possible to save the whole vector at once, or do I need to save the elements one by one?
2 solutions
#1
1
If your data is POD (and thus each element is the same size), you can use the following ideas. If not, they may be too difficult to apply easily.
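One way to make the "is my data POD?" question explicit is a compile-time check. The sketch below assumes C++11 or later; `Sample` and `can_dump_raw` are hypothetical names used only for illustration, not something from the code in this answer.

```cpp
#include <string>
#include <type_traits>

// Hypothetical element type (not from the original post).
struct Sample {
    int    id;
    double value;
    char   tag[8];
};

// Raw byte copies are only meaningful for trivially copyable types,
// so refuse to compile if Sample ever stops being one.
static_assert(std::is_trivially_copyable<Sample>::value,
              "Sample must be trivially copyable for raw byte I/O");

// Small helper that can serve as a guard in generic code.
template <typename T>
constexpr bool can_dump_raw() {
    return std::is_trivially_copyable<T>::value;
}

// std::string manages heap memory, so it does not qualify.
static_assert(!can_dump_raw<std::string>(),
              "std::string is not safe to dump byte-for-byte");
```

If the element type later gains a std::string member or a virtual function, the first static_assert fails at compile time instead of producing a corrupt file at run time.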
The following code shows that you 'can' use "ostream::write" to put the binary data out to a file. vector<> guarantees that its elements are stored contiguously in memory, in index order, so they are lined up and compact enough to transfer in one call.
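The contiguity guarantee can be checked directly. This is a small sketch with a hypothetical helper name (`is_contiguous`); it relies on `std::vector::data()`, available since C++11, and does not apply to `std::vector<bool>`, which is a packed special case.

```cpp
#include <cstddef>
#include <vector>

// Returns true when element i lives exactly i elements past data(),
// i.e. the elements form one unbroken block in index order.
template <typename T>
bool is_contiguous(const std::vector<T>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != v.data() + i) return false;
    }
    return true;
}
```

For any standard-conforming vector (other than `std::vector<bool>`) this returns true, which is what makes a single `write` over `size() * sizeof(T)` bytes possible.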
Reading back into memory is similar, using "istream::read". But here you need to allocate the buffer first.
In the following code, I allocated 6 objects for the write, then found the beginning of the vector's memory and wrote all 6 objects out with one 'write'.
For the read, I instantiated 6 objects in a new vector and loaded the data into them (note that vector provides a slick way to allocate and instantiate the objects using the default ctor).
How do you pass the number of objects from the write to the read? It can be computed from the file size divided by the size of one element. (see stat)
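As a rough sketch of the same idea against a real file, the following writes the vector in one call and recovers the element count from the file size on the way back in. `Record`, `save_records` and `load_records` are hypothetical names, and the file size is obtained by opening the stream at its end and calling `tellg()` rather than by calling stat, which is a swapped-in technique that works for binary files on common platforms.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical POD record (not from the original post).
struct Record {
    int    id;
    double value;
};
static_assert(std::is_trivially_copyable<Record>::value,
              "raw byte I/O needs a trivially copyable type");

// Writes every element with a single write() call.
bool save_records(const std::string& path, const std::vector<Record>& v) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(v.data()),
              static_cast<std::streamsize>(v.size() * sizeof(Record)));
    return static_cast<bool>(out);
}

// Derives the element count from the file size, then reads in one call.
bool load_records(const std::string& path, std::vector<Record>& v) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);  // start at end
    if (!in) return false;
    const std::streamoff size = in.tellg();
    if (size < 0) return false;
    const std::size_t bytes = static_cast<std::size_t>(size);
    if (bytes % sizeof(Record) != 0) return false;  // truncated or foreign file
    v.resize(bytes / sizeof(Record));               // allocate the buffer first
    in.seekg(0, std::ios::beg);
    in.read(reinterpret_cast<char*>(v.data()),
            static_cast<std::streamsize>(bytes));
    return static_cast<bool>(in);
}
```

A file written this way is only readable by a program built with the same struct layout, padding and byte order, so it suits saving program state better than exchanging data.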
Using reinterpret_cast is not (generally) acceptable, and the result may not be portable. But sometimes you just gotta do it.
Hopefully this 'evil' will prompt a number of better choices.
I left in some debug std::cout's ... hope you find them helpful.
edit - 4/8 -- cleaned the code somewhat
// create a vector of 6 clear elements to save to file
std::vector<Something> vec6W(6); // default ctor clears contents
// change several elements by init'ing them
vec6W[0].init();
// [1] is clear (all 0's)
vec6W[2].init();
vec6W[3].init();
// [4] is clear (all 0's)
vec6W[5].init();
std::cout << " vec6W.size(): " << vec6W.size() << std::endl; // 6
std::cout << " sizeof(vec6W): " << sizeof(vec6W) << std::endl; // 12 (platform-dependent; size of the vector object itself, not of its contents)
std::cout << " sizeof(something): " << sizeof(Something) << std::endl; // 100
std::cout << " sizeof(vec6W[3]): " << sizeof(vec6W[3]) << std::endl; // 100
// nBytes = #elements * bytes per element
std::streamsize nBytes = (vec6W.size() * sizeof(Something));
std::cout << " nBytes : " << nBytes << std::endl; // 600
// simulate a file using a stringstream
std::stringstream ss;
// note size
std::cout << " ss.str().size(): " << ss.str().size() << std::endl; // 0
// identify address of 1st element of vector
// cast to a 'const char*' for 'write()' method
const char* wBuff = reinterpret_cast<const char*>(&vec6W[0]);
// report address
std::cout << " static_cast<const void*>(wBuff): " << std::hex
<< static_cast<const void*>(wBuff) << std::dec << std::endl;
//write vector to ss
ss.write(wBuff, nBytes);
std::cout << "\n note ss content size change " << std::endl;
std::cout << " ss.str().size() : " << ss.str().size() << std::endl;
// //////////////////////////////////////////////////////////////////
// build a vector of 6 clear elements to create buffer for read from file
std::vector<Something> vec6R(6);
// identify address of 1st element of vector
std::cout << " &vec6R[0] : " << std::hex << (&vec6R[0]) << std::dec << std::endl;
char* rBuff = reinterpret_cast<char*>(&vec6R[0]);
// read vector from ss
(void)ss.read(rBuff, nBytes); // read back same number of bytes
// //////////////////////////////////////////////////////////////////
// confirm vec6R matches what was written vec6W
int diff = memcmp(wBuff, rBuff, nBytes);
std::cout << (diff ? "FAILED" : "wBuff MATCHES rBuff: SUCCESS") << std::endl;
// now consider comparing vec6R to vec6W element by element
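There may be additional issues when the class has virtual methods, because each object then typically carries a hidden vtable pointer that is only meaningful inside the running process.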
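A short sketch of why virtual methods get in the way: the type traits report that such a class is no longer trivially copyable, so the raw write/read trick above no longer applies. `Plain` and `WithVirtual` are hypothetical types for illustration.

```cpp
#include <type_traits>

// Hypothetical types (not from the original post).
struct Plain {
    int    a;
    double b;
};

struct WithVirtual {
    int a;
    virtual ~WithVirtual() {}
    virtual void update() {}
};

// Plain data can be copied byte for byte.
static_assert(std::is_trivially_copyable<Plain>::value,
              "Plain is safe for raw byte I/O");

// A class with virtual functions cannot: on typical implementations each
// object holds a vtable pointer, and that address is meaningless in a file.
static_assert(!std::is_trivially_copyable<WithVirtual>::value,
              "WithVirtual must be serialized field by field");
```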
Good luck.
edit -----
Pointers can be handled, but they create additional work and introduce some asymmetry.
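To give a feel for that additional work, here is a minimal sketch for a type that owns heap memory through a std::string member. Instead of dumping the object's bytes, it writes the element count, then for each element the string length, the characters, and the remaining field. `Person`, `save_people` and `load_people` are hypothetical names, and the 32-bit length prefix in native byte order is an assumption of this sketch, so the output is not portable across machines with different endianness.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical non-POD type: std::string holds a pointer to heap memory.
struct Person {
    std::string  name;
    std::int32_t age;
};

void write_u32(std::ostream& out, std::uint32_t x) {
    out.write(reinterpret_cast<const char*>(&x), sizeof x);
}

bool read_u32(std::istream& in, std::uint32_t& x) {
    in.read(reinterpret_cast<char*>(&x), sizeof x);
    return static_cast<bool>(in);
}

void save_people(std::ostream& out, const std::vector<Person>& v) {
    write_u32(out, static_cast<std::uint32_t>(v.size()));  // element count first
    for (const Person& p : v) {
        write_u32(out, static_cast<std::uint32_t>(p.name.size()));
        out.write(p.name.data(), static_cast<std::streamsize>(p.name.size()));
        out.write(reinterpret_cast<const char*>(&p.age), sizeof p.age);
    }
}

bool load_people(std::istream& in, std::vector<Person>& v) {
    std::uint32_t count = 0;
    if (!read_u32(in, count)) return false;
    v.clear();
    for (std::uint32_t i = 0; i < count; ++i) {
        std::uint32_t len = 0;
        if (!read_u32(in, len)) return false;
        Person p;
        p.name.resize(len);  // the reader must allocate before it can fill
        if (len > 0) in.read(&p.name[0], static_cast<std::streamsize>(len));
        in.read(reinterpret_cast<char*>(&p.age), sizeof p.age);
        if (!in) return false;
        v.push_back(p);
    }
    return true;
}
```

The asymmetry shows up in the reader: the writer only follows pointers it already has, while the reader has to allocate storage for each element before it can load the data.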
Related work might be called "persistent storage".
Also, there exist tools to simplify the additional steps for non-POD data (sorry, I have forgotten the name).
#2
1
There is no single answer to this question.
An appropriate approach depends on the needs of your application, why it is saving the file, and what will be done with the file. For example, a file that is intended to be opened in another program and understood by a human may be written very differently from a file that just saves program state (i.e. one that only a software program needs to make sense of).
The most efficient approach depends on your measure of efficiency. Some possible measures include speed of writing, speed of reading, file size, size of the code that does the writing, etc. Not all of these things go together: for example, an archiver program may choose a slow approach to writing a file in order to achieve fast read speeds.
Usually, writing a collection of objects involves writing all the objects individually, plus some additional book-keeping (e.g. output the number of objects first), particularly if the file needs to be read later. However, smarter algorithms might derive some sort of summary data from a set of objects. For example, assume a vector containing the integers 1 to 20 in order. One way of writing it is to write all 20 values. Another is simply to emit the string "1-20".
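The "1-20" idea can be sketched as a function that collapses runs of consecutive integers into ranges. `summarize_runs` is a hypothetical name, and the sketch assumes non-negative values (a minus sign would be ambiguous next to the range dash) that stay below INT_MAX.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collapses each run of consecutive integers into "first-last" and joins
// the pieces with commas; isolated values are written on their own.
std::string summarize_runs(const std::vector<int>& v) {
    std::string out;
    std::size_t i = 0;
    while (i < v.size()) {
        std::size_t j = i;
        // Extend the run while the next value is exactly one larger.
        while (j + 1 < v.size() && v[j + 1] == v[j] + 1) ++j;
        if (!out.empty()) out += ',';
        out += std::to_string(v[i]);
        if (j > i) {
            out += '-';
            out += std::to_string(v[j]);
        }
        i = j + 1;
    }
    return out;
}
```

For a vector holding 1 to 20 in order this yields "1-20". A matching reader would have to parse the ranges and expand them again, which is the trade-off between file size and code size mentioned above.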