This is a C# problem. I have a big object in memory at a certain point in time, and I want to serialize it to a file. This takes two steps: first, I convert the object to a CSV string; second, I serialize the CSV string.
I have a utility tool that can append strings to a MemoryStream. I use this tool to convert the big object to a CSV string (held in one big chunk of MemoryStream). After converting the big object to a MemoryStream, I create a StreamReader over the MemoryStream and call StreamReader.ReadToEnd() to convert the MemoryStream to a (long) string. Then I call info.AddValue("BigObject", string); to serialize the string.
As one can see, I will actually hold three copies of the big object in memory. The first is the object itself, the second is the MemoryStream holding the CSV string, and the third is the string, which is really a redundant copy of the MemoryStream.
Is there any way to reduce the memory consumption of this procedure? It seems that even without the MemoryStream, I would still need a StringBuilder to hold the CSV string of the big object, and I would still need to call StringBuilder.ToString() to get the final string. The final string and the StringBuilder would then coexist in memory and consume the same amount of memory as the MemoryStream and the string do now.
Any ideas are welcome. Thank you.
5 solutions
#1
If you're worried about peak memory usage, I suppose you could manually force a garbage collection after you're done with the original object and then again after you're done with the memory stream.
(Let me just point out that, while there are a few cases where taking control of garbage collection is necessary, it's generally a bad idea. Usually, it's better to let things get collected in due time.)
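A minimal sketch of what that could look like, assuming the original object and the memory stream are each reachable through a single reference that can be cleared. `PeakMemoryDemo`, `BuildCsv`, and the tiny `int[][]` standing in for the big object are hypothetical names for illustration, not the asker's utility.

```csharp
using System;
using System.IO;
using System.Text;

public static class PeakMemoryDemo
{
    // Hypothetical stand-in for the asker's utility: writes CSV text into a MemoryStream.
    public static MemoryStream BuildCsv(int[][] rows)
    {
        var ms = new MemoryStream();
        // leaveOpen: true keeps the MemoryStream usable after the writer is disposed.
        using (var writer = new StreamWriter(ms, new UTF8Encoding(false), 1024, leaveOpen: true))
        {
            writer.NewLine = "\n";
            foreach (int[] row in rows)
                writer.WriteLine(string.Join(",", row));
        }
        ms.Position = 0;
        return ms;
    }

    public static string ConvertWithForcedCollections()
    {
        // Assumption: a small jagged array plays the role of the big object.
        int[][] bigObject = { new[] { 1, 2, 3 }, new[] { 4, 5, 6 } };
        MemoryStream ms = BuildCsv(bigObject);

        // Done with the original object: drop the reference, then force a collection.
        bigObject = null;
        GC.Collect();

        string csv;
        using (var reader = new StreamReader(ms, Encoding.UTF8))
            csv = reader.ReadToEnd(); // disposing the reader also disposes ms

        // Done with the memory stream: drop the reference and collect again.
        ms = null;
        GC.Collect();
        return csv;
    }
}
```

Note that the stream and the string still coexist while `ReadToEnd()` runs, so this lowers the peak from three copies to two at best, and only if nothing else still references the original object.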
#2
Give the following a try.
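using System.IO;
using System.Xml.Serialization;

// Serializes any XML-serializable object straight to a file, with no intermediate string.
public void SerializeToFile<T>(T target, string filename)
{
    XmlSerializer serializer = new XmlSerializer(typeof(T));
    // The using block closes the file even if Serialize throws.
    using (FileStream stream = new FileStream(filename, FileMode.Create, FileAccess.Write))
    {
        serializer.Serialize(stream, target);
    }
}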
Edit: This assumes you can get your object to implement ISerializable and tie your utility into the GetObjectData method.
Edit2: Missed the CSV part. Icky. Try using an XSLT on the XML after serializing it.
Link to an article about converting XML to CSV via XSLT.
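As a rough illustration of the XSLT route, here is a sketch using `XslCompiledTransform`. The stylesheet and the `<Rows>/<Row>` element names are assumptions made up for this example; real `XmlSerializer` output will have a different shape and need a matching stylesheet. It also does no CSV escaping, so values containing commas or quotes would need extra handling.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XmlToCsv
{
    // Hypothetical stylesheet: assumes XML shaped like
    // <Rows><Row><A>..</A><B>..</B></Row>...</Rows>
    private const string Stylesheet = @"<?xml version=""1.0""?>
<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""text""/>
  <xsl:template match=""/Rows"">
    <xsl:for-each select=""Row"">
      <xsl:for-each select=""*"">
        <xsl:value-of select="".""/>
        <xsl:if test=""position() != last()"">,</xsl:if>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";

    // Reads XML from xmlInput and writes one comma-separated line per <Row> to csvOutput.
    public static void Transform(TextReader xmlInput, TextWriter csvOutput)
    {
        var xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(Stylesheet)))
            xslt.Load(styleReader);

        using (XmlReader xmlReader = XmlReader.Create(xmlInput))
            xslt.Transform(xmlReader, null, csvOutput);
    }
}
```

One caveat for the original question: as far as I know, the transform builds an in-memory representation of the whole input document, so this approach by itself is unlikely to reduce peak memory for a very large object.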
#3
You don't have to implement your own serialization. You can leave it to the .NET framework. A good starting point can be found here.
#4
What kind of data are we talking about? If it's text data, then you could use in-memory compression and save a lot of memory that way.
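One way to read this suggestion is to wrap the in-memory buffer in a `GZipStream`, so the CSV text is compressed as it is written. The sketch below assumes the CSV can be produced line by line; `CompressedCsvBuffer` and its methods are hypothetical names, and `Decompress` is included only to show the round trip.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class CompressedCsvBuffer
{
    // Compresses CSV lines as they are written, so only the compressed bytes are buffered.
    public static byte[] Compress(IEnumerable<string> csvLines)
    {
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            using (var writer = new StreamWriter(gzip, new UTF8Encoding(false)))
            {
                foreach (string line in csvLines)
                {
                    writer.Write(line);
                    writer.Write('\n');
                }
            } // disposing the writer flushes and closes the gzip stream

            // ToArray still works after the MemoryStream has been closed.
            return buffer.ToArray();
        }
    }

    // Round-trip helper for checking the result; it materializes the full string again.
    public static string Decompress(byte[] compressed)
    {
        using (var gzip = new GZipStream(new MemoryStream(compressed), CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
            return reader.ReadToEnd();
    }
}
```

How much this saves depends on the data; repetitive text such as CSV usually compresses well, but the final string passed to `info.AddValue` would still be uncompressed unless the compressed bytes are serialized instead.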
#5
Rather than having the intermediate step of converting the object into a CSV string, you may want to try writing the object to the file as you serialize it. Just use a file stream in place of your MemoryStream when building the CSV. Better yet, create a SerializeToStream method on your object that takes any sort of stream as a parameter.
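A sketch of the `SerializeToStream` idea, assuming the object can emit its CSV one row at a time. The `BigObject` class, its row storage, and `SerializeToFile` are hypothetical stand-ins, since the question does not show the real type.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical stand-in for the asker's big object.
public class BigObject
{
    private readonly List<int[]> _rows = new List<int[]>();

    public void AddRow(params int[] values)
    {
        _rows.Add(values);
    }

    // Writes the CSV row by row to any stream, so no full CSV copy is held in memory.
    public void SerializeToStream(Stream target)
    {
        // leaveOpen: true leaves ownership of the stream with the caller.
        using (var writer = new StreamWriter(target, new UTF8Encoding(false), 4096, leaveOpen: true))
        {
            writer.NewLine = "\n";
            foreach (int[] row in _rows)
                writer.WriteLine(string.Join(",", row));
        }
    }

    // Convenience wrapper: streams the CSV directly into a file.
    public void SerializeToFile(string path)
    {
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
            SerializeToStream(file);
    }
}
```

Because the method accepts any `Stream`, the same code can target a `FileStream`, a `MemoryStream` in tests, or a compressing stream. It does bypass `info.AddValue`, so it only fits if the file does not have to go through `SerializationInfo`.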