Is there a performance difference between binary and XML serialization?

Time: 2021-03-13 17:18:55

In terms of both parsing (serializing, deserializing) and sending packets over the network, is there a good estimate of the performance difference between binary and XML serialization?

4 answers

#1


15  

Nope.

It depends heavily on what sort of data is inside the XML document itself. If you have a lot of structured data made up of short fields, the markup overhead of XML will be large. For example, if your data looks like:

<person>
  <name>Dave</name>
  <ssn>000-00-0000</ssn>
  <email1>xxxxxx</email1>
</person>
...

You'll have a lot more overhead than if you have an XML document that looks like:

<book name="bible">
 In the beginning God created the heavens and the earth. 
 Now the earth was formless and empty ... 
 And if any man shall take away from the words of the book of this prophecy, God shall take away his part out of the book of life, and out of the holy city, and from the things which are written in this book. He which testifieth these things saith, Surely I come quickly. Amen. Even so, come, Lord Jesus.
</book>

So it's not really a fair question. It depends heavily on the data you intend to send, and on whether and how you compress it.
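To make the point about overhead concrete, here is a minimal sketch that compares the two documents above. The binary layout (each string prefixed by a one-byte length) and the helper names `pack_strings`, `markup_overhead` and `compressed_size` are assumptions chosen for illustration; they are not any particular serializer's format.

```python
import struct
import zlib


def pack_strings(values):
    """Hypothetical binary layout: one length byte, then the UTF-8 bytes."""
    out = bytearray()
    for value in values:
        raw = value.encode("utf-8")
        out += struct.pack("B", len(raw)) + raw
    return bytes(out)


def markup_overhead(encoded, payload_bytes):
    """Fraction of the encoded bytes that is not payload."""
    return (len(encoded) - payload_bytes) / len(encoded)


def compressed_size(data):
    """Size after zlib compression, to see how much of the markup compresses away."""
    return len(zlib.compress(data))


# Record-style data: many short fields, so tags dominate.
fields = ("Dave", "000-00-0000", "xxxxxx")
person_payload = sum(len(f.encode("utf-8")) for f in fields)
person_xml = (
    "<person>"
    "<name>Dave</name>"
    "<ssn>000-00-0000</ssn>"
    "<email1>xxxxxx</email1>"
    "</person>"
).encode("utf-8")
person_bin = pack_strings(fields)

# Text-heavy data: one long string, so the tags are a small fraction.
text = (
    "In the beginning God created the heavens and the earth. "
    "Now the earth was formless and empty ..."
)
book_payload = len(text.encode("utf-8"))
book_xml = f'<book name="bible">{text}</book>'.encode("utf-8")

if __name__ == "__main__":
    print("person: xml", len(person_xml), "bytes, binary", len(person_bin), "bytes")
    print("person markup overhead:", round(markup_overhead(person_xml, person_payload), 2))
    print("book markup overhead:  ", round(markup_overhead(book_xml, book_payload), 2))
    many = person_xml * 1000
    print("1000 person records: raw", len(many), "compressed", compressed_size(many))
```

The longer the real text inside `<book>`, the closer its overhead gets to zero, while the `<person>` record stays dominated by tags however many records you send. Compression narrows the gap on the wire but costs CPU time on both ends.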

#2


5  

The biggest difference between BinaryFormatter and XML serialization is portability; BinaryFormatter output is hard to keep compatible between versions, so it is only really suitable for short-term storage or transfer.

However, you can get the best of both, and have it smaller and quicker, by using bespoke binary serialization, and you don't even have to write it yourself ;-p

protobuf-net is a .NET implementation of Google's Protocol Buffers binary serialization spec; its output is smaller than that of either XmlSerializer or BinaryFormatter, and it is fully portable (not just between versions: you can load a pb stream in, for example, Java), extensible, and fast. It is also pretty comprehensively tested, with a fair number of users.

A full breakdown of size and speed, covering XmlSerializer, BinaryFormatter, DataContractSerializer and protobuf-net, is here.
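The idea of a binary format that stays readable across versions can be sketched with tagged, length-prefixed fields. This is only an illustration of the general technique: the layout below (one tag byte, a two-byte big-endian length, then the payload) and the names `encode_fields` and `decode_fields` are assumptions, and it is not the Protocol Buffers wire format or the protobuf-net API.

```python
import struct


def encode_fields(fields):
    """Encode {tag: bytes} as repeated (tag: u8, length: u16 big-endian, payload)."""
    out = bytearray()
    for tag, payload in fields.items():
        out += struct.pack(">BH", tag, len(payload))
        out += payload
    return bytes(out)


def decode_fields(data, known_tags):
    """Decode fields, skipping any tag this reader does not know about."""
    result = {}
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack_from(">BH", data, offset)
        offset += 3  # size of the tag + length header
        payload = data[offset:offset + length]
        offset += length
        if tag in known_tags:
            result[tag] = payload
    return result


# Hypothetical field numbers for the <person> record used earlier in the thread.
NAME, SSN, EMAIL = 1, 2, 3

# A newer writer adds an EMAIL field ...
v2_message = encode_fields({
    NAME: b"Dave",
    SSN: b"000-00-0000",
    EMAIL: b"dave@example.invalid",
})

# ... and an older reader that only knows NAME and SSN still decodes the message.
v1_view = decode_fields(v2_message, known_tags={NAME, SSN})

if __name__ == "__main__":
    print(len(v2_message), "bytes on the wire")
    print(v1_view)
```

Because every field carries its own tag and length, a reader can step over fields it has never heard of, which is what makes this style of format tolerant of version changes in a way that a dump of in-memory object layout is not.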

#3


0  

Instinctively you would want to say that binary is more efficient, but it actually depends on the data being serialized.

Check out this article: http://www.nablasoft.com/alkampfer/index.php/2008/10/31/binary-versus-xml-serialization-size/

#4


0  

Just pointing out that performance is not the only metric you may want to look at.

  • Ease of construction. Do you have several days or weeks to build a serialiser/deserialiser routine and test it thoroughly, or could that time be better spent on features?
  • Ease of consuming the data. Can a client use a pre-built open-source parser, or do they need to implement a bunch of (potentially buggy) code themselves?
  • Ease of debugging. Will being able to view the data in transit help with debugging? If so, a binary format will tend to obscure problems.
  • What is the maintenance cost for each method?

Personally, I would use a published XML standard and open-source parsing libraries until a performance bottleneck is proven by actual testing.
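A rough sketch of what "actual testing" could look like: time a serialize/deserialize round trip for each format on a record shaped like your own data. The record, the length-prefixed binary layout, and the function names here are assumptions for illustration; substitute your real serializers and payloads before drawing conclusions.

```python
import struct
import timeit
import xml.etree.ElementTree as ET

TAGS = ("name", "ssn", "email1")
RECORD = ("Dave", "000-00-0000", "xxxxxx")


def to_xml(record):
    """Serialize the record as a <person> element using the standard library."""
    person = ET.Element("person")
    for tag, value in zip(TAGS, record):
        ET.SubElement(person, tag).text = value
    return ET.tostring(person)


def from_xml(data):
    person = ET.fromstring(data)
    return tuple(person.findtext(tag) for tag in TAGS)


def to_binary(record):
    """Hypothetical layout: one length byte followed by UTF-8 bytes, per field."""
    out = bytearray()
    for value in record:
        raw = value.encode("utf-8")
        out += struct.pack("B", len(raw)) + raw
    return bytes(out)


def from_binary(data):
    values, offset = [], 0
    while offset < len(data):
        (length,) = struct.unpack_from("B", data, offset)
        offset += 1
        values.append(data[offset:offset + length].decode("utf-8"))
        offset += length
    return tuple(values)


def measure(n=2000):
    """Return (xml_seconds, binary_seconds) for n round trips each."""
    xml_time = timeit.timeit(lambda: from_xml(to_xml(RECORD)), number=n)
    bin_time = timeit.timeit(lambda: from_binary(to_binary(RECORD)), number=n)
    return xml_time, bin_time


if __name__ == "__main__":
    xml_time, bin_time = measure()
    print(f"XML round trip:    {xml_time:.4f} s")
    print(f"binary round trip: {bin_time:.4f} s")
```

The numbers only mean something relative to the rest of your pipeline: if network latency or database access dominates, even a large ratio between the two may not be the bottleneck.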
