Hexadecimal value 0x00 is an invalid character when loading an XML document

Time: 2022-12-29 20:21:47

I recently had an XML file that would not load. The error message was

Hexadecimal value 0x00 is a invalid character

produced by this minimal code in LINQPad (C# statements):

var xmlDocument = new XmlDocument();
xmlDocument.Load(@"C:\Users\Thomas\AppData\Local\Temp\tmp485D.tmp");

I went through the XML with a hex editor but could not find a 0x00 character. I minimized the XML to

<?xml version="1.0" encoding="UTF-8"?>
<x>
</x>

In my hex editor it shows up as

Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000  FF FE 3C 00 3F 00 78 00 6D 00 6C 00 20 00 76 00  ÿþ<.?.x.m.l. .v.
00000010  65 00 72 00 73 00 69 00 6F 00 6E 00 3D 00 22 00  e.r.s.i.o.n.=.".
00000020  31 00 2E 00 30 00 22 00 20 00 65 00 6E 00 63 00  1...0.". .e.n.c.
00000030  6F 00 64 00 69 00 6E 00 67 00 3D 00 22 00 55 00  o.d.i.n.g.=.".U.
00000040  54 00 46 00 2D 00 38 00 22 00 3F 00 3E 00 0D 00  T.F.-.8.".?.>...
00000050  0A 00 3C 00 78 00 3E 00 0D 00 0A 00 3C 00 2F 00  ..<.x.>.....<./.
00000060  78 00 3E 00                                      x.>.

So it's easy to see that there is no 00 00 pair anywhere: every even column contains a value other than 00.

Why does it complain about an invalid 0x00 character?

1 solution

#1


13  

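The problem is the encoding. The byte order mark FF FE identifies the file as UTF-16 (little-endian), but the XML declaration says encoding="UTF-8". Read as UTF-8, each 00 byte of the UTF-16 text decodes to the character U+0000, which XML does not allow. That is why the parser complains even though the file contains no 00 00 pair.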
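
To make the mismatch concrete, here is a minimal sketch that decodes the first bytes from the hex dump both ways. The class and method names (`BomMismatchDemo`, `CountNulsAsUtf8`) are made up for illustration, and the sketch only shows what the bytes decode to; it does not claim to reproduce the parser's internal steps.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomMismatchDemo
{
    // Decodes bytes[offset..] as UTF-8 and counts the resulting U+0000 characters.
    public static int CountNulsAsUtf8(byte[] bytes, int offset)
    {
        string text = Encoding.UTF8.GetString(bytes, offset, bytes.Length - offset);
        return text.Count(c => c == '\0');
    }

    public static void Main()
    {
        // First six bytes of the file: the BOM FF FE, then "<?" in UTF-16LE.
        byte[] head = { 0xFF, 0xFE, 0x3C, 0x00, 0x3F, 0x00 };

        // Decoded as the BOM says (UTF-16LE, skipping the 2-byte BOM).
        string asUtf16 = Encoding.Unicode.GetString(head, 2, head.Length - 2);
        Console.WriteLine(asUtf16);                   // prints <?

        // Decoded as the declaration says (UTF-8): every 00 byte is a NUL character.
        Console.WriteLine(CountNulsAsUtf8(head, 2));  // prints 2
    }
}
```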

If you generate the XML yourself, there are two options:

a) actually write the file as UTF-8, so that any byte order mark is the UTF-8 one: EF BB BF

b) keep the UTF-16 bytes and declare UTF-16 in the XML declaration: encoding="UTF-16"
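
If the XML is generated from .NET, one way to keep the bytes and the declaration in sync is to let `XmlWriter` produce both from a single `XmlWriterSettings.Encoding` value. The following is only a sketch under that assumption; `ConsistentEncodingDemo` and `WriteMinimalXml` are hypothetical names, and the document written is the minimized `<x>` example from the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class ConsistentEncodingDemo
{
    // Serializes a minimal <x /> document. The writer takes both the byte
    // encoding and the encoding name in the declaration from 'encoding'.
    public static byte[] WriteMinimalXml(Encoding encoding)
    {
        var settings = new XmlWriterSettings { Encoding = encoding, Indent = true };
        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("x");
                writer.WriteEndElement();
                writer.WriteEndDocument();
            }
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        // Option a): UTF-8 with BOM. Option b): UTF-16 little-endian with BOM.
        var encodings = new Encoding[] { new UTF8Encoding(true), new UnicodeEncoding(false, true) };
        foreach (var encoding in encodings)
        {
            byte[] bytes = WriteMinimalXml(encoding);

            // Round trip: the loader should accept both variants.
            var doc = new XmlDocument();
            doc.Load(new MemoryStream(bytes));
            Console.WriteLine(encoding.WebName + " -> root element: " + doc.DocumentElement.Name);
        }
    }
}
```

Note that writing to a `StringBuilder` or `StringWriter` instead of a stream is a different case: the text is UTF-16 in memory, so whatever later saves it to disk has to pick a matching encoding.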

If you receive the XML from someone else, there are also two options:

A) tell the author to fix the XML according to a) or b)

B) sanitize the input in your application (not preferred)
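
For option B), one possible sketch is to decode the bytes yourself according to the byte order mark and hand the already-decoded text to `XmlDocument.Load(TextReader)`, so the parser no longer has to act on the mislabeled declaration. This assumes the BOM is the trustworthy part, which may not hold for every input; `BomTrustingLoader`, `LoadTrustingBom` and `BuildMislabeledSample` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;

public static class BomTrustingLoader
{
    // Decodes the stream according to its BOM (falling back to UTF-8 when
    // there is none) and parses the resulting text.
    public static XmlDocument LoadTrustingBom(Stream input)
    {
        using (var reader = new StreamReader(input, Encoding.UTF8, detectEncodingFromByteOrderMarks: true))
        {
            var doc = new XmlDocument();
            doc.Load(reader);
            return doc;
        }
    }

    // Rebuilds the problematic file in memory: UTF-16LE BOM and UTF-16LE text
    // whose declaration claims UTF-8.
    public static byte[] BuildMislabeledSample()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<x>\r\n</x>";
        var utf16 = new UnicodeEncoding(bigEndian: false, byteOrderMark: true);
        return utf16.GetPreamble().Concat(utf16.GetBytes(xml)).ToArray();
    }

    public static void Main()
    {
        XmlDocument doc = LoadTrustingBom(new MemoryStream(BuildMislabeledSample()));
        Console.WriteLine(doc.DocumentElement.Name);
    }
}
```

To use it on a file, pass `File.OpenRead(path)` instead of the in-memory sample. Fixing the file at the source remains the cleaner option, since other consumers of the same file will hit the same mismatch.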
