I have a UTF-8 encoded XSD file. No text editor I open it in shows any character at the beginning of the file, but when I inspect it in Visual Studio's debugger, I clearly see an empty box at the start of the file.
I also get the error:
Data at the root level is invalid. Line 1, position 1.
Does anyone know what this is?
Update: Edited the post to specify the type of file. It's an XSD file created by Microsoft's XSD creator.
2 solutions
#1
53
It turns out that what I'm seeing is a Byte Order Mark (BOM), a character at the very start of the file that signals to whatever is loading the document how it is encoded. In my case the file is encoded in UTF-8, so the corresponding BOM was the byte sequence EF BB BF. To remove it, I opened the file in Notepad++ and clicked on "Encode in UTF-8 without BOM".
To actually see the BOM, I had to open the file in TextPad in Binary mode, and then ran a Google search for "EF BB BF".
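If no binary-mode editor is at hand, the same check can be done in a few lines of code. The following is only a minimal sketch in Python, not something from the original answer: the function name has_utf8_bom and the sample file name are made up for illustration. It reads the first three bytes in binary mode and compares them with EF BB BF.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the bytes EF BB BF."""
    # Binary mode matters: a text-mode read may decode or hide the BOM.
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


# Demo on a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.xsd")
    with open(sample, "wb") as f:
        f.write(codecs.BOM_UTF8 + b"<schema/>")
    print(has_utf8_bom(sample))
```

codecs.BOM_UTF8 is the standard library's constant for those three bytes, so the comparison does not depend on typing the hex values by hand.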
It took me about 8 hours to find out this was what was causing it, so I thought I'd share this with everyone.
Update: If I had read Joel Spolsky's blog post: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!), then I might not have had this problem.
#2
29
Here's how you do it with vim:
# vim file.xml
:set nobomb
:wq
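For many files, or as part of a build step, the same removal can be scripted. This is a rough sketch in Python rather than anything the answers above prescribe; the function name strip_utf8_bom is hypothetical, and it assumes the file is small enough to read into memory. It rewrites the file only when the first three bytes are EF BB BF.

```python
import codecs


def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file in place.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # nothing to do, leave the file untouched
    with open(path, "wb") as f:
        # Write everything after the three BOM bytes back unchanged.
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

Calling strip_utf8_bom("file.xml") would then have the same effect on that file as the vim commands above. Working on raw bytes avoids re-encoding the rest of the document.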