What does "Invalid byte 2 of a 3-byte UTF-8 sequence" mean?

I changed a file in Orbeon Forms, and the next time I load the page, I get an error message saying Invalid byte 2 of a 3-byte UTF-8 sequence. How can I solve this problem?

8 answers

#1

This happens when Orbeon Forms reads an XML file and expects it to use the UTF-8 encoding, but somehow the file isn't properly encoded in UTF-8. To solve this, make sure that:

You have an XML declaration at the beginning of the file saying the file is in UTF-8:

```
<?xml version="1.0" encoding="UTF-8" ?>
```
Your editor is XML-aware, so it can parse the XML declaration and save the file in UTF-8 accordingly. If your editor isn't XML-aware and you don't want to switch editors, look for an option or preference that lets you tell the editor to use UTF-8.
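
To check whether a file really is encoded in UTF-8, regardless of what its XML declaration says, a strict decoder can be run over its bytes. The following is a minimal sketch, not anything Orbeon Forms itself provides: the class name `Utf8Check` and the method `isValidUtf8` are made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Orbeon Forms.
public class Utf8Check {

    // Returns true only if the bytes decode as strict UTF-8.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // Malformed or truncated sequence somewhere in the input.
            return false;
        }
    }

    // Usage: java Utf8Check path/to/file.xml
    public static void main(String[] args) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(bytes) ? "valid UTF-8" : "NOT valid UTF-8");
    }
}
```

A file saved as ISO-8859-1 that contains an accented character such as é followed by plain ASCII text will typically fail this check, which is the same situation the parser is complaining about.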

#2

A three-byte UTF-8 sequence looks like:

1110xxxx 10xxxxxx 10xxxxxx

Your error message may mean either that the first of the three bytes incorrectly flags the start of a three-byte sequence, or that the second byte is malformed (it does not match the 10xxxxxx continuation pattern).
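
The bit patterns above can be tested with simple masks. This is a simplified sketch that checks only the patterns shown, not the full UTF-8 rules (it ignores overlong encodings and surrogate ranges); the class name `ThreeByteSequence` and its methods are made up for illustration.

```java
// Hypothetical helper illustrating the 1110xxxx 10xxxxxx 10xxxxxx layout.
public class ThreeByteSequence {

    // Lead byte of a three-byte sequence: top four bits are 1110.
    public static boolean isThreeByteLead(byte b) {
        return (b & 0xF0) == 0xE0;
    }

    // Continuation byte: top two bits are 10.
    public static boolean isContinuation(byte b) {
        return (b & 0xC0) == 0x80;
    }

    // True if the three bytes match the pattern shown above.
    public static boolean matchesPattern(byte b1, byte b2, byte b3) {
        return isThreeByteLead(b1) && isContinuation(b2) && isContinuation(b3);
    }

    public static void main(String[] args) {
        // The euro sign in UTF-8 is E2 82 AC: a well-formed three-byte sequence.
        System.out.println(matchesPattern((byte) 0xE2, (byte) 0x82, (byte) 0xAC));
        // 0xE9 is 'é' in ISO-8859-1; it looks like a three-byte lead,
        // but the ASCII letter that follows is not a continuation byte.
        System.out.println(matchesPattern((byte) 0xE9, (byte) 'a', (byte) 'b'));
    }
}
```

The second case is a common way to end up with this error: a byte from a single-byte encoding happens to look like the start of a three-byte sequence, and the byte after it does not fit.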

As @avernet says, you need to make sure that all elements in your system are producing and expecting UTF-8.

#3

When you start your program, use the following Java command line argument:

-Dfile.encoding=UTF-8

For example,

java -Dfile.encoding=UTF-8 -jar foo.jar
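
To confirm what the JVM actually ended up with, a small check like the one below may help. It is only a sketch; the class name `ShowDefaultCharset` and the method `describe` are made up for illustration. Note that on recent JDKs (18 and later) UTF-8 is already the default charset, so the flag mostly matters on older versions.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class: prints the encoding settings the JVM is using.
public class ShowDefaultCharset {

    // Builds a two-line report of the relevant settings.
    public static String describe() {
        return "file.encoding   = " + System.getProperty("file.encoding") + "\n"
             + "default charset = " + Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with and once without `-Dfile.encoding=UTF-8` shows whether the argument changes anything in your environment.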

#4

I had the same problem in Eclipse and fixed it by changing the file's text encoding.

Right-click the file -> Properties -> Resource -> Text file encoding (UTF-8)

This solution worked for me.

Thanks.

#5

I am using Eclipse and I also had to change the Text file encoding in:

Window -> Preferences -> General -> Workspace

Then it worked fine.

Thanks

#6

You might need to configure your Tomcat with the following parameter:

-Dfile.encoding=UTF-8

#7

I had the same problem.

Problem > I'm reading X509 certificate values (which come from sources with different encodings) to generate a PDF report. The PDF is generated through a web service that expects a UTF-8 XML request, and I have to re-encode the values before marshalling.

Solution > http://fabioangelini.wordpress.com/2011/08/04/converting-java-string-fromto-utf-8/

Using this class:

```java
import java.nio.charset.StandardCharsets;

public class StringHelper {

    // Repairs a String whose UTF-8 bytes were wrongly decoded as ISO-8859-1:
    // recover the original bytes, then decode them as UTF-8.
    public static String convertFromUTF8(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    // The inverse: encode the String as UTF-8, then expose each resulting
    // byte as one ISO-8859-1 character.
    public static String convertToUTF8(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }
}
```

Usage:

```java
// getSummaryAttMap() returns a HashMap
String value = (String) getSummaryAttMap().get(key);
if (value != null) {
    value = StringHelper.convertToUTF8(value);
} else {
    value = "";
}
```

#8

Here's an answer from the coding side. Suppose you have checked the XML file and found nothing wrong, and you are using Java and running a Tomcat server. Your source code may neglect to specify the encoding itself, so the JVM falls back to its default encoding when it reads the XML contents as a string (or as anything else that represents a string), and that default in turn comes from Tomcat's default encoding. If the encoding of the XML and Tomcat's default are inconsistent, you may get the same error message.
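
One way to avoid depending on the default encoding is to hand the parser raw bytes rather than a string, and to name the charset explicitly whenever text has to be turned into bytes. The following is a minimal sketch using the standard JAXP API; the class name `ExplicitEncoding` and its methods are made up for illustration and are not taken from the poster's code.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: keeps the JVM default encoding out of the picture.
public class ExplicitEncoding {

    // Always name the charset when converting text to bytes.
    public static byte[] toUtf8(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Give the parser a byte stream so it reads the encoding from the
    // XML declaration instead of relying on a pre-decoded String.
    public static Document parse(InputStream in) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>";
        Document doc = parse(new ByteArrayInputStream(toUtf8(xml)));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

The same idea applies to files: opening them with an `InputStream` and passing that to the parser avoids a `FileReader` or a no-argument `getBytes()` call silently applying whatever the server's default happens to be.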
