Getting strange output when trying to parse emails with foreign characters using JavaMail

Date: 2021-09-10 23:49:11

We are using JavaMail to fetch emails from email accounts, and lately we have been receiving emails that contain Chinese and Japanese characters.

For example, here's some Japanese content:

限定クリエイティブツールのコレクションを含む高速写真編集ソフトウェア。

Content like this typically comes out looking like the following:

<div>=1B$B$"=1B(B =1B$B$$=1B(B =1B$B$&=1B(B =1B$B$(=1B(B =1B$B$*=1B(B =1B$B=
$+=1B(B =1B$B$-=1B(B =1B$B$/=1B(B =1B$B$1=1B(B =1B$B$3=1B(B =1B$B$5=1B(B =
=1B$B$7=1B(B =1B$B$9=1B(B =1B$B$;=1B(B =1B$B$=3D=1B(B =1B$B$,=1B(B =1B$B$.=
=1B(B =1B$B$0=1B(B =1B$B$2=1B(B =1B$B$4=1B(B =1B$B$Q=1B(B =1B$B$T=1B(B =1B$=
B$W=1B(B =1B$B$Z=1B(B =1B$B$]=1B(B</div>

The Content-Type is usually text/html; charset=UTF-8.

We are using the writeTo method to get all the headers and content.

I tried doing the following but it didn't work:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
m.writeTo(baos);
pm.setUnProcessedMessage(baos.toString("UTF-8")); //Here I am explicitly stating the encoding

Also, I believe the issue might be because we are using an old version of JavaMail (1.5.0).

What can we do here to handle foreign characters?

1 Solution

#1


Using the writeTo method gives you the MIME-encoded content of the message. It sounds like you want the decoded content, for which you should use the getContent or getInputStream method. For a text part, the getContent method will return a String of Unicode characters, which you can use directly. The getInputStream method will return a byte stream in the character encoding specified by the charset parameter; you'll need to wrap it with a Reader to get the Unicode characters.
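
To illustrate the "wrap it with a Reader" step, here is a minimal sketch that uses only the JDK. The class name DecodedBodyReader and the helper readAll are hypothetical, and the ByteArrayInputStream is just a stand-in for the stream that getInputStream would return; in real code the charset would come from the part's Content-Type rather than being hard-coded.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodedBodyReader {

    /** Reads all bytes from the stream and decodes them with the given charset. */
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // The Reader turns the byte stream into Unicode characters.
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "限定クリエイティブツール";
        // Stand-in for the decoded byte stream of a message part (assumption).
        InputStream in = new ByteArrayInputStream(original.getBytes(StandardCharsets.UTF_8));
        String decoded = readAll(in, StandardCharsets.UTF_8);
        System.out.println(decoded.equals(original));
    }
}
```

With JavaMail, the call would look roughly like readAll(part.getInputStream(), charset), where part is the text part of the message.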

If you also want the headers, e.g., to display them along with the message content, you should use the getSubject, getRecipients, etc. methods, which likewise return decoded content. You can use the getHeader method to get other headers, but you'll need to decode the content yourself using the MimeUtility methods.
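
The header handling described above could be sketched as follows. This assumes the javax.mail package names used by JavaMail 1.5.x (newer Jakarta Mail releases use jakarta.mail instead) and needs the JavaMail jar on the classpath; the class HeaderDecoding, its method names, and the X-Mailer header are only illustrative.

```java
import java.io.UnsupportedEncodingException;
import javax.mail.Address;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeUtility;

public class HeaderDecoding {

    /** Returns the first value of a raw header with encoded words decoded, or null if absent. */
    public static String decodedHeader(Message m, String name)
            throws MessagingException, UnsupportedEncodingException {
        String[] values = m.getHeader(name);
        if (values == null || values.length == 0) {
            return null;
        }
        // getHeader returns the raw value, so decode it explicitly.
        return MimeUtility.decodeText(values[0]);
    }

    /** Prints a few headers in decoded form. */
    public static void printSummary(Message m)
            throws MessagingException, UnsupportedEncodingException {
        // For a MimeMessage, getSubject already returns the decoded subject.
        System.out.println("Subject: " + m.getSubject());

        Address[] to = m.getRecipients(Message.RecipientType.TO);
        if (to != null) {
            for (Address a : to) {
                // toUnicodeString gives the display name in decoded form.
                String shown = (a instanceof InternetAddress)
                        ? ((InternetAddress) a).toUnicodeString()
                        : a.toString();
                System.out.println("To: " + shown);
            }
        }

        // Any other header goes through getHeader plus MimeUtility.
        System.out.println("X-Mailer: " + decodedHeader(m, "X-Mailer"));
    }
}
```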
