I'm getting emails from a client where they have nested a multipart/alternative message inside a multipart/mixed message. When I get the body of the message it just returns the multipart/alternative level when what I really want is the text/html part which is contained in the multipart/alternative.
I've looked through the javadocs for javax.mail and I can't find a simple way to get the body of a bodypart that is itself a multipart or skip the first multipart/mixed part and go into the multipart/alternative body to read the text/html and text/plain pieces.
The email structure looks like this:
...
Content-Type: multipart/mixed;
    boundary="----=_Part_19487_1145362154.1418138792683"
------=_Part_19487_1145362154.1418138792683
Content-Type: multipart/alternative;
    boundary="----=_Part_19486_1391901275.1418138792683"
------=_Part_19486_1391901275.1418138792683
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=ISO-8859-1
...
------=_Part_19486_1391901275.1418138792683
Content-Transfer-Encoding: 7bit
Content-Type: text/html; charset=ISO-8859-1
...
------=_Part_19486_1391901275.1418138792683--
------=_Part_19487_1145362154.1418138792683--
This is an outline of the code used to parse the emails:
Message[] found = fldr.search(searchCondition);
for (int i = 0; i < found.length; i++) {
    Message m = found[i];
    Object o = m.getContent();
    if (o instanceof Multipart) {
        log.info("**This is a Multipart Message. ");
        Multipart mp = (Multipart)o;
        log.info("The Multipart message has " + mp.getCount() + " parts.");
        for (int j = 0; j < mp.getCount(); j++) {
            BodyPart b = mp.getBodyPart(j);
            // Loop if the content type is multipart then get the content that is in that part,
            // make it the new container and restart the loop in that part of the message.
            if (b.getContentType().contains("multipart")) {
                mp = (Multipart)b.getContent();
                j = 0;
                continue;
            }
            log.info("This content type is " + b.getContentType());
            if (!b.getContentType().contains("text/html")) {
                continue;
            }
            Object o2 = b.getContent();
            if (o2 instanceof String) {
                <do things with content here>
            }
        }
    }
}
It appears to keep stopping at the second boundary and not parsing anything further. In the case of the above message it stops at boundary="----=_Part_19486_1391901275.1418138792683" and never gets to the text of the message.
2 Solutions
#1
2
In this block:
if (b.getContentType().contains("multipart"))
{
    mp = (Multipart)b.getContent();
    j = 0;
    continue;
}
You set j to 0 and ask the loop to continue, hoping it will start again at zero. But the increment j++ runs before the next iteration begins, so the loop resumes at index 1, not 0.
Set j to -1 to solve your issue.
if (b.getContentType().contains("multipart"))
{
    mp = (Multipart)b.getContent();
    j = -1;
    continue;
}
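To see the effect in isolation, here is a minimal sketch that uses plain string arrays instead of real mail parts. The class name LoopRestartDemo, the "nested" marker, and the visit helper are made up for illustration and are not part of javax.mail. It mirrors the loop from the question and records which entries of the inner container are actually visited for a given reset value.

```java
import java.util.ArrayList;
import java.util.List;

class LoopRestartDemo {
    // Walks `outer`; on reaching the "nested" marker it switches to `inner`
    // and resets the index to `resetTo`, mirroring the loop in the question.
    static List<String> visit(String[] outer, String[] inner, int resetTo) {
        List<String> visited = new ArrayList<>();
        String[] current = outer;
        for (int j = 0; j < current.length; j++) {
            if (current[j].equals("nested")) {
                current = inner;
                j = resetTo; // the update expression j++ still runs after continue
                continue;
            }
            visited.add(current[j]);
        }
        return visited;
    }

    public static void main(String[] args) {
        String[] outer = {"nested"};
        String[] inner = {"text/plain", "text/html"};
        System.out.println(visit(outer, inner, 0));  // prints [text/html]
        System.out.println(visit(outer, inner, -1)); // prints [text/plain, text/html]
    }
}
```

With a reset value of 0 the first entry of the inner container is skipped; with -1 every entry is visited.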
#2
1
I tested your code and it failed for me as well.
In my case, b.getContentType() returns the type in uppercase (e.g. "TEXT/HTML; charset=UTF-8"), so the case-sensitive contains("text/html") check never matches. I converted the value to lowercase and it worked.
String contentType = b.getContentType().toLowerCase(Locale.ENGLISH);
if (!contentType.contains("text/html")) {
    continue;
}
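Both points (descending into nested multiparts and matching the type regardless of case) can be combined in a recursive walk, which avoids touching the loop index at all. The sketch below is only an illustration on a stand-in model: the Node class, hasType, and findHtml are hypothetical names, not javax.mail types. In real JavaMail code the same shape would presumably use Part.isMimeType, which compares the primary type and subtype and ignores parameters, together with Multipart.getBodyPart.

```java
import java.util.List;
import java.util.Locale;

class MimeWalkDemo {
    // Hypothetical stand-in for a MIME part: leaf parts carry text,
    // multipart parts carry children. Not a javax.mail type.
    static final class Node {
        final String contentType;
        final String text;
        final List<Node> children;

        Node(String contentType, String text, List<Node> children) {
            this.contentType = contentType;
            this.text = text;
            this.children = children;
        }
    }

    // Case-insensitive prefix match on the media type; trailing
    // parameters such as charset are ignored.
    static boolean hasType(Node n, String type) {
        return n.contentType.toLowerCase(Locale.ENGLISH).startsWith(type);
    }

    // Depth-first search for the first text/html leaf; returns null if none exists.
    static String findHtml(Node n) {
        if (hasType(n, "multipart/")) {
            for (Node child : n.children) {
                String html = findHtml(child);
                if (html != null) {
                    return html;
                }
            }
            return null;
        }
        return hasType(n, "text/html") ? n.text : null;
    }

    public static void main(String[] args) {
        Node plain = new Node("TEXT/PLAIN; charset=ISO-8859-1", "hello", List.of());
        Node html = new Node("TEXT/HTML; charset=ISO-8859-1", "<p>hello</p>", List.of());
        Node alternative = new Node("multipart/alternative", null, List.of(plain, html));
        Node mixed = new Node("multipart/mixed", null, List.of(alternative));
        System.out.println(findHtml(mixed)); // prints <p>hello</p>
    }
}
```

Because the recursion handles any nesting depth, the same method works whether the text/html part sits directly under the top-level multipart or several levels down.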