Exception in thread "main" <strong><span style="font-size:18px;">org.jsoup.UnsupportedMimeTypeException:</span></strong> Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/json; charset=utf-8, URL=
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:487)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:434)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:181)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:170)
做新浪微博爬虫的时候,jsoup请求网址出现这样的错误,解决方法是添加蓝色部分代码
<pre name="code" class="java">Jsoup.connect("http://").ignoreContentType(true).get();
可参考以下API解释:
ignoreContentType Connection ignoreContentType(boolean ignoreContentType)
Ignore the document's Content-Type when parsing the response. By default this is false, an unrecognised content-type will cause an IOException to be thrown. (This is to prevent producing garbage by attempting to parse a JPEG binary image, for example.) Set to true to force a parse attempt regardless of content type.
Parameters:
ignoreContentType - set to true if you would like the content type ignored on parsing the response into a Document.
Returns:
this Connection, for chaining