Does anyone know of a quick way to extract information from a web page in Java? For instance, given a page like http://www.ncbi.nlm.nih.gov/pubmed/?term=10952317, how would I extract the list of words beneath the heading "MeSH Terms"?
I already have code that reads the page source, but the result is full of HTML tags and such...
Any help is much appreciated!
2 Solutions
#1
3
As has been mentioned here countless times before, have a look at jsoup, which is an HTML parsing library for Java. Or write your own parser (not recommended).
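For what it's worth, here is a rough sketch of how that could look with jsoup. It assumes the jsoup jar is on the classpath, and it assumes the "MeSH Terms" heading is an `<h4>` immediately followed by a list of `<li>` items. I have not checked the real PubMed markup, so the selector, the sample HTML, and the names `MeshTermsSketch` and `extractTerms` are all made up for illustration; inspect the page source and adjust.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class MeshTermsSketch {

    // Collects the text of every <li> in the element that directly follows
    // a heading whose own text contains "MeSH Terms".
    // ASSUMPTION: the heading is an <h4> and the list is its next sibling.
    static List<String> extractTerms(Document doc) {
        List<String> terms = new ArrayList<>();
        for (Element heading : doc.select("h4:containsOwn(MeSH Terms)")) {
            Element list = heading.nextElementSibling();
            if (list == null) {
                continue; // heading with nothing after it
            }
            for (Element item : list.select("li")) {
                terms.add(item.text());
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Hypothetical sample markup, not copied from the real page.
        String html = "<div><h4>MeSH Terms</h4>"
                + "<ul><li>Humans</li><li>Mice</li></ul></div>";
        Document doc = Jsoup.parse(html);

        // For the live page you would fetch the document instead, e.g.
        // Document doc = Jsoup.connect("http://www.ncbi.nlm.nih.gov/pubmed/?term=10952317").get();
        // (that call throws IOException and needs network access).

        for (String term : extractTerms(doc)) {
            System.out.println(term);
        }
    }
}
```

The point is that you query the parsed document with CSS-style selectors instead of picking through the raw tags yourself. If the page nests the terms differently, only the selector and the sibling lookup should need to change.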