How do I get information from a webpage in Java?

Time: 2022-12-12 01:51:16

Does anyone know of a quick way to get information from a webpage in Java? For instance, if I'm looking at a page like this: http://www.ncbi.nlm.nih.gov/pubmed/?term=10952317 and I want to extract the list of words beneath the heading "MeSH Terms", how would I go about doing so?

I have something that can read the page source, but it is full of HTML tags and such...

Any help is much appreciated!

2 Solutions

#1


3  

As has been mentioned here countless times before, have a look at JSoup, which is an HTML parsing library for Java. Or write your own parser (not recommended).
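
A minimal sketch of what the JSoup approach might look like, assuming the jsoup library is on the classpath. The HTML snippet, the `h4` heading tag, and the class and method names (`MeshTermsExample`, `termsUnderHeading`) are made up for illustration; the real PubMed markup is not shown in the question and may well differ, so the selector would need adjusting after inspecting the actual page.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class MeshTermsExample {

    // Returns the text of each <li> in the element that directly follows
    // an <h4> whose own text contains headingText.
    // Assumption: the terms sit in a list right after an <h4> heading.
    static List<String> termsUnderHeading(String html, String headingText) {
        Document doc = Jsoup.parse(html);
        List<String> terms = new ArrayList<>();

        Element heading = doc.selectFirst("h4:containsOwn(" + headingText + ")");
        if (heading == null) {
            return terms; // heading not found
        }

        Element list = heading.nextElementSibling();
        if (list == null) {
            return terms; // nothing follows the heading
        }

        for (Element item : list.select("li")) {
            terms.add(item.text()); // text() drops the tags, keeps the words
        }
        return terms;
    }

    public static void main(String[] args) {
        // Hypothetical markup standing in for the real page.
        String html = "<html><body>"
                + "<h4>MeSH Terms</h4>"
                + "<ul><li>Animals</li><li>Humans</li></ul>"
                + "</body></html>";

        System.out.println(termsUnderHeading(html, "MeSH Terms"));
    }
}
```

To work against the live page rather than a string, the document could be fetched with `Jsoup.connect(url).get()` instead of `Jsoup.parse(html)`; that call needs network access and throws `IOException`.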

#2


0  

TagSoup is probably what you are looking for.
