The web browser shows the correct values, but the HTML has no values when I use Jsoup

Time: 2021-07-22 08:18:18

I'm trying to get some values from a site, but these values only appear when I use a browser such as Mozilla Firefox. When I use Jsoup I can get the HTML from the site, but without the values: only the tags are there.

This is the site I'm trying to parse:

http://www.submarinoviagens.com.br/Passagens/selecionarvoo?Origem=nat&Destino=mia&Data=05/11/2012&Hora=&Origem=mia&Destino=nat&Data=09/11/2012&Hora=&NumADT=1&NumCHD=0&NumINF=0&SomenteDireto=0&Cia=&SelCabin=&utm_source=&utm_medium=&utm_campaign=&CPId=

I'm trying to get the values that appear inside these span tags:

If I open the URL above in a web browser I can see the following values: '', 'R$ 2634,22' and 'R$ 2634,22', but when I use the following code the values disappear.

import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Fetch and parse the HTML exactly as the server sends it (timeout: 100000 ms).
URL url = new URL("http://www.submarinoviagens.com.br/Passagens/selecionarvoo?Origem=nat&Destino=mia&Data=05/11/2012&Hora=&Origem=mia&Destino=nat"
        + "&Data=09/11/2012&Hora=&NumADT=1&NumCHD=0&NumINF=0&SomenteDireto=0&Cia=&SelCabin=&utm_source=&utm_medium=&utm_campaign=&CPId=");
Document doc = Jsoup.parse(url, 100000);
String title = doc.title();
System.out.println(doc.toString());

If I view the page source in Mozilla Firefox the values disappear too, but if I use the Firebug plugin I can see them.

Thanks for the help!

2 solutions

#1


0  

The website uses JavaScript to populate all of the values you are trying to parse. Jsoup only parses the HTML the server returns and does not run scripts, so you will have to use a library that can execute the JavaScript within the page. Not sure if there is one, though.
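To illustrate the point, here is a minimal sketch that parses a small inline page with Jsoup. The HTML string, the `price` id, and the class name `StaticParseDemo` are hypothetical stand-ins, not the real markup of the site in the question. The span is empty in the markup and would only be filled in by the script, which Jsoup never executes.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class StaticParseDemo {
    // Hypothetical stand-in for the real page: the span is empty in the
    // served markup and is only filled in by a script at load time.
    static final String HTML =
            "<html><body>"
            + "<span id=\"price\"></span>"
            + "<script>document.getElementById('price').textContent = 'R$ 2634,22';</script>"
            + "</body></html>";

    // Parses the markup with Jsoup and returns the text inside span#price.
    static String spanText(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("span#price").text();
    }

    public static void main(String[] args) {
        // Jsoup builds the DOM from the markup only; the script is kept as
        // data and never run, so the span text stays empty.
        System.out.println("span text: [" + spanText(HTML) + "]");
    }
}
```

This is also why "view source" in the browser shows no values while Firebug does: the former shows the markup as served, the latter shows the live DOM after scripts have run.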

Anyone else?

#2


0  

HtmlUnit is a headless browser that executes JavaScript and should be able to render this page correctly.
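A rough sketch of that approach, assuming HtmlUnit 2.x (package `com.gargoylesoftware.htmlunit`; 3.x moved to `org.htmlunit`) and Jsoup are both on the classpath. The class name `RenderedPageFetcher`, the helper `fetchRenderedHtml`, and the 10-second wait are illustrative choices, not something from the original answer, and HtmlUnit's JavaScript engine may not cope with every site.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class RenderedPageFetcher {
    // Loads the page in a headless browser, lets its scripts run, and
    // returns the resulting DOM serialized as markup.
    public static String fetchRenderedHtml(String url, long jsWaitMillis) throws Exception {
        try (WebClient webClient = new WebClient()) {
            webClient.getOptions().setJavaScriptEnabled(true);
            webClient.getOptions().setCssEnabled(false);
            // Real-world pages often contain scripts HtmlUnit cannot run;
            // do not abort the whole load because of one script error.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = webClient.getPage(url);
            // Give background scripts (for example AJAX calls) time to finish.
            webClient.waitForBackgroundJavaScript(jsWaitMillis);
            return page.asXml();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: RenderedPageFetcher <url>");
            return;
        }
        // Hand the rendered markup to Jsoup and query it as usual.
        Document doc = Jsoup.parse(fetchRenderedHtml(args[0], 10_000));
        for (Element span : doc.select("span")) {
            if (span.hasText()) {
                System.out.println(span.text());
            }
        }
    }
}
```

Passing the URL from the question as the first argument would print the text of every non-empty span. Narrowing the `span` selector to the specific elements of interest is left open here, since the exact markup is not shown in the question.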
