I'm trying to get some values from a site, but these values only appear when I use a browser such as Mozilla Firefox. When I use Jsoup I can get the HTML from the site, but without the values, only the tags.
This is the site I'm trying to parse:
I'm trying to get the values that appear inside these span tags:
If I access the previous URL from a web browser I can see the following values: '', 'R$ 2634,22' and 'R$ 2634,22', but when I use the following code the values disappear.
import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

URL url = new URL("http://www.submarinoviagens.com.br/Passagens/selecionarvoo?Origem=nat&Destino=mia&Data=05/11/2012&Hora=&Origem=mia&Destino=nat"+
"&Data=09/11/2012&Hora=&NumADT=1&NumCHD=0&NumINF=0&SomenteDireto=0&Cia=&SelCabin=&utm_source=&utm_medium=&utm_campaign=&CPId=");
// Fetch the page and parse it; the second argument is the timeout in milliseconds.
Document doc = Jsoup.parse(url, 100000);
String title = doc.title();
System.out.println(doc.toString());
If I view the page source in Mozilla Firefox, the values are missing there too. But if I use the Firebug plugin I can see them.
Thanks for the help!
2 solutions
#1
0
The website uses JavaScript to populate all of the values you are trying to parse. You will have to use a library that can execute the JavaScript within the page. I'm not sure whether there is one, though.
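To illustrate the point above, here is a minimal, standard-library-only sketch. The markup in `SERVED_HTML` is hypothetical (it is not the real site's HTML, which may well load its values through a background request instead of an inline script), and the class and method names are made up for this example. It only shows that the text the server sends can contain empty span tags whose content a script fills in later, which is what a parser that does not run scripts will see. The regex is for demonstration only and is not a general way to parse HTML.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StaticHtmlDemo {
    // Hypothetical page: the span is empty in the served HTML and a script
    // would fill it in once a browser executes it.
    static final String SERVED_HTML =
        "<html><body>"
        + "<span id=\"price\"></span>"
        + "<script>document.getElementById('price').textContent = 'R$ 2634,22';</script>"
        + "</body></html>";

    // Returns the raw text between each <span ...> and </span> pair,
    // exactly as it appears in the served markup (no script is executed).
    static List<String> spanTexts(String html) {
        List<String> texts = new ArrayList<>();
        Matcher m = Pattern.compile("<span[^>]*>(.*?)</span>", Pattern.DOTALL).matcher(html);
        while (m.find()) {
            texts.add(m.group(1));
        }
        return texts;
    }

    public static void main(String[] args) {
        List<String> texts = spanTexts(SERVED_HTML);
        boolean allEmpty = texts.stream().allMatch(String::isEmpty);
        System.out.println("span count: " + texts.size() + ", all empty: " + allEmpty);
    }
}
```

This matches what "view source" shows in the question: the tags are there, but the values only exist in the live DOM after the scripts have run, which is why Firebug can see them.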
anyone else?
#2
0
HtmlUnit is a headless browser that executes JavaScript and should be able to render this page correctly.
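As a rough sketch of this suggestion, the code below loads a page with HtmlUnit, gives its scripts some time to run, and then reads the text of every span element. It assumes HtmlUnit 2.x is on the classpath (the `com.gargoylesoftware.htmlunit` package; 3.x releases use `org.htmlunit` instead). The class name `RenderedSpans`, the helper `renderedSpanTexts`, and the 10-second wait are illustrative choices, and whether the site in the question still responds or renders under HtmlUnit has not been verified.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.DomElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class RenderedSpans {
    // Loads the page in a headless browser, waits for background JavaScript,
    // and returns the trimmed text of every <span> in the resulting DOM.
    static List<String> renderedSpanTexts(String url, long jsWaitMillis) throws IOException {
        try (WebClient webClient = new WebClient()) {
            // Real-world pages often contain scripts HtmlUnit cannot run; do not abort on them.
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            webClient.getOptions().setCssEnabled(false);

            HtmlPage page = webClient.getPage(url);
            // Give asynchronous scripts (for example Ajax calls) time to finish.
            webClient.waitForBackgroundJavaScript(jsWaitMillis);

            List<String> texts = new ArrayList<>();
            for (DomElement span : page.getElementsByTagName("span")) {
                texts.add(span.getTextContent().trim());
            }
            return texts;
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the page URL as the first command-line argument.
        for (String text : renderedSpanTexts(args[0], 10_000)) {
            System.out.println(text);
        }
    }
}
```

If you prefer to keep using Jsoup selectors, one option is to take `page.asXml()` from HtmlUnit and pass that string to `Jsoup.parse(String)`, so that Jsoup works on the DOM after the scripts have run.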