Getting a number with Jsoup's select()

Time: 2022-08-22 20:31:24

I have the following HTML and I'm trying to extract the date from it. How can I do that with Jsoup's select() method?


<span class="lead">Written on</span> 05.01.2013 at 12:16 <br /> 

1 solution

#1



Here:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

final String html = "<span class=\"lead\">Written on</span> 05.01.2013 at 12:16 <br />";

Document doc = Jsoup.parse(html);

for( Element element : doc.select("span.lead") )
{
    // Print the node that follows the span; for a text node, 'toString()' returns its (HTML-escaped) text
    System.out.println(element.nextSibling().toString());
}

Output:

 05.01.2013 at 12:16 

Explanation:

  1. With doc.select("span.lead") you get the span tag from your HTML.
  2. You iterate over each matching span (there's only one in this example).
  3. With element.nextSibling() you get the next Node after the span, which is the text node you are looking for.

Since the text starts with a blank (and ends with one too), you can use trim() to remove them: element.nextSibling().toString().trim()
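
If you want an actual date value rather than a string, the trimmed text can be handed to java.time. This is only a sketch and not part of the original answer: the class name WrittenOnDate and its parse helper are made up for illustration, and the pattern assumes the text is day.month.year followed by a 24-hour time, which the question does not confirm.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Jsoup: turns the text node's content into a LocalDateTime.
final class WrittenOnDate {
    // Assumed layout: dd.MM.yyyy 'at' HH:mm. The source never says whether
    // 05.01 means 5 January or 1 May; swap dd and MM if it is the latter.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd.MM.yyyy 'at' HH:mm");

    // Trims the blanks around the text, then parses what is left.
    static LocalDateTime parse(String rawNodeText) {
        return LocalDateTime.parse(rawNodeText.trim(), FORMAT);
    }

    public static void main(String[] args) {
        // Same string the loop above prints, including the surrounding blanks.
        LocalDateTime writtenOn = parse(" 05.01.2013 at 12:16 ");
        System.out.println(writtenOn); // 2013-01-05T12:16
    }
}
```

In the loop you would pass element.nextSibling().toString() to parse(). Note that it throws a DateTimeParseException if the text does not match the assumed pattern.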
