I have a line of HTML with nested tags, where a single tag may contain multiple classes. I need to extract the text of an element using a single class name (I know only one of its class names).
<p class="Body1"><span class="style3"></span><span class="style1">W</span><span class="Allsmall style5">extract this text </span><span class="style5">unwanted text </span></p>
I know only the class name Allsmall, and I want to extract the text "extract this text" from this HTML line using Jsoup in Java.
1 solution
#1
1
You can use the selector syntax to retrieve a specific element based on its CSS class attribute:
Document doc = Jsoup.parse(
        new File("input.html"),
        "UTF-8",
        "http://sample.com/");
// Retrieve the first <span> element that belongs to the "Allsmall" class
Element allSmallSpan = doc.select("span.Allsmall").first();
// Read the text content of that element
String text = allSmallSpan.text();
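As a self-contained illustration, here is a minimal sketch that applies the same class selector to the HTML line from the question, parsed from a string instead of a file. The class name `ExtractByClass` and the helper `textOfFirstWithClass` are hypothetical names chosen for this example. It assumes jsoup is on the classpath and uses `selectFirst`, which is available in recent jsoup releases; `select(...).first()` from the answer above is equivalent.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ExtractByClass {

    // Hypothetical helper: returns the trimmed text of the first element
    // that carries the given class, or null if no element matches.
    static String textOfFirstWithClass(String html, String className) {
        Document doc = Jsoup.parse(html);
        // ".Allsmall" matches any element whose class list contains "Allsmall",
        // even when other classes (here "style5") are present on the same tag.
        Element match = doc.selectFirst("." + className);
        return match == null ? null : match.text().trim();
    }

    public static void main(String[] args) {
        String html = "<p class=\"Body1\"><span class=\"style3\"></span>"
                + "<span class=\"style1\">W</span>"
                + "<span class=\"Allsmall style5\">extract this text </span>"
                + "<span class=\"style5\">unwanted text </span></p>";

        System.out.println(textOfFirstWithClass(html, "Allsmall"));
    }
}
```

Checking for null before calling `text()` avoids a NullPointerException when the class is absent from the document, a case the snippet in the answer does not guard against.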