I have the following HTML that I'd like to parse (some elements are stripped for readability):
</div>
<article class="article-detail-description">
<h1 class="page-heading">
Postulat operacyjności definicji w naukach społecznych
<br /><small>Definition’s Operativeness Postulate in Social Sciences</small>
</h1>
<div>
<strong>Author(s): </strong>Jakub Karpiński<br /><strong>Subject(s): </strong>Social Sciences<br /><strong>Published by: </strong>Instytut Filozofii i Socjologii Polskiej Akademii Nauk<br/><strong>Keywords: </strong>operationism; definition of property; definition of indicator; concepts selection
<br/>
</div>
<p class="summary"><strong>Summary/Abstract: </strong>
The article’s primary goal is to demonstrate the problems inherited in “operationism – antioperationism” polemics.
</p>
<ul class="nav nav-tabs">
<li class="active" ><a href="#details" data-toggle="tab">Details</a></li>
<li><a href="#tableOfContents" data-toggle="tab">Contents</a></li>
</ul>
<div class="tab-content">
<div class="tab-pane fade active in" id="details">
<p class="journal-link"><strong>Journal: </strong><a href="/search/journal-detail?id=10">Studia Socjologiczne</a></p>
<ul class="article-additional-info">
<li><strong>Issue Year:</strong> 2011</li><li><strong>Issue No:</strong> 1 (200)</li><li><strong>Page Range:</strong> 65-80</li><li><strong>Page Count:</strong> 15</li><li><strong>Language:</strong> Polish</li>
</ul>
</div>
I can read all of it by using `document.getElementsByClassName("article-detail-description")[0].textContent`.
To read only the `<p class="summary">` element I use `document.getElementsByClassName("summary")[0].textContent`. However, this is imperfect because the result also includes the `Summary/Abstract:` label.
I am interested in many elements; take the following as examples:
1. Postulat operacyjności definicji w naukach społecznych
I can get both titles together:
Postulat operacyjności definicji w naukach społecznych
Definition’s Operativeness Postulate in Social Sciences
To get it I use `document.getElementsByClassName("page-heading")[0].innerText`.
How do I get `Postulat operacyjności definicji w naukach społecznych` and `Definition’s Operativeness Postulate in Social Sciences` separately?
2. I'd like to get e.g. `2011` from:
`<li><strong>Issue Year:</strong> 2011</li><li>`
This time I have no idea how to get this information. The same is true for `Issue No:` and the other fields.
1 Answer
It depends on whether the structure is stable, but you can access the text nodes directly:
// The heading's child nodes are: [0] text (Polish title), [1] <br>, [2] <small> (English title).
var heading = document.getElementsByClassName('page-heading')[0];
var polish = heading.childNodes[0].textContent.trim();
var english = heading.childNodes[2].textContent.trim();
console.log("Polish:", polish);
console.log("English:", english);
// In each <li>, childNodes[0] is the <strong> label and childNodes[1] is the text that follows it.
var li = document.querySelector('.article-additional-info li');
var issueYear = li.childNodes[1].textContent.trim();
console.log("Issue Year:", issueYear);
</div>
<article class="article-detail-description">
<h1 class="page-heading">
Postulat operacyjności definicji w naukach społecznych
<br /><small>Definition’s Operativeness Postulate in Social Sciences</small>
</h1>
<div>
<strong>Author(s): </strong>Jakub Karpiński<br /><strong>Subject(s): </strong>Social Sciences<br /><strong>Published by: </strong>Instytut Filozofii i Socjologii Polskiej Akademii Nauk<br/><strong>Keywords: </strong>operationism; definition of property; definition of indicator; concepts selection
<br/>
</div>
<p class="summary"><strong>Summary/Abstract: </strong>
The article’s primary goal is to demonstrate the problems inherited in “operationism – antioperationism” polemics.
</p>
<ul class="nav nav-tabs">
<li class="active" ><a href="#details" data-toggle="tab">Details</a></li>
<li><a href="#tableOfContents" data-toggle="tab">Contents</a></li>
</ul>
<div class="tab-content">
<div class="tab-pane fade active in" id="details">
<p class="journal-link"><strong>Journal: </strong><a href="/search/journal-detail?id=10">Studia Socjologiczne</a></p>
<ul class="article-additional-info">
<li><strong>Issue Year:</strong> 2011</li><li><strong>Issue No:</strong> 1 (200)</li><li><strong>Page Range:</strong> 65-80</li><li><strong>Page Count:</strong> 15</li><li><strong>Language:</strong> Polish</li>
</ul>
</div>
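The index-based approach above depends on the exact position of each child node. As a possible alternative, the sketch below collects only an element's direct text nodes, which should also drop the `Summary/Abstract:` label and cover every `<li>` field at once. `ownText` and `readLabelledItems` are hypothetical helper names, not part of any library, and the sketch assumes each label sits in a `<strong>` element as in the markup above.

```javascript
// Node type values defined by the DOM standard (Node.ELEMENT_NODE, Node.TEXT_NODE).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: join only the text nodes that are direct children of
// `element`, skipping child elements such as <strong>, <small> and <br>.
function ownText(element) {
  return Array.from(element.childNodes)
    .filter((node) => node.nodeType === TEXT_NODE)
    .map((node) => node.textContent)
    .join(" ")
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical helper: build a { label: value } object from items shaped like
// <li><strong>Label:</strong> value</li>.
function readLabelledItems(items) {
  const fields = {};
  for (const item of Array.from(items)) {
    const label = Array.from(item.childNodes).find(
      (node) =>
        node.nodeType === ELEMENT_NODE &&
        node.nodeName.toUpperCase() === "STRONG"
    );
    if (!label) continue; // skip items without a <strong> label
    const key = label.textContent.trim().replace(/:$/, "");
    fields[key] = ownText(item);
  }
  return fields;
}

// Browser usage; the guard keeps the helpers loadable where no DOM exists.
if (typeof document !== "undefined") {
  const heading = document.querySelector(".page-heading");
  const summary = document.querySelector(".summary");
  const items = document.querySelectorAll(".article-additional-info li");
  if (heading) {
    const small = heading.querySelector("small");
    console.log("Polish:", ownText(heading));
    if (small) console.log("English:", small.textContent.trim());
  }
  if (summary) console.log("Summary:", ownText(summary));
  console.log("Fields:", readLabelledItems(items));
}
```

With the markup from the question, `readLabelledItems` would be expected to return keys such as `Issue Year` and `Issue No`. If the page ever moves a label out of its `<strong>` element, that item is skipped and no error is thrown.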