I have this HTML from which I have to extract data:
<html>
<head></head>
<body>
<div class="main">
<div class="utlimate"><p>hello</p></div>
<div class = "headline"><p>some text</p></div>
<div class="content">
<div class = "utimate"> <p>TOP</p>
<div class ="utlimate"> <p>data1</p></div>
<div class ="utlimate"> <p>it could be anything</p></div>
<div class ="utlimate"> <p>not</p></div>
<div class ="utlimate"> <p></p></div>
</div>
</div>
</div>
</body>
</html>
I need to access each <div class="ultimate"> whose <p> has the value "data1", "it could be anything", or "not". The code I tried for this:
soup = BeautifulSoup(HTML_data) #HTML_data is all html content
first_div = soup.find('div',{"class" : "content"})
second_div = first_div.find('div',{"class" : "utlimate"})
div_list = second_div.findall('div',{"class" : "utlimate"})
I got an error on the last line of my code: 'NoneType' object is not callable
How do I access only those divs? Please help.
2 Solutions
#1
2
Try this:
soup = BeautifulSoup(HTML_data) #HTML_data is all html content
first_div = soup.find('div',{"class" : "content"})
second_div = first_div.find('div',{"class" : "utimate"})
div_list = second_div.findAll('div',{"class" : "utlimate"})
The method for getting the list is findAll, not findall. There's no "ultimate" in the HTML fragment; the classes are "utlimate" or "utimate". Are those typos?
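To see why the misspelled method gives that particular error, here is a minimal sketch, assuming BeautifulSoup 4 (the `bs4` package) and the built-in `html.parser`, neither of which the question specifies. A Tag treats an unknown attribute name as a search for a child tag with that name, so `findall` evaluates to None and calling it fails. `find_all` is the bs4 spelling of `findAll`. The HTML below is a trimmed copy of the fragment from the question, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Trimmed copy of the HTML from the question, class-name typos kept as they are
HTML_data = """
<div class="main">
  <div class="utlimate"><p>hello</p></div>
  <div class="content">
    <div class="utimate"><p>TOP</p>
      <div class="utlimate"><p>data1</p></div>
      <div class="utlimate"><p>it could be anything</p></div>
      <div class="utlimate"><p>not</p></div>
      <div class="utlimate"><p></p></div>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(HTML_data, "html.parser")

# Walk down to the wrapper div, using the class names as spelled in the HTML
outer = soup.find("div", {"class": "content"}).find("div", {"class": "utimate"})

# An unknown attribute is looked up as a child tag; there is no <findall> tag,
# so this is None, and outer.findall(...) would raise the reported TypeError
assert outer.findall is None

# find_all returns every matching div nested inside the wrapper
texts = [div.p.get_text() for div in outer.find_all("div", {"class": "utlimate"})]

# Drop the div whose <p> is empty
non_empty = [t for t in texts if t]
print(non_empty)
```

The outer "hello" div is not picked up because the search starts from the wrapper inside "content", not from the whole document.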
#2
1
Is soup None?
I suggest you refactor your code to guard against this:
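soup = BeautifulSoup(HTML_data)  # HTML_data is all html content
if soup is None:
    # Handle the error here
    raise ValueError("HTML_data could not be parsed")
else:
    c = soup.contents
    # Use a regular expression on the contents here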
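The same kind of guard can be applied to the lookups themselves, since `find` returns None when nothing matches (for example, when a class name is misspelled). This is only a sketch, assuming BeautifulSoup 4 and `html.parser`; the helper name `find_divs` and the one-line sample HTML are hypothetical, not from the question.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed


def find_divs(html, outer_class, inner_class):
    """Hypothetical helper: return the inner divs, or [] if the outer div is missing."""
    soup = BeautifulSoup(html, "html.parser")
    outer = soup.find("div", {"class": outer_class})
    if outer is None:
        # Nothing matched outer_class, so there is nothing to search inside
        return []
    return outer.find_all("div", {"class": inner_class})


# Illustrative sample using the class spellings from the question's HTML
html = '<div class="utimate"><div class="utlimate"><p>data1</p></div></div>'

found = find_divs(html, "utimate", "utlimate")     # class names match the HTML
missing = find_divs(html, "ultimate", "utlimate")  # "ultimate" is not in the HTML
print(len(found), len(missing))
```

Returning an empty list keeps the caller's loop simple; raising an exception instead would be equally reasonable if a missing div should be treated as an error.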