I can extract a complete HTML tag, selected by one of its attributes, with a Python script run from a Unix shell like this:
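#!/usr/bin/python3
# import the module
from bs4 import BeautifulSoup
# parse the file, naming the parser explicitly
with open("test.html") as f:
    soup = BeautifulSoup(f, "html.parser")
# get the matching tags (calling soup is a shortcut for find_all)
print(soup(itemprop="name"))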
where itemprop="name" uniquely identifies the required tag.
The output is something like:
[<span itemprop="name">
Blabla & Bloblo</span>]
Now I would like to return only the text inside the tag (the Blabla & Bloblo part).
My attempt was:
print(soup(itemprop="name").getText())
but I get an error message like: AttributeError: 'ResultSet' object has no attribute 'getText'
It worked when I experimented in other contexts, such as:
print(soup.find('span').getText())
So what am I getting wrong?
1 solution
#1
Using the soup object as a callable returns a list of results, as if you used soup.find_all(). See the documentation:
"Because find_all() is the most popular method in the Beautiful Soup search API, you can use a shortcut for it. If you treat the BeautifulSoup object or a Tag object as though it were a function, then it's the same as calling find_all() on that object."
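To make the shortcut concrete, here is a minimal sketch. It assumes an inline HTML string standing in for test.html and the built-in html.parser; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical inline document standing in for test.html
html = '<div><span itemprop="name">\nBlabla &amp; Bloblo</span></div>'
soup = BeautifulSoup(html, "html.parser")

called = soup(itemprop="name")             # shortcut form
searched = soup.find_all(itemprop="name")  # explicit form

# Both forms return a ResultSet (a list subclass), not a single Tag
print(type(called).__name__)
print(called == searched)

# Tag methods such as getText() are not available on the list itself
print(hasattr(called, "getText"))
```

This is why the call in the question fails: getText() belongs to the individual Tag objects inside the list, not to the list.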
Use soup.find() to find just the first match:
soup.find(itemprop="name").get_text()
or index into the result set:
soup(itemprop="name")[0].get_text()
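Both options can be sketched end to end as follows. This assumes the same inline HTML string in place of test.html; the None guard and strip=True are additions for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical inline document standing in for test.html
html = '<div><span itemprop="name">\nBlabla &amp; Bloblo</span></div>'
soup = BeautifulSoup(html, "html.parser")

# Option 1: find() returns the first matching Tag, or None if nothing matches
tag = soup.find(itemprop="name")
first_text = tag.get_text(strip=True) if tag is not None else None

# Option 2: iterate over the result set and read the text of each Tag
all_texts = [t.get_text(strip=True) for t in soup(itemprop="name")]

print(first_text)
print(all_texts)
```

Note that find() returns None when there is no match, so calling get_text() on its result without a check would raise an AttributeError of its own; indexing with [0] would raise an IndexError on an empty result set.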