When using BeautifulSoup (bs4), how do I get the text from inside an HTML tag? When I run this line:
oname = soup.find("title")
I get the title tag like this:
<title>page name</title>
Now I want only its inner text, page name, without the tags. How do I do that?
1 solution
#1
Use .text to get the text from the tag.
oname = soup.find("title")
oname.text
Or just soup.title.text
In [4]: from bs4 import BeautifulSoup
In [5]: import requests
In [6]: r = requests.get("http://*.com/questions/27934387/how-to-retrieve-information-inside-a-tag-with-python/27934403#27934387")
In [7]: BeautifulSoup(r.content, "html.parser").title.text
Out[7]: u'html - How to Retrieve information inside a tag with python - Stack Overflow'
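The same idea can be tried without a network request. The sketch below parses an inline HTML string (a made-up stand-in for the page in the question) and compares `.text`, `.get_text(strip=True)` and `.string`. The helper name `title_text` is hypothetical, added only to show one way of handling a page that has no title tag, in which case `find` returns `None`.

```python
from bs4 import BeautifulSoup

# Made-up sample document mirroring the <title> from the question.
html = "<html><head><title>page name</title></head><body><p>Hello</p></body></html>"
soup = BeautifulSoup(html, "html.parser")

oname = soup.find("title")

# .text joins all strings inside the tag.
print(oname.text)

# get_text(strip=True) does the same and trims surrounding whitespace.
print(oname.get_text(strip=True))

# .string works when the tag has exactly one string child.
print(oname.string)


def title_text(soup):
    """Return the title text, or None if the document has no <title> tag."""
    tag = soup.find("title")
    if tag is None:
        return None
    return tag.get_text(strip=True)


print(title_text(soup))
print(title_text(BeautifulSoup("<p>no title here</p>", "html.parser")))
```

Checking for `None` first avoids an `AttributeError` when a page has no title tag.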
To open a file and use the text as its name, simply use it as you would any other string:
with open(oname.text, 'w') as f:
    ...  # write to f here
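One caveat: page titles can contain characters that are not valid in file names on some platforms, such as `/`, `:` or `?`. The sketch below is one possible way to handle that, not part of the original answer. The helper `safe_filename`, the `.txt` extension and the temporary output directory are all assumptions made for illustration, and the `title` string stands in for the value of `oname.text`.

```python
import re
import tempfile
from pathlib import Path


def safe_filename(title, default="untitled"):
    """Replace characters that are commonly invalid in file names with '_'."""
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", title).strip(" .")
    # Fall back to a default name if nothing usable is left.
    return cleaned or default


# Stand-in for oname.text; in practice this comes from the parsed page.
title = "html - How to Retrieve information inside a tag with python - Stack Overflow"

# Write into a temporary directory so the example does not clutter the working dir.
out_dir = Path(tempfile.mkdtemp())
out_path = out_dir / (safe_filename(title) + ".txt")

with open(out_path, "w", encoding="utf-8") as f:
    f.write(title)

print(out_path.name)
```

A title that is already a valid file name passes through unchanged. Only the problematic characters are replaced.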