How do I get the inner text value of an HTML tag with BeautifulSoup bs4?

When using BeautifulSoup (bs4), how do I get the text inside an HTML tag? When I run this line:

oname = soup.find("title")

I get the title tag like this:

<title>page name</title>

Now I want only its inner text, page name, without the tags. How do I do that?

1 Answer

#1

Use .text to get the text from the tag.

oname = soup.find("title")
oname.text
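
As a minimal, self-contained sketch of the above (the HTML string is a made-up stand-in for the real page, and the variable names inner_text and stripped_text are only for illustration), .text returns the tag's contents as a plain string, and get_text() is the method behind it, which also accepts strip=True:

```python
from bs4 import BeautifulSoup

# Hypothetical sample document standing in for the page being parsed.
html = "<html><head><title>page name</title></head><body></body></html>"
soup = BeautifulSoup(html, "html.parser")

oname = soup.find("title")

# Text inside the tag, without the surrounding <title> tags.
inner_text = oname.text

# Same text, with leading and trailing whitespace stripped.
stripped_text = oname.get_text(strip=True)

print(inner_text)
```

Note that soup.find() returns None when no matching tag exists, so real code may want to check for that before reading .text.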

Or just use soup.title.text:

In [4]: from bs4 import BeautifulSoup
In [5]: import requests
In [6]: r = requests.get("http://*.com/questions/27934387/how-to-retrieve-information-inside-a-tag-with-python/27934403#27934387")
In [7]: BeautifulSoup(r.content, "html.parser").title.text
Out[7]: u'html - How to Retrieve information inside a tag with python - Stack Overflow'

To open a file and use the text as its name, simply use it as you would any other string:

with open(oname.text, 'w') as f:
    pass  # write to f here
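
One caveat when using a title as a file name: it may contain characters such as / or : that are not valid in file names on some systems. A rough sketch of one way to handle that follows. The helper safe_filename, the sample HTML, and the temporary output directory are all assumptions made for illustration, not part of the original answer:

```python
import os
import re
import tempfile

from bs4 import BeautifulSoup


def safe_filename(text):
    # Hypothetical helper: replace characters that are commonly
    # invalid in file names with an underscore.
    return re.sub(r'[\\/:*?"<>|]', "_", text).strip()


# Made-up sample document standing in for the real page.
html = "<html><head><title>page name</title></head></html>"
soup = BeautifulSoup(html, "html.parser")
oname = soup.find("title")

# Write into a temporary directory so the sketch does not
# clutter the current working directory.
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, safe_filename(oname.text))

with open(path, "w") as f:
    f.write(oname.text)
```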
