I need a regular expression to match anything that is within <p> tags. For example, if I had some text:
<p>Hello world</p>
The regex would match the Hello world part.
3 solutions
#1
7
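in JavaScript:
var str = "<p>Hello world</p>";
// match() exposes the capture group; search() would only return the match index
var m = str.match(/<\s*p[^>]*>([^<]*)<\s*\/\s*p\s*>/);
var text = m ? m[1] : null; // "Hello world"
in PHP:
$str = "<p>Hello world</p>";
// the third argument receives the matches; $matches[1] holds the captured contents
preg_match_all('/<\s*p[^>]*>([^<]*)<\s*\/\s*p\s*>/', $str, $matches);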
These will match something as loosely formatted as this:
< p style= "font-weight: bold;" >Hello world < / p >
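To see what this pattern does and does not accept, here is a minimal Python sketch using the same expression. The names P_TAG and p_contents are made up for illustration and are not part of the answer. Because the capture group is [^<]*, the content cannot contain any nested tag, which the last sample shows.

```python
import re

# Same pattern as in the answer above, written as a Python raw string.
P_TAG = re.compile(r"<\s*p[^>]*>([^<]*)<\s*/\s*p\s*>")


def p_contents(html):
    """Return the text captured between each <p ...> and </p> pair."""
    return P_TAG.findall(html)


samples = [
    "<p>Hello world</p>",
    '< p style= "font-weight: bold;" >Hello world < / p >',
    "<p>Hello <b>world</b></p>",  # nested tag: [^<]* cannot cross the "<"
]
for s in samples:
    print(p_contents(s))
# ['Hello world']
# ['Hello world ']
# []
```

Note that the second result keeps the trailing space that sits before the closing tag; strip it if you only want the text.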
#2
5
EDIT: Don't do it. Just don't.
See this question
If you insist, use <p>(.+?)</p>
and the result will be in the first group. It is not perfect, but no regexp solution to the HTML parsing problem ever will be.
E.g. (in Python):
>>> import re
>>> r = re.compile('<p>(.+?)</p>')
>>> r.findall("<p>fo o</p><p>ba adr</p>")
['fo o', 'ba adr']
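If "don't do it" is taken to mean "use an HTML parser instead", one possible sketch uses Python's standard html.parser module. The class ParagraphCollector and the function paragraph_texts are hypothetical helpers written for this example, not something the answer proposes. Unlike <p>(.+?)</p>, this handles attributes on the tag, nested inline tags, and line breaks inside the paragraph.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text content of every <p> element (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # finished paragraph texts
        self._in_p = False    # True while inside a <p> element
        self._parts = []      # text pieces of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._parts = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self.paragraphs.append("".join(self._parts))
            self._in_p = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> also arrives here.
        if self._in_p:
            self._parts.append(data)


def paragraph_texts(html):
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


print(paragraph_texts('<p class="x">fo <b>o</b></p>\n<p>ba\nadr</p>'))
# ['fo o', 'ba\nadr']
```

This sketch keeps only the text and drops the inner markup; collecting the inner HTML instead would need extra bookkeeping.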
#3
1
Regex:
<([a-z][a-z0-9]*)\b[^>]*>(.*?)</\1>
This will work for any pair of tags, as long as a tag of the same name is not nested inside it.
e.g. <p class="foo">hello<br/></p>
The \1 makes sure that the closing tag has the same name as the opening tag.
The content between the tags is captured in group 2 (\2).
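A short Python sketch of how the two groups come out in practice. PAIR and tag_pairs are names invented for this example. The second call shows the limit of the lazy (.*?): with nested tags of the same name, the match ends at the first closing tag.

```python
import re

# The pattern from the answer above: group 1 is the tag name, group 2 the content.
PAIR = re.compile(r"<([a-z][a-z0-9]*)\b[^>]*>(.*?)</\1>")


def tag_pairs(html):
    """Return (tag_name, inner_content) for each matched pair of tags."""
    return [(m.group(1), m.group(2)) for m in PAIR.finditer(html)]


print(tag_pairs('<p class="foo">hello<br/></p>'))
# [('p', 'hello<br/>')]

print(tag_pairs("<div><div>a</div>b</div>"))
# [('div', '<div>a')]  (stops at the first </div>, so the nesting is lost)
```

As written, the pattern only matches lowercase tag names and content on a single line; passing re.IGNORECASE and re.DOTALL to re.compile would relax both, though that goes beyond what the answer states.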