I need help with regex or preg_match, because I am not very experienced with them yet. Here is my problem.
I need to get the value "get me", but I think my function has an error. The number of HTML tags is dynamic: the string can contain many nested HTML tags, such as a bold tag. The "get me" value is also dynamic.
<?php
function getTextBetweenTags($string, $tagname) {
$pattern = "/<$tagname>(.*?)<\/$tagname>/";
preg_match($pattern, $string, $matches);
return $matches[1];
}
$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
$txt = getTextBetweenTags($str, "font");
echo $txt;
?>
7 Solutions
#1
60
<?php
function getTextBetweenTags($string, $tagname) {
$pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
preg_match($pattern, $string, $matches);
return $matches[1];
}
$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
$txt = getTextBetweenTags($str, "font");
echo $txt;
?>
That should do the trick.
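One caveat worth illustrating: the greedy `.*` in this pattern can skip past earlier elements when the input contains more than one matching tag. The sketch below uses a made-up input with two `font` elements (not from the question) and contrasts the pattern with a lazy variant; `$greedy` and `$lazy` are illustrative names only.

```php
<?php
// Hypothetical input with two <font> elements (not from the question).
$str = '<p><font size="10">first</font> and <font size="12">second</font></p>';

// Greedy pattern from the answer above, with "font" substituted for $tagname.
$greedy = '/<font ?.*>(.*)<\/font>/';
// Lazy variant: [^>]* stays inside the opening tag and .*? stops at the
// first closing tag. The s flag lets . match newlines as well.
$lazy = '/<font\b[^>]*>(.*?)<\/font>/s';

preg_match($greedy, $str, $g);
preg_match($lazy, $str, $l);

echo $g[1], "\n"; // the greedy .* swallows the first element: prints "second"
echo $l[1], "\n"; // prints "first"
```

For the single-element string in the question both patterns return "get me"; the difference only shows up once a tag repeats.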
#2
9
Try this:
$str = '<option value="123">abc</option>
<option value="123">aabbcc</option>';
preg_match_all("#<option.*?>([^<]+)</option>#", $str, $foo);
print_r($foo[1]);
#3
8
In your pattern, you simply want to match all the text between the two tags. For that you can use, for example, [\w\W], which matches any character, including newlines.
function getTextBetweenTags($string, $tagname) {
$pattern = "/<$tagname>([\w\W]*?)<\/$tagname>/";
preg_match($pattern, $string, $matches);
return $matches[1];
}
#4
2
Since attribute values may contain a plain > character, try this regular expression:
$pattern = '/<'.preg_quote($tagname, '/').'(?:[^"\'>]+|"[^"]*"|\'[^\']*\')*>(.*?)<\/'.preg_quote($tagname, '/').'>/s';
But regular expressions are not suitable for parsing non-regular languages like HTML. You would be better off using a parser such as SimpleXML or DOMDocument.
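The parser route could look roughly like the sketch below. It assumes the DOM extension is available; `getTextByTag()` is a hypothetical helper name, and the nested `<b>` tag is added to the question's string only to show that nesting does not matter.

```php
<?php
// Sketch: collect the text of every <$tagname> element with DOMDocument.
// getTextByTag() is a hypothetical helper, not part of the original answer.
function getTextByTag(string $html, string $tagname): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them;
    // non-standard tags such as <textformat> would otherwise trigger warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName($tagname) as $node) {
        // textContent concatenates all descendant text, ignoring nested tags.
        $texts[] = $node->textContent;
    }
    return $texts;
}

$str = '<textformat leading="2"><p align="left"><font size="10">get <b>me</b></font></p></textformat>';
print_r(getTextByTag($str, 'font'));
```

Returning an array keeps the helper usable when the tag occurs more than once; take the first element if only one value is expected.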
#5
0
The following PHP snippet returns the text between HTML tags/elements.
A regex of the form "/tagname(.*)endtag/" captures the text between the tags.
For example:
$regex = '#\[start_tag_name\](.*?)\[/end_tag_name\]#s';
$content = '[start_tag_name]SOME TEXT[/end_tag_name]';
// The square brackets are escaped so they are not read as a character class.
// preg_match() fills $matches; $matches[1] holds the captured text.
preg_match($regex, $content, $matches);
echo $matches[1];
This prints "SOME TEXT".
Regards,
Web-Farmer @letsnurture.com
#6
0
$userinput = "http://www.example.vn/";
//$url = urlencode($userinput);
$input = @file_get_contents($userinput) or die("Could not access file: $userinput");
$regexp = "<tagname\s[^>]*>(.*)<\/tagname>";
//==Example:
//$regexp = "<div\s[^>]*>(.*)<\/div>";
if(preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
foreach($matches as $match) {
// $match[0] = the whole matched element
// $match[1] = the text between the tags
}
}
#7
0
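Try $pattern = '/<(' . preg_quote($tagname, '/') . ')\b.*?>(.*?)<\/\1>/';
and return $matches[2]. Use single quotes here (or write \\1): inside a double-quoted PHP string, \1 is an octal character escape rather than a backreference.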
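Put together as a complete function, the backreference idea might look like the sketch below. It reuses the question's `getTextBetweenTags()` name, swaps `.*?` for `[^>]*` inside the opening tag, and returns `null` when nothing matches; those last two choices are assumptions of this sketch rather than part of the answer.

```php
<?php
// Sketch of the backreference approach; mirrors the question's helper name.
function getTextBetweenTags(string $string, string $tagname): ?string
{
    // preg_quote() escapes regex metacharacters in the tag name;
    // '/' is passed because it is the pattern delimiter.
    $name = preg_quote($tagname, '/');
    // Group 1 captures the tag name, \1 requires the same name in the
    // closing tag, and group 2 captures the content in between.
    // The s flag lets . match newlines.
    $pattern = '/<(' . $name . ')\b[^>]*>(.*?)<\/\1>/s';

    if (preg_match($pattern, $string, $matches)) {
        return $matches[2];
    }
    return null; // no matching element found
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
echo getTextBetweenTags($str, 'font'), "\n";
var_dump(getTextBetweenTags($str, 'span'));
```

Checking the return value of `preg_match()` avoids the undefined-index notice that the original function raises when the tag is absent.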