What's the best/most efficient way to extract text between parentheses? Say I want to get the string "text" from the string "ignore everything except this (text)" as efficiently as possible.
So far, the best I've come up with is this:
$fullString = "ignore everything except this (text)";
$start = strpos('(', $fullString);
$end = strlen($fullString) - strpos(')', $fullString);
$shortString = substr($fullString, $start, $end);
Is there a better way to do this? I know in general using regex tends to be less efficient, but unless I can reduce the number of function calls, perhaps this would be the best approach? Thoughts?
5 Solutions
#1
100
I'd just use a regex and get it over with. Unless you are doing enough iterations that it becomes a huge performance issue, it's just easier to code (and to understand when you look back on it).
$text = 'ignore everything except this (text)';
preg_match('#\((.*?)\)#', $text, $match);
print $match[1];
#2
11
So, actually, the code you posted doesn't work: substr()'s parameters are $string, $start and $length, and strpos()'s parameters are $haystack, $needle. Slightly modified:
$str = "ignore everything except this (text)";
$start = strpos($str, '(');
$end = strpos($str, ')', $start + 1);
$length = $end - $start;
$result = substr($str, $start + 1, $length - 1);
Some subtleties: I used $start + 1 in the offset parameter in order to help PHP out while doing the strpos() search on the second parenthesis; we increment $start by one and reduce $length by one to exclude the parentheses from the match.
Also, there's no error checking in this code: you'll want to make sure that neither $start nor $end === false before performing the substr().
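The error checking described above could look roughly like this. It is only a sketch: the function name extractParenthesized and the choice to return null on failure are my own assumptions, not something from the answer, and the nullable type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (name and null-on-failure convention are assumptions).
function extractParenthesized(string $str): ?string
{
    $start = strpos($str, '(');
    if ($start === false) {
        return null; // no opening parenthesis at all
    }

    // Search for the closing parenthesis only after the opening one.
    $end = strpos($str, ')', $start + 1);
    if ($end === false) {
        return null; // opening parenthesis is never closed
    }

    // Skip the '(' itself and stop just before the ')'.
    return substr($str, $start + 1, $end - $start - 1);
}

var_dump(extractParenthesized("ignore everything except this (text)")); // string(4) "text"
var_dump(extractParenthesized("no parentheses here"));                   // NULL
```

The strict === false comparison matters because strpos() can legitimately return 0 when the parenthesis is the first character.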
As for using strpos/substr versus regex: performance-wise, this code will beat a regular expression hands down. It's a little wordier, though. I eat and breathe strpos/substr, so I don't mind this too much, but someone else may prefer the compactness of a regex.
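If you want to check the performance claim on your own machine rather than take it on faith, a rough timing harness along these lines should do. The function names, the iteration count and the use of hrtime() (PHP 7.3 or later) are all assumptions of this sketch, and the numbers will vary with PHP version and hardware.

```php
<?php
// Illustrative helper names; neither comes from the thread.
function viaStrpos(string $s): ?string
{
    $start = strpos($s, '(');
    if ($start === false) {
        return null;
    }
    $end = strpos($s, ')', $start + 1);
    if ($end === false) {
        return null;
    }
    return substr($s, $start + 1, $end - $start - 1);
}

function viaRegex(string $s): ?string
{
    return preg_match('#\((.*?)\)#', $s, $m) === 1 ? $m[1] : null;
}

// Returns the elapsed wall-clock time in milliseconds for $iterations calls.
function timeIt(callable $fn, string $input, int $iterations): float
{
    $t0 = hrtime(true); // nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return (hrtime(true) - $t0) / 1e6;
}

$input = "ignore everything except this (text)";
$n = 100000; // arbitrary iteration count

printf("strpos/substr: %.2f ms\n", timeIt('viaStrpos', $input, $n));
printf("preg_match:    %.2f ms\n", timeIt('viaRegex', $input, $n));
```

For a one-off extraction the difference is unlikely to matter either way; it only becomes interesting inside a tight loop.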
#3
6
Use a regular expression:
if( preg_match( '!\(([^\)]+)\)!', $text, $match ) )
$text = $match[1];
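One difference between this pattern and the '#\((.*?)\)#' pattern from answer #1 shows up on empty parentheses: the lazy .*? happily matches nothing, while [^\)]+ needs at least one character. A small sketch (the variable names and the sample input are made up for illustration):

```php
<?php
$lazy    = '#\((.*?)\)#';    // pattern from answer #1
$negated = '!\(([^\)]+)\)!'; // pattern from answer #3

$input = 'nothing inside () here';

var_dump(preg_match($lazy, $input, $m1));    // int(1)
var_dump($m1[1]);                            // string(0) ""
var_dump(preg_match($negated, $input, $m2)); // int(0)
```

Which behaviour you want depends on whether "()" should count as a match with an empty result or as no match at all.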
#4
3
This is sample code to extract all the text between '[' and ']' and store it in two separate arrays (i.e. the text inside the brackets in one array and the text outside the brackets in another):
function extract_text($string)
{
    $text_outside = array();
    $text_inside = array();
    $t = "";
    $len = strlen($string);
    for ($i = 0; $i < $len; $i++)
    {
        if ($string[$i] == '[')
        {
            // Store the text collected so far, then read the bracketed part.
            $text_outside[] = $t;
            $t = "";
            $t1 = "";
            $i++;
            // Stop at the closing bracket, or at the end of the string if it is missing.
            while ($i < $len && $string[$i] != ']')
            {
                $t1 .= $string[$i];
                $i++;
            }
            $text_inside[] = $t1;
        }
        else if ($string[$i] != ']')
        {
            $t .= $string[$i];
        }
    }
    if ($t != "")
        $text_outside[] = $t;
    var_dump($text_outside);
    echo "\n\n";
    var_dump($text_inside);
}
Output: extract_text("hello how are you?"); will produce:
array(1) {
[0]=>
string(18) "hello how are you?"
}
array(0) {
}
extract_text("hello [http://www.google.com/test.mp3] how are you?"); will produce
array(2) {
[0]=>
string(6) "hello "
[1]=>
string(13) " how are you?"
}
array(1) {
[0]=>
string(30) "http://www.google.com/test.mp3"
}
#5
1
This function may be useful.
function getStringBetween($str, $from, $to, $withFromAndTo = false)
{
    // Everything after the first occurrence of $from
    $sub = substr($str, strpos($str, $from) + strlen($from), strlen($str));
    // Cut at the last occurrence of $to
    if ($withFromAndTo)
        return $from . substr($sub, 0, strrpos($sub, $to)) . $to;
    else
        return substr($sub, 0, strrpos($sub, $to));
}

$inputString = "ignore everything except this (text)";
$outputString = getStringBetween($inputString, '(', ')');
echo $outputString;
// output will be: text
$outputString = getStringBetween($inputString, '(', ')', true);
echo $outputString;
// output will be: (text)
strpos() finds the position of the first occurrence of a substring in a string.
strrpos() finds the position of the last occurrence of a substring in a string.
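The difference between the two matters once the input has more than one pair of parentheses. A small sketch, with a made-up input string, of what each call returns and how that affects the extracted text:

```php
<?php
$s = "first (a) then (b)";

var_dump(strpos($s, ')'));  // int(8), the first ')'
var_dump(strrpos($s, ')')); // int(17), the last ')'

// Everything after the first '(' is "a) then (b)".
$sub = substr($s, strpos($s, '(') + 1);

// Cutting at the last ')' spans both groups...
var_dump(substr($sub, 0, strrpos($sub, ')'))); // string(10) "a) then (b"

// ...while cutting at the first ')' keeps only the first group.
var_dump(substr($sub, 0, strpos($sub, ')')));  // string(1) "a"
```

So a function that cuts at strrpos() returns everything from the first opening delimiter to the last closing one, which may or may not be what you want.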