I have a string like the one below. The letters in the example stand for values that could be numbers or text, in uppercase, lowercase, or both. If a value is a sentence, it is enclosed in single quotes:
$string="a,b,c,(d,e,f),g,'h, i j.',k";
How can I explode that to get the following result?
Array([0]=>"a",[1]=>"b",[2]=>"c",[3]=>"(d,e,f)",[4]=>"g",[5]=>"'h, i j.'",[6]=>"k")
I think regular expressions would be a fast and clean solution. Any ideas?
EDIT: This is what I have done so far. It is very slow for strings that have a long part between parentheses:
$separator="*"; // whatever which is not used in the string
$Pattern="'[^,]([^']+),([^']+)[^,]'";
while(ereg($Pattern,$String,$Regs)){
    $String=ereg_replace($Pattern,"'\\1$separator\\2'",$String);
}
$Pattern="\(([^(^']+),([^)^']+)\)";
while(ereg($Pattern,$String,$Regs)){
    $String=ereg_replace($Pattern,"(\\1$separator\\2)",$String);
}
return $String;
This replaces all the commas inside the quotes and between the parentheses. Then I can explode the string by commas and replace the $separator with the original comma.
1 answer
You can do the job using preg_match_all
$string="a,b,c,(d,e,f),g,'h, i j.',k";
preg_match_all('~\'[^\']++\'|\([^)]++\)|[^,]++~', $string,$result);
print_r($result[0]);
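To make the result concrete, here is a minimal sketch that wraps the same pattern in a function. The function name `split_top_level` is hypothetical and only used for illustration; the pattern itself is the one shown above.

```php
<?php
// Hypothetical helper name; the pattern is copied from the answer above.
function split_top_level(string $input): array
{
    // At each position the alternatives are tried from left to right:
    // a quoted run, a parenthesized run, then a plain run of non-commas.
    preg_match_all('~\'[^\']++\'|\([^)]++\)|[^,]++~', $input, $matches);
    return $matches[0]; // index 0 holds the full matches
}

$parts = split_top_level("a,b,c,(d,e,f),g,'h, i j.',k");
print_r($parts);
// Elements: a | b | c | (d,e,f) | g | 'h, i j.' | k
```

Because every alternative needs at least one character, empty fields are skipped: an input such as `a,,b` yields only two elements.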
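Explanation:
The trick is to try the quoted and parenthesized alternatives before the plain comma-free one, so a field that starts with a quote or a parenthesis is matched as a whole.
~            pattern delimiter
'            a literal single quote
[^']         any character except a single quote
++           one or more times, in possessive mode
'            a literal single quote
|            or
\([^)]++\)   the same idea with parentheses
|            or
[^,]         any character except a comma
++           one or more times, in possessive mode
~            pattern delimiter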
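The order of the alternatives matters. As a small sketch on a shortened input of my own choosing, compare the order used above with the reversed order:

```php
<?php
$string = "a,(d,e,f),g";

// Order used in the answer: the parenthesized alternative is tried first.
preg_match_all('~\([^)]++\)|[^,]++~', $string, $good);

// Reversed order: the comma-free run also matches at "(",
// so the parenthesized group is cut at every comma.
preg_match_all('~[^,]++|\([^)]++\)~', $string, $bad);

print_r($good[0]); // a | (d,e,f) | g
print_r($bad[0]);  // a | (d | e | f) | g
```

With the reversed order the second alternative is never reached, because `[^,]++` already succeeds wherever `\(` could.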
If you have more than one quote-like delimiter (where the opening and closing characters are the same), you can write the pattern like this, using a capture group:
$string="a,b,c,(d,e,f),g,'h, i j.',k,°l,m°,#o,p#,@q,r@,s";
preg_match_all('~([\'#@°]).*?\1|\([^)]++\)|[^,]++~', $string,$result);
print_r($result[0]);
Explanation:
(['#@°])   one character from the class, captured in group 1
.*?        any character, zero or more times, in lazy mode
\1         the content of group 1
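A short sketch of the backreference at work, on a shortened input. One assumption that is not in the answer: the source file and the input are UTF-8 encoded, so the `u` modifier is added here to make `°` count as a single character inside the class. With a single-byte encoding the modifier would have to be left out.

```php
<?php
// Assumption: this file is saved as UTF-8, hence the "u" modifier below.
$string = "a,'h, i j.',°l,m°,#o,p#,s";

// Group 1 captures the opening delimiter; \1 requires the same
// character again, so a field opened with # can only be closed by #.
preg_match_all('~([\'#°]).*?\1|[^,]++~u', $string, $result);

print_r($result[0]);
// Elements: a | 'h, i j.' | °l,m° | #o,p# | s
```

If an opening delimiter has no matching closing one, the first alternative fails at that position and the field falls back to `[^,]++`.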
With nested parentheses:
$string="a,b,(c,(d,(e),f),t),g,'h, i j.',k,°l,m°,#o,p#,@q,r@,s";
preg_match_all('~([\'#@°]).*?\1|(\((?>[^()]++|(?-1)?)*\))|[^,]++~', $string,$result);
print_r($result[0]);
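The nested case relies on recursion: `(?-1)` re-enters the nearest capture group opened before it, which is the group matching a balanced `( ... )`. The sketch below uses a trimmed variant of the pattern (single quotes as the only quote delimiter, and no `?` after `(?-1)`, since the surrounding `*` already makes it optional); that trimming is my simplification, not part of the answer.

```php
<?php
$string = "a,b,(c,(d,(e),f),t),g,'h, i j.',k";

// \( ... \)      an opening and a closing parenthesis around
// (?> ... )*     an atomic group, repeated, containing either
//   [^()]++        a run of characters that are not parentheses, or
//   (?-1)          a recursive match of the whole balanced group
preg_match_all('~\'[^\']++\'|(\((?>[^()]++|(?-1))*\))|[^,]++~', $string, $result);

print_r($result[0]);
// Elements: a | b | (c,(d,(e),f),t) | g | 'h, i j.' | k
```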