I'm using Clojure, so this is in the context of Java regexes.
Here is an example string:
{:a "ab,cd, efg", :b "ab,def, egf,", :c "Conjecture"}
The important bits are the commas after each string. I'd like to be able to replace them with newline characters with Java's replaceAll method. A regex that will match any comma that is not surrounded by quotes will do.
If I'm not coming across well, please ask and I'll happily clarify anything.
edit: sorry for the confusion in the title. I haven't been awake very long.
String: {:a "ab, cd efg",}
<-- In this example, the comma at the end would be matched, but the ones inside the quote would not.
String: {:a 3, :b 3,}
<-- Every single comma matches.
String: {:a "abcd,efg" :b "abcedg,e"}
<-- None of the commas match.
1 solution
The regex:
,\s*(?=([^"]*"[^"]*")*[^"]*$)
Matches:
{:a "ab,cd, efg", :b "ab,def, egf,", :c "Conjecture"}
                ^                  ^
and:
{:a "ab, cd efg",}
                ^
and does not match a comma in:
{:a "abcd,efg" :b "abcedg,e"}
But when escaped quotes can appear, like so:
{:a "ab,\" cd efg",} // only the last comma should match
then this regex no longer works: it counts the escaped quote like any other quote, which throws off the even/odd count.
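For the escaped-quote case, one option is to walk the string character by character instead of using a regex. The following is only a minimal sketch, not part of the original answer: the class name `QuoteAwareCommaReplacer` and the helper `commasToNewlines` are made up for illustration, and it assumes that a backslash inside a quoted string escapes the next character.

```java
class QuoteAwareCommaReplacer {
    // Hypothetical helper: replaces each comma outside double quotes (plus any
    // whitespace right after it) with a newline.
    // Assumption: inside quotes, a backslash escapes the following character.
    static String commasToNewlines(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean inQuotes = false;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (inQuotes && c == '\\' && i + 1 < input.length()) {
                // Copy the backslash and the escaped character verbatim,
                // so an escaped quote never toggles the quote state.
                out.append(c).append(input.charAt(i + 1));
                i += 2;
            } else if (c == '"') {
                inQuotes = !inQuotes;
                out.append(c);
                i++;
            } else if (c == ',' && !inQuotes) {
                out.append('\n');
                i++;
                // Skip whitespace after the comma, roughly like \s* in the regex.
                while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
                    i++;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input is {:a "ab,\" cd efg",} written as a Java string literal.
        System.out.println(commasToNewlines("{:a \"ab,\\\" cd efg\",}"));
    }
}
```

With this input, only the final comma is replaced; the comma before the escaped quote stays where it is.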
A brief explanation of the regex:
,                # match the character ','
\s*              # match a whitespace character: [ \t\n\x0B\f\r] and repeat it zero or more times
(?=              # start positive look ahead
  (              #   start capture group 1
    [^"]*        #     match any character other than '"' and repeat it zero or more times
    "            #     match the character '"'
    [^"]*        #     match any character other than '"' and repeat it zero or more times
    "            #     match the character '"'
  )*             #   end capture group 1 and repeat it zero or more times
  [^"]*          #   match any character other than '"' and repeat it zero or more times
  $              #   match the end of the input
)                # end positive look ahead
In other words: match any comma that is followed by an even number of quotes (zero included) between it and the end of the string.
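To see the regex used with replaceAll, here is a minimal Java sketch. The class name `CommaOutsideQuotes` and the helper `commasToNewlines` are made up for illustration; only the pattern comes from the answer. Inside a Java string literal, the backslash and the double quotes have to be escaped.

```java
import java.util.regex.Pattern;

class CommaOutsideQuotes {
    // The regex from the answer: a comma, optional whitespace, then a lookahead
    // requiring an even number of double quotes up to the end of the input.
    static final Pattern COMMA_OUTSIDE_QUOTES =
            Pattern.compile(",\\s*(?=([^\"]*\"[^\"]*\")*[^\"]*$)");

    // Hypothetical helper: replaces every comma outside quotes with a newline.
    static String commasToNewlines(String input) {
        return COMMA_OUTSIDE_QUOTES.matcher(input).replaceAll("\n");
    }

    public static void main(String[] args) {
        // The three sample strings from the question and answer.
        System.out.println(commasToNewlines(
                "{:a \"ab,cd, efg\", :b \"ab,def, egf,\", :c \"Conjecture\"}"));
        System.out.println(commasToNewlines("{:a \"ab, cd efg\",}"));
        System.out.println(commasToNewlines("{:a \"abcd,efg\" :b \"abcedg,e\"}"));
    }
}
```

Because `\s*` is part of the match, the space after each matched comma is removed along with it, so each new line starts directly at the next key. The third string is printed unchanged, since all of its commas sit inside quotes.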