In Java, I'm using the String split method to split a string containing values separated by semicolons.
Currently, I have the following line that works in 99% of all cases.
String[] fields = optionsTxt.split(";");
However, the requirement has been added to include escaped semicolons as part of the string. So, the following strings should parse out to the following values:
"Foo foo;Bar bar" => [Foo foo] [Bar bar]
"Foo foo\; foo foo;Bar bar bar" => [Foo foo\; foo foo] [Bar bar bar]
This should be painfully simple, but I'm totally unsure about how to go about it. I just want to split on a bare ; and leave an escaped \; untouched.
Does anyone out there know the magic formula?
4 Solutions
#1
2
Try this:
String[] fields = optionsTxt.split("(?<!\\\\);");
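As a minimal sketch of how the lookbehind pattern above behaves on the two sample strings from the question: the class name LookbehindSplitDemo and the method name splitUnescaped are made up for illustration and are not part of the original answer.

```java
import java.util.Arrays;

final class LookbehindSplitDemo {
    // Regex (?<!\\); : a semicolon that is NOT immediately preceded by a backslash.
    // Inside a Java string literal every regex backslash is written twice.
    static String[] splitUnescaped(String optionsTxt) {
        return optionsTxt.split("(?<!\\\\);");
    }

    public static void main(String[] args) {
        // Plain semicolon: splits into two fields.
        System.out.println(Arrays.toString(splitUnescaped("Foo foo;Bar bar")));
        // prints [Foo foo, Bar bar]

        // Escaped semicolon stays inside the first field (the backslash is kept).
        System.out.println(Arrays.toString(splitUnescaped("Foo foo\\; foo foo;Bar bar bar")));
        // prints [Foo foo\; foo foo, Bar bar bar]
    }
}
```

The lookbehind is zero-width, so only the semicolon itself is removed from the output.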
#2
1
There's probably a better way, but the quick-and-dirty method is to first replace \; with some string that won't appear in your input, like {{ESCAPED_SEMICOLON}}, then tokenize on ;. When you pull out each token, reverse the substitution to put the \; back.
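The replace, split, restore steps could be sketched roughly as follows. The class and method names are hypothetical, and the approach only holds under the assumption that the placeholder text never occurs in real input.

```java
import java.util.Arrays;

final class PlaceholderSplitDemo {
    // Assumption: this marker never appears in the data being parsed.
    private static final String PLACEHOLDER = "{{ESCAPED_SEMICOLON}}";

    static String[] splitWithPlaceholder(String optionsTxt) {
        // String.replace treats both arguments as literal text, not regex.
        String masked = optionsTxt.replace("\\;", PLACEHOLDER);

        // Only unescaped semicolons are left, so a plain split is enough.
        String[] fields = masked.split(";");

        // Put the escaped semicolons back into each token.
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].replace(PLACEHOLDER, "\\;");
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                splitWithPlaceholder("Foo foo\\; foo foo;Bar bar bar")));
        // prints [Foo foo\; foo foo, Bar bar bar]
    }
}
```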
#3
1
Using a regular expression (java.util.regex)
[^\\];
should be what you are looking for without doing a double replace.
try it out using a tool like this
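One caveat worth checking before passing the character-class pattern to split: [^\\]; matches two characters (a non-backslash character plus the semicolon), so the character before each delimiter is dropped from the result. The sketch below compares it with a zero-width lookbehind; the class and method names are invented for the comparison.

```java
import java.util.Arrays;

final class CharClassSplitDemo {
    // Regex [^\\]; : one non-backslash character followed by a semicolon.
    // Both characters are part of the match, so split() removes both.
    static String[] splitCharClass(String s) {
        return s.split("[^\\\\];");
    }

    // Regex (?<!\\); : the lookbehind only inspects the previous character
    // and consumes nothing, so only the semicolon is removed.
    static String[] splitLookbehind(String s) {
        return s.split("(?<!\\\\);");
    }

    public static void main(String[] args) {
        String input = "Foo foo;Bar bar";
        System.out.println(Arrays.toString(splitCharClass(input)));
        // prints [Foo fo, Bar bar]   (the last 'o' of "foo" is lost)
        System.out.println(Arrays.toString(splitLookbehind(input)));
        // prints [Foo foo, Bar bar]
    }
}
```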
#4
0
Using only your provided examples, you can use objects' code from above. If you want the split to happen only when there's an even number of backslashes before the semicolon (so that an escaped backslash followed by ; still splits), try this:
String[] fields = optionsTxt.split("((?<!\\\\)|(?<=[^\\\\](\\\\\\\\){0,15}));");
I've picked 15 arbitrarily, since a Java lookbehind needs a bounded length. Change it to a higher number if need be.
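To illustrate the even-versus-odd distinction, here is a small sketch that runs the pattern above on inputs with one, two, and three backslashes before the semicolon. The class name, method name, and sample inputs are made up for the demonstration.

```java
import java.util.Arrays;

final class EvenBackslashSplitDemo {
    // Same pattern as in the answer: split on ; when it is preceded by no
    // backslash, or by a non-backslash character plus 0..15 backslash pairs.
    static final String DELIMITER =
            "((?<!\\\\)|(?<=[^\\\\](\\\\\\\\){0,15}));";

    static String[] splitOnUnescaped(String optionsTxt) {
        return optionsTxt.split(DELIMITER);
    }

    public static void main(String[] args) {
        // One backslash (odd): the semicolon is escaped, no split.
        System.out.println(Arrays.toString(splitOnUnescaped("a\\;b")));
        // prints [a\;b]

        // Two backslashes (even): an escaped backslash, then a real delimiter.
        System.out.println(Arrays.toString(splitOnUnescaped("a\\\\;b")));
        // prints [a\\, b]

        // Three backslashes (odd): escaped backslash plus escaped semicolon.
        System.out.println(Arrays.toString(splitOnUnescaped("a\\\\\\;b")));
        // prints [a\\\;b]
    }
}
```

Note that the second alternative needs a non-backslash character before the backslash run, so a field that begins with backslashes at the very start of the string is not covered by this sketch.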