I have a string like this:
KEY1=Value1, KE_Y2=[V@LUE2A, Value2B], Key3=, KEY4=V-AL.UE4, KEY5={Value5}
I need to split it into a Map of key-value pairs. A value in [] should be kept as a single value (KE_Y2 is a key and [V@LUE2A, Value2B] is its value).
What regular expression should I use to split it correctly?
4 answers
#1
8
There's a magic regex for the first split:
String[] pairs = input.split(", *(?![^\\[\\]]*\\])");
Then split each key/value pair on "=":
for (String pair : pairs) {
    // A limit of 2 keeps the empty value of "Key3=" and any '=' inside a value.
    String[] parts = pair.split("=", 2);
    String key = parts[0];
    String value = parts[1];
}
Putting it all together:
Map<String, String> map = new HashMap<String, String>();
for (String pair : input.split(", *(?![^\\[\\]]*\\])")) {
    // A limit of 2 keeps the empty value of "Key3=" instead of dropping it.
    String[] parts = pair.split("=", 2);
    map.put(parts[0], parts[1]);
}
Voila!
Explanation of magic regex:
The regex says "a comma followed by any number of spaces (so key names don't have leading blanks), but only if the next bracket encountered is not a close bracket". In other words, the negative lookahead (?![^\[\]]*\]) rejects any comma that sits inside [ ], assuming the square brackets are balanced and not nested.
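The two splits above can be combined into one small, runnable sketch. The class name BracketAwareSplit and the method name parse are made up for illustration, and LinkedHashMap is used only so that the printed order follows the input; neither is part of the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BracketAwareSplit {
    // Hypothetical helper: splits on commas that are not inside [...],
    // then splits each pair on its first '='.
    static Map<String, String> parse(String input) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String pair : input.split(", *(?![^\\[\\]]*\\])")) {
            // Limit 2 keeps an empty value ("Key3=") as "" instead of dropping it.
            String[] parts = pair.split("=", 2);
            // A pair with no '=' at all is stored with an empty value (an assumption).
            map.put(parts[0], parts.length > 1 ? parts[1] : "");
        }
        return map;
    }

    public static void main(String[] args) {
        String input = "KEY1=Value1, KE_Y2=[V@LUE2A, Value2B], Key3=, KEY4=V-AL.UE4, KEY5={Value5}";
        parse(input).forEach((k, v) -> System.out.println(k + " -> '" + v + "'"));
    }
}
```

Note that this only handles one level of square brackets; nested brackets need a real parser rather than a lookahead.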
#2
4
How about this:
Map<String, String> map = new HashMap<String, String>();
Pattern regex = Pattern.compile(
    "(\\w+) # Match an alphanumeric identifier, capture in group 1\n" +
    "= # Match = \n" +
    "( # Match and capture in group 2: \n" +
    " (?: # Either... \n" +
    " \\[ # a [ \n" +
    " [^\\[\\]]* # followed by any number of characters except [ or ] \n" +
    " \\] # followed by a ] \n" +
    " | # or... \n" +
    " [^\\[\\],]* # any number of characters except commas, [ or ] \n" +
    " ) # End of alternation \n" +
    ") # End of capturing group",
    Pattern.COMMENTS);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    map.put(regexMatcher.group(1), regexMatcher.group(2));
}
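For anyone who wants to try this approach directly, here is a self-contained sketch of the same pattern written on one line, without Pattern.COMMENTS. The class name KeyValueMatcher and the method name parse are hypothetical, and LinkedHashMap is an assumption made only to keep the input order when printing.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyValueMatcher {
    // Group 1: the key (\w+). Group 2: either one [...] block,
    // or a run of characters containing no comma and no square bracket.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)=((?:\\[[^\\[\\]]*\\]|[^\\[\\],]*))");

    // Hypothetical helper: collects every key=value match into a map.
    static Map<String, String> parse(String subjectString) {
        Map<String, String> map = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(subjectString);
        while (matcher.find()) {
            map.put(matcher.group(1), matcher.group(2));
        }
        return map;
    }

    public static void main(String[] args) {
        String input = "KEY1=Value1, KE_Y2=[V@LUE2A, Value2B], Key3=, KEY4=V-AL.UE4, KEY5={Value5}";
        parse(input).forEach((k, v) -> System.out.println(k + " -> '" + v + "'"));
    }
}
```

Matching pairs instead of splitting means the empty value of Key3 needs no special handling: group 2 simply matches the empty string.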
#3
-1
Start with @achintya-jha's answer. When you split a String, you get an array (or something that acts like one), so you can iterate through the key/value pairs. A second split on each pair should then give you another array of size 2; use the first element as the key and the second as the value.
EDIT:
I couldn't find a useful link for what I meant (see the comments on the question) in Java (there are plenty for C/C++, though), so I wrote it:
// Requires: import java.util.HashMap; import java.util.Map; import java.util.Stack;
Map<String, String> map = new HashMap<String, String>();
String str = "KEY1=Value1, KE_Y2=[V@LUE2A, Value2B], Key3=, KEY4=V-AL.UE4, KEY5={Value5}";
final String openBrackets = "({[<";
final String closeBrackets = ")}]>";
String buffer = "";
int state = 0; // 0 = reading a key, 1 = reading a value
int i = 0;
Stack<Integer> stack = new Stack<Integer>(); // Indices of the currently open brackets
String key = "";
while (i < str.length()) {
    char c = str.charAt(i);
    // Skip whitespace, except inside brackets, where it belongs to the value
    if (stack.isEmpty() && " \t\n\r".indexOf(c) > -1) {
        ++i;
        continue;
    }
    switch (state) {
        // Reading key
        case 0:
            if (c != '=') {
                buffer += c;
            } else {
                // Go read a value.
                key = buffer;
                state = 1;
                buffer = "";
            }
            ++i;
            break;
        // Reading value
        case 1:
            // Opening bracket: remember it and keep it as part of the value
            int pos = openBrackets.indexOf(c);
            if (pos != -1) {
                stack.push(pos);
                buffer += c;
                ++i;
                break;
            }
            // Closing bracket: must match the most recently opened bracket
            pos = closeBrackets.indexOf(c);
            if (pos != -1) {
                if (stack.isEmpty()) {
                    throw new RuntimeException("Syntax error: Unmatched closing bracket '" + c + "'");
                }
                int pos2 = stack.pop();
                if (pos != pos2) {
                    throw new RuntimeException("Syntax error: Unmatched closing bracket, expected a '"
                            + closeBrackets.charAt(pos2) + "' got '" + c + "'");
                }
                buffer += c;
                ++i;
                break;
            }
            // A comma outside all brackets ends the current pair
            if (c == ',' && stack.isEmpty()) {
                // Put the pair in the map.
                map.put(key, buffer);
                // Go read a new key.
                state = 0;
                buffer = "";
                ++i;
                break;
            }
            // Anything else is part of the value
            buffer += c;
            ++i;
    } // switch
} // while
if (!stack.isEmpty()) {
    throw new RuntimeException("Syntax error: Unclosed bracket");
}
// The last pair has no trailing comma, so store it here
if (state == 1) {
    map.put(key, buffer);
}
#4
-2
- split the given string with String.split(",");
- Now split each element of the array with String.split("=");