I want to skip the first underscore when the number of underscores is more than 4. For now, a key will contain at most 5 underscores. I need to produce the output A_B, C, D, E, F, which I did using the code below. I would like a better solution. Please check and let me know. Thanks in advance.
String key = "A_B_C_D_E_F";
// StringUtils is assumed to be org.springframework.util.StringUtils
int occurrence = StringUtils.countOccurrencesOf(key, "_");
System.out.println(occurrence);
String[] keyValues = null;
if (occurrence == 5) {
    // Temporarily mask the first underscore so it survives the split, then restore it
    key = key.replaceFirst("_", "-");
    keyValues = StringUtils.tokenizeToStringArray(key, "_");
    keyValues[0] = keyValues[0].replaceFirst("-", "_");
} else {
    keyValues = StringUtils.tokenizeToStringArray(key, "_");
}
for (String keyValue : keyValues) {
    System.out.println(keyValue);
}
5 solutions
#1
1
You can use this regex to split:
String s = "A_B_C_D_E_F";
String[] list = s.split("(?<=_[A-Z])_");
Output:
[A_B, C, D, E, F]
The idea is to match only those underscores that are preceded by "_[A-Z]", which effectively skips only the first one. Note that the first underscore is kept regardless of how many underscores the key contains.
If the tokens between the underscores have a different format, replace [A-Z] with the appropriate regex.
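To make the note about replacing [A-Z] concrete, here is a minimal sketch. It assumes tokens are 1 to 10 letters or digits (an illustrative bound, not something stated in the question) and adds a count guard so the first underscore is kept only when there are more than 4 underscores. The class and method names are hypothetical.

```java
import java.util.Arrays;

class LookbehindSplitSketch {
    // Assumption: each token is 1 to 10 letters or digits; adjust the class and bound to your data.
    private static final String SKIP_FIRST = "(?<=_[A-Za-z0-9]{1,10})_";

    static String[] splitKey(String key) {
        // Count underscores without any library dependency.
        long underscores = key.chars().filter(c -> c == '_').count();
        // Keep the first underscore only when there are more than 4 of them.
        return underscores > 4 ? key.split(SKIP_FIRST) : key.split("_");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKey("key_part_x_y_z_w"))); // [key_part, x, y, z, w]
        System.out.println(Arrays.toString(splitKey("A_B_C_D")));          // [A, B, C, D]
    }
}
```

The bounded quantifier {1,10} keeps the lookbehind at a known maximum length.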
#2
2
Well, it is relatively "simple":
String str = "A_B_C_D_E_F_G";
String[] result = str.split("(?<!^[^_]*)_|_(?=(?:[^_]*_){0,3}[^_]*$)");
System.out.println(Arrays.toString(result));
Here is a version with comments for better understanding; it can also be used as is:
String str = "A_B_C_D_E_F_G";
String[] result = str.split("(?x) # enable embedded comments \n"
+ " # first alternative splits on all but the first underscore \n"
+ "(?<! # next character should not be preceded by \n"
+ " ^[^_]* # only non-underscores since beginning of input \n"
+ ") # so this matches only if there was an underscore before \n"
+ "_ # underscore \n"
+ "| # alternatively split if an underscore is followed by at most three more underscores to match the less than five underscores case \n"
+ "_ # underscore \n"
+ "(?= # preceding character must be followed by \n"
+ " (?:[^_]*_){0,3} # at most three groups of non-underscores and an underscore \n"
+ " [^_]*$ # only more non-underscores until end of line \n"
+ ")");
System.out.println(Arrays.toString(result));
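To see what each alternative contributes, the sketch below runs only the second alternative (the lookahead) on keys of different lengths. With 5 underscores it already gives the wanted result, but with 6 it leaves A_B_C unsplit, which is the case the first alternative covers. The class and method names are hypothetical.

```java
import java.util.Arrays;

class LookaheadAlternativeSketch {
    // Second alternative only: an underscore followed by at most three more underscores.
    private static final String LAST_FOUR = "_(?=(?:[^_]*_){0,3}[^_]*$)";

    static String[] splitOnLastFour(String key) {
        return key.split(LAST_FOUR);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnLastFour("A_B_C_D")));       // [A, B, C, D]
        System.out.println(Arrays.toString(splitOnLastFour("A_B_C_D_E_F")));   // [A_B, C, D, E, F]
        System.out.println(Arrays.toString(splitOnLastFour("A_B_C_D_E_F_G"))); // [A_B_C, D, E, F, G]
    }
}
```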
#3
0
You can use this regex based on \G and, instead of splitting, use matching:
String str = "A_B_C_D_E_F";
Pattern p = Pattern.compile("(^[^_]*_[^_]+|\\G[^_]+)(?:_|$)");
Matcher m = p.matcher(str);
List<String> resultArr = new ArrayList<>();
while (m.find()) {
resultArr.add( m.group(1) );
}
System.err.println(resultArr);
\G asserts the position at the end of the previous match, or at the start of the string for the first match.
Output:
[A_B, C, D, E, F]
RegEx demo
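As a rough illustration of what \G buys here, the sketch below wraps the same pattern in a helper and checks that every match starts exactly where the previous one ended. The class and method names are hypothetical. Note that this pattern always keeps the first underscore, so a count check would still be needed for short keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContiguousMatchSketch {
    private static final Pattern TOKEN = Pattern.compile("(^[^_]*_[^_]+|\\G[^_]+)(?:_|$)");

    static List<String> tokens(String key) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(key);
        int previousEnd = 0;
        while (m.find()) {
            // ^ anchors the first token and \G anchors the rest, so matches never skip input.
            if (m.start() != previousEnd) {
                throw new IllegalStateException("gap before index " + m.start());
            }
            out.add(m.group(1));
            previousEnd = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("A_B_C_D_E_F")); // [A_B, C, D, E, F]
        System.out.println(tokens("A_B_C_D"));     // [A_B, C, D]
    }
}
```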
#4
0
I would do it after the split.
public void test() {
    String key = "A_B_C_D_E_F";
    String[] parts = key.split("_");
    // More than 4 underscores means more than 5 parts
    if (parts.length > 5) {
        // Merge the first two parts back together with the underscore that separated them
        String[] newParts = new String[parts.length - 1];
        newParts[0] = parts[0] + "_" + parts[1];
        System.arraycopy(parts, 2, newParts, 1, parts.length - 2);
        parts = newParts;
    }
    System.out.println("parts = " + Arrays.toString(parts));
}
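The same split-then-merge idea can be written with a list instead of manual array copying. This is only a sketch: it assumes "more than 4 occurrences" means at least five underscores (six parts), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitThenMergeSketch {
    static List<String> splitKey(String key) {
        // Limit -1 keeps trailing empty tokens, so parts.size() - 1 is the underscore count.
        List<String> parts = new ArrayList<>(Arrays.asList(key.split("_", -1)));
        // Assumption: merge only when there are more than 4 underscores (more than 5 parts).
        if (parts.size() > 5) {
            // Operands are evaluated left to right: first token, then second token.
            String head = parts.remove(0) + "_" + parts.remove(0);
            parts.add(0, head);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitKey("A_B_C_D_E_F")); // [A_B, C, D, E, F]
        System.out.println(splitKey("A_B_C_D_E"));   // [A, B, C, D, E]
    }
}
```

No regex features beyond a literal underscore are involved, so the behavior does not depend on how a particular Java version treats lookbehind.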
#5
0
Although Java does not say so officially, you can use * and + in the lookbehind, as they are implemented as limiting quantifiers: * as {0,0x7FFFFFFF} and + as {1,0x7FFFFFFF} (see "Regex look-behind without obvious maximum length in Java"). So, if your strings are not too long, you can use
String key = "A_B_C_D"; // => [A, B, C, D]
//String key = "A_B_C_D_E_F"; // => [A_B, C, D, E, F]
String[] res = null;
// More than 4 underscores means more than 5 parts
if (key.split("_").length > 5) {
    // Split on every underscore except the first one
    res = key.split("(?<!^[^_]*)_");
} else {
    res = key.split("_");
}
System.out.println(Arrays.toString(res));
See the Java demo.
DISCLAIMER: Since this exploits a bug in the current Java 8 regex engine, the code may break in the future when the bug is fixed in Java.
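One way to avoid depending on that behavior is to give the lookbehind an explicit upper bound, which Java documents as supported. The sketch below assumes the first token is never longer than 100 characters (an arbitrary illustrative limit) and that "more than 4" refers to the underscore count. The class and method names are hypothetical.

```java
import java.util.Arrays;

class BoundedLookbehindSketch {
    // Assumption: the first token is at most 100 characters long.
    // Matches any underscore that is NOT preceded only by non-underscores since the start.
    private static final String AFTER_FIRST = "(?<!^[^_]{0,100})_";

    static String[] splitKey(String key) {
        long underscores = key.chars().filter(c -> c == '_').count();
        // Keep the first underscore only when there are more than 4 of them.
        return underscores > 4 ? key.split(AFTER_FIRST) : key.split("_");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKey("A_B_C_D")));     // [A, B, C, D]
        System.out.println(Arrays.toString(splitKey("A_B_C_D_E_F"))); // [A_B, C, D, E, F]
    }
}
```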