Splitting a string with regex in Java

Date: 2021-05-17 21:42:48

Would anyone be able to assist me with a regex?

I want to split the following string into a number, a string, and a number:

"810LN15"

One method needs to return 810, another needs to return LN, and a third should return 15.

The only real solution seems to be a regex, because the numbers will grow in length.

What regex can I use to accommodate this?

4 solutions

#1


17  

String.split with an ordinary delimiter pattern won't give you the desired result, which I guess would be "810", "LN", "15", since it looks for a token to split at and strips that token.

Try Pattern and Matcher instead, using this regex: (\d+)|([a-zA-Z]+). It matches any run of digits or any run of letters, so each call to find() yields one distinct number or text token (e.g. "AA810LN15QQ12345" would result in the tokens "AA", "810", "LN", "15", "QQ" and "12345").

Example:

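import java.util.LinkedList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("(\\d+)|([a-zA-Z]+)");
Matcher m = p.matcher("810LN15");
List<String> tokens = new LinkedList<String>();
while (m.find())
{
  // group() (group 0) is the entire match; group(1) would be null whenever the letters alternative matched
  String token = m.group();
  tokens.add(token);
}
// now iterate through 'tokens' and check whether you have a number or text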
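The closing comment above leaves the number-versus-text check to the reader. One way to do it, as a minimal sketch, is to look at which capture group took part in the match: group 1 is non-null for a run of digits, group 2 for a run of letters. The class name TokenClassifier, the method classify and the "NUM:"/"TXT:" prefixes are made up for this illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named here only for illustration.
class TokenClassifier {
    private static final Pattern TOKEN = Pattern.compile("(\\d+)|([a-zA-Z]+)");

    // Returns each token prefixed with its kind, e.g. "NUM:810" or "TXT:LN".
    static List<String> classify(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) {
                // The digits alternative matched.
                result.add("NUM:" + m.group(1));
            } else {
                // Otherwise the letters alternative matched.
                result.add("TXT:" + m.group(2));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(classify("810LN15")); // prints [NUM:810, TXT:LN, NUM:15]
    }
}
```

Characters that are neither digits nor ASCII letters are skipped by find(), so input such as "810-LN" would silently lose the hyphen.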

#2


10  

In Java, as in most regex flavors (Python before version 3.7 being a notable exception), the split() regex isn't required to consume any characters when it finds a match. Here I've used lookaheads and lookbehinds to match any position that has a digit on one side of it and a non-digit on the other:

String source = "810LN15";
String[] parts = source.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
System.out.println(Arrays.toString(parts));

Output:

[810, LN, 15]
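Since the question mentions that the numbers will grow in length, here is a small sketch that runs the same lookaround split on a longer input and converts the numeric parts. The class name LookaroundSplit, the method splitOnDigitBoundary and the second sample string are assumptions added for illustration.

```java
import java.util.Arrays;

// Hypothetical wrapper around the split shown above.
class LookaroundSplit {
    // Splits at every zero-width position between a digit and a non-digit.
    static String[] splitOnDigitBoundary(String source) {
        return source.split("(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnDigitBoundary("810LN15")));      // prints [810, LN, 15]
        System.out.println(Arrays.toString(splitOnDigitBoundary("123456LN7890"))); // prints [123456, LN, 7890]

        // The numeric parts are still strings; convert them when a number is needed.
        String[] parts = splitOnDigitBoundary("810LN15");
        int first = Integer.parseInt(parts[0]);
        int second = Integer.parseInt(parts[2]);
        System.out.println(first + second); // prints 825
    }
}
```

Note that \D matches any non-digit, so punctuation or spaces would end up in the text parts rather than being rejected.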

#3


7  

(\\d+)([a-zA-Z]+)(\\d+) should do the trick. The first capture group will be the first number, the second capture group will be the letters in between, and the third capture group will be the second number. The double backslashes are needed because the regex is written as a Java string literal.
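This answer comes without code, so here is a minimal sketch of how the three capture groups could back the three methods the question asks for. The class name CodeParts and the method names firstNumber, letters and secondNumber are hypothetical. It assumes the whole input has the number-letters-number shape and throws otherwise.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are chosen for this illustration only.
class CodeParts {
    private static final Pattern CODE = Pattern.compile("(\\d+)([a-zA-Z]+)(\\d+)");

    // Returns the requested capture group, or throws if the input has a different shape.
    private static String part(String code, int group) {
        Matcher m = CODE.matcher(code);
        if (!m.matches()) { // matches() requires the entire string to fit the pattern
            throw new IllegalArgumentException("Unexpected format: " + code);
        }
        return m.group(group);
    }

    static String firstNumber(String code)  { return part(code, 1); }
    static String letters(String code)      { return part(code, 2); }
    static String secondNumber(String code) { return part(code, 3); }

    public static void main(String[] args) {
        System.out.println(firstNumber("810LN15"));  // prints 810
        System.out.println(letters("810LN15"));      // prints LN
        System.out.println(secondNumber("810LN15")); // prints 15
    }
}
```

Because \d+ matches one or more digits, the same pattern keeps working as the numbers grow in length.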

#4


0  

This gives you exactly what you are looking for:

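        // One capturing group around both alternatives, so group(1) is never null.
        // The original pattern repeated the alternation a second time; that branch could never match.
        Pattern p = Pattern.compile("([a-zA-Z]+|\\d+)");
        Matcher m = p.matcher("810LN15");
        List<String> tokens = new LinkedList<String>();
        while (m.find())
        {
          String token = m.group(1);
          tokens.add(token);
        }
        System.out.println(tokens); // prints [810, LN, 15]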
