How to display only some of the characters in a string containing runs of consecutive characters?

Let's say I have this string: fffooooobbbbaarrr. Given a number N, I want to display at most N characters from each run of repeated characters.

If N=2, the output is ffoobbaarr

If N=3, the output is fffooobbbaarrr

If N=1, the output is fobar

And if N=0, the output is (empty)

As I'm learning regex, after some experimentation, I found that this works for N=2:

Pattern pattern = Pattern.compile("(\\w)\\1{2,}");
System.out.println(pattern.matcher(input).replaceAll("$1$1"));

Of course, this won't work for N=3, 4, etc. How can I fix this?

3 Solutions

#1

You can use this regex replacement:

int n = 3; // or whatever number
String repl = "";

if (n > 0) {
   repl = str.replaceAll("((\\S)\\2{" + (n-1) + "})\\2*", "$1");
}

Example: (for N=3)

RegEx Demo 1

Example: (for N=2)

RegEx Demo 2

Explanation:

(: Start capture group #1

(\S): Match one non-whitespace character and capture it as group #2

\2{2}: Match 2 more instances of the same character (the count is n-1, so 2 for N=3)

): End capture group #1

\2*: Match 0+ further instances of the same character, outside capture group #1

Code Demo
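
As a quick check of the replacement above, here is a minimal, self-contained sketch that wraps it in a helper and runs it for N = 0 through 3. The class name RunLimiterDemo and the method name limitRuns are made up for illustration and are not part of the answer; the sketch also assumes that any n below 1 should produce an empty string.

```java
class RunLimiterDemo {
    // Hypothetical helper: keeps at most n characters of each run of a
    // repeated non-whitespace character.
    static String limitRuns(String str, int n) {
        if (n <= 0) {
            return ""; // assumption: N=0 (or less) means nothing is displayed
        }
        // Group 1 captures the first n characters of a run;
        // the trailing \2* consumes whatever is left of that run.
        return str.replaceAll("((\\S)\\2{" + (n - 1) + "})\\2*", "$1");
    }

    public static void main(String[] args) {
        String input = "fffooooobbbbaarrr";
        for (int n = 0; n <= 3; n++) {
            System.out.println(n + " --> " + limitRuns(input, n));
        }
    }
}
```

Runs shorter than N never match the pattern, so they are simply left in place; that is why aa survives untouched for N=3. Because the pattern uses \S, runs of whitespace are not shortened.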

#2

You can use Pattern and Matcher like this:

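    // requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
    String text = "fffooooobbbbaarrr";
    // (.)\1* matches one character followed by any repetitions of it, i.e. one run
    Pattern pattern = Pattern.compile("(.)\\1*");
    Matcher matcher = pattern.matcher(text);
    StringBuilder result = new StringBuilder();
    int len = 3;
    while (matcher.find()) {
        String run = matcher.group();
        // Keep the first len characters of the run, or the whole run if it is shorter
        result.append(run, 0, Math.min(run.length(), len));
    }
    System.out.println(result);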

Result:

3 --> fffooobbbaarrr
2 --> ffoobbaarr
1 --> fobar
0 --> empty

The idea is:

Match each run of a repeated character with (.)\1*, i.e. any character followed by zero or more repetitions of itself.

Then check whether the length of the match is greater than or equal to your length; if so, use substring to keep only the length you want.

Otherwise, use the matched characters as they are.
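
To make the idea above concrete, the sketch below first lists the runs that (.)\1* finds and then truncates each of them. The class RunListingDemo and the helpers runs and truncateRuns are hypothetical names used only for this illustration. It assumes single-line input (by default . does not match line terminators) and a non-negative len.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RunListingDemo {
    // Hypothetical helper: returns every run of identical characters,
    // in order, as matched by (.)\1*
    static List<String> runs(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile("(.)\\1*").matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    // Hypothetical helper: keeps at most len characters of every run.
    static String truncateRuns(String text, int len) {
        StringBuilder sb = new StringBuilder();
        for (String run : runs(text)) {
            // Short runs are appended whole; long runs are cut to len characters
            sb.append(run, 0, Math.min(run.length(), len));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(runs("fffooooobbbbaarrr"));
        System.out.println(truncateRuns("fffooooobbbbaarrr", 2));
    }
}
```

Splitting the matching from the truncation makes it easy to see that the regex only finds the runs; the decision about how much of each run to keep is ordinary Java code.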

#3

Use the regex below as the search pattern:

(\\w)(\\1{N})\\1*

Breakdown:

(\w) Match a word character and capture it in capturing group 1

(\1{N}) Match the previously captured character N more times (capturing group 2)

\1* Match any number of further repetitions

N is the number of characters you need to retain (you can make it a variable; 0 results in an empty output). For the replacement, use:

$2

Regex live demo

Java code (demo):

String str = "fffooooobbbbaarrr";
int N = 3;
str = str.replaceAll("(\\w)(\\1{" + N + "})\\1*", "$2");
System.out.println(str); // fffooobbbaarrr
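
One detail worth checking with this pattern is what \w does and does not cover. The sketch below wraps the same replacement in a helper and tries it on a string that also contains punctuation. The class WordRunDemo, the method keepN and the test string aaa!!!! are invented for this illustration.

```java
class WordRunDemo {
    // Hypothetical helper wrapping the replacement from this answer.
    // A run is only rewritten when it has more than n characters:
    // one for (\w), n for (\1{n}), and the rest for \1*.
    static String keepN(String str, int n) {
        return str.replaceAll("(\\w)(\\1{" + n + "})\\1*", "$2");
    }

    public static void main(String[] args) {
        String input = "fffooooobbbbaarrr";
        for (int n = 0; n <= 3; n++) {
            System.out.println(n + " --> " + keepN(input, n));
        }
        // \w only matches letters, digits and underscore,
        // so the run of '!' is left exactly as it was.
        System.out.println(keepN("aaa!!!!", 1));
    }
}
```

If runs of punctuation or other symbols should be shortened too, \w would need to be swapped for something broader such as \S or a dot; with \w, N=0 empties the output only for input made up entirely of word characters.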
