In Perl, how can I detect whether a string contains a doubled digit more than once?

Time: 2021-01-16 19:24:32

I want to match 110110 but not 10110. That is, the string must contain at least two occurrences of a pair of identical consecutive digits. Is there a regex for that?

Should match: 110110, 123445446, 12344544644

Should not match: 10110, 123445

6 solutions

#1


/(\d)\1.*\1\1/

This matches a string with two instances of the same doubled digit, i.e. 11011 but not 10011.

\d matches any digit; \1 matches whatever the first capture group matched, which effectively doubles that digit.

This will also match 1111. If there must be other characters between the two pairs, change .* to .+

Ooh, this looks neater:

((\d)\2).*\1

If the two doubles do not have to be the same digit, but there must still be two sets of doubles, simply add the first part again, as in

((\d)\2).*((\d)\4)

The bracketing means that $1 and $3 contain the double digits, and $2 and $4 contain the single digits (which are then doubled).

11233

$1=11
$2=1
$3=33
$4=3
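
The two patterns above can be compared against the sample strings from the question with a short script. This is only a sketch: the helper names same_pair_twice and any_two_pairs are made up here for illustration and do not come from the answer.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this illustration.
# True (1) if the same doubled digit occurs twice, e.g. 11...11.
sub same_pair_twice { return $_[0] =~ /((\d)\2).*\1/ ? 1 : 0 }

# True (1) if any two doubled digits occur, e.g. 11...33.
sub any_two_pairs { return $_[0] =~ /((\d)\2).*((\d)\4)/ ? 1 : 0 }

for my $s (qw(110110 11233 10110 123445)) {
    printf "%-7s same=%d any=%d\n", $s, same_pair_twice($s), any_two_pairs($s);
}
```

With these inputs, 11233 is the case that separates the two patterns: it has two doubles, but they are different digits.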

#2


If I understand correctly, your regexp will be:

m{
  (\d)\1            # First repeated pair
  .*                # Anything in between
  (\d)\2            # Second repeated pair
}x

For example:

for my $x (qw(110110 123445446 12344544644 10110 123445)) {
    my $m = $x =~ m{(\d)\1.*(\d)\2} ? "matches" : "does not match";
    printf "%-11s : %s\n", $x, $m;
}

Output:

110110      : matches
123445446   : matches
12344544644 : matches
10110       : does not match
123445      : does not match

#3


If you're talking about all digits, this will do it:

00.*00|11.*11|22.*22|33.*33|44.*44|55.*55|66.*66|77.*77|88.*88|99.*99

It's just 10 different patterns OR'ed together, each of which checks for at least two occurrences of one specific doubled digit.

Using Perl's more advanced regexes, you can use the following to match the same two consecutive digits twice:

(\d)\1.*\1\1

or, as one of your comments states, two consecutive identical digits followed somewhere by two more consecutive identical digits, which need not be the same digit as the first pair:

(\d)\1.*(\d)\2
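
As a rough check that the long alternation and the back-reference version agree, the alternation can be generated rather than typed out. This is a sketch under that assumption; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Build 00.*00|11.*11|...|99.*99 from the digits 0..9.
my $alternation = join '|', map { my $pair = $_ x 2; "$pair.*$pair" } 0 .. 9;
my $long_re  = qr/$alternation/;
my $short_re = qr/(\d)\1.*\1\1/;

for my $s (qw(110110 123445446 10110 123445)) {
    my $long  = $s =~ $long_re  ? 1 : 0;
    my $short = $s =~ $short_re ? 1 : 0;
    print "$s: alternation=$long backreference=$short\n";
}
```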

#4


Depending on what your data looks like, here is a way that needs only a minimal regex.

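while (my $line = <DATA>) {
    chomp $line;
    # Split the line into whitespace-separated fields.
    my @fields = split ' ', $line;
    foreach my $field (@fields) {
        # Keep fields that contain 123445 but are not just that bare value.
        if ($field =~ /123445/ && length($field) != 6) {
            print "$line\n";
        }
    }
}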

__DATA__
  This is a line
  blah 123445446 blah
  blah blah 12344544644 blah
  .... 123445 ....
  this is last line

#5


There is no reason to do everything in one regex... You can use the rest of Perl as well:

#!/usr/bin/perl -l

use strict;
use warnings;

my @strings = qw( 11233 110110 10110 123445 123445446 12344544644 );

print if is_wanted($_) for @strings;

sub is_wanted {
    my ($s) = @_;
    my @matches = $s =~ /(?<group>(?<first>[0-9])\k<first>)/g;
    return 1 < @matches / 2;
}

__END__
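
The same counting idea can be sketched without named captures, using the list-assignment idiom to count matches. The helper name count_pairs is hypothetical, and note that matches do not overlap, so 111 counts as one pair and 1111 as two.

```perl
use strict;
use warnings;

# Hypothetical helper: count non-overlapping doubled digits in a string.
sub count_pairs {
    my ($s) = @_;
    # Assigning the match list to () in scalar context yields the match count.
    my $count = () = $s =~ /(\d)\1/g;
    return $count;
}

for my $s (qw(11233 110110 10110 123445 123445446)) {
    print "$s\n" if count_pairs($s) >= 2;
}
```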

#6


If I've understood your question correctly, then this will match 110110 but not 10110. (RegexBuddy, set to Perl syntax, accepted the shorter (1{2})0\10, but Perl itself reads \10 as an octal escape unless the pattern has at least ten capture groups, so the back-reference is written as \g{1} here.)

(1{2})0\g{1}0
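
How Perl treats a back-reference that is directly followed by a literal 0 can be checked with a small script. This is a minimal sketch, assuming Perl 5.10 or later for the \g{1} syntax; the variable names are illustrative.

```perl
use strict;
use warnings;

# With only one capture group, Perl reads \10 as the octal escape for chr(8).
my $octal = qr/(1{2})0\10/;
# \g{1} always means "capture group 1", so the trailing 0 stays a literal digit.
my $backref = qr/(1{2})0\g{1}0/;

for my $s ("110110", "10110") {
    my $with_octal = $s =~ $octal   ? "match" : "no match";
    my $with_g     = $s =~ $backref ? "match" : "no match";
    print "$s: \\10 gives $with_octal, \\g{1}0 gives $with_g\n";
}
```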

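The following is more general: it matches any string in which a pair of digits is repeated later in the string, with at least one digit in between. Note that \d{2} accepts any two digits, not only two equal ones, so it also matches strings such as 123123.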

(\d{2})\d+\1\d*

The above will match the following examples:

110110 110011 112345611 2200022345

Finally, to find two sets of double digits anywhere in a string, regardless of where they are, try this:

\d*?(\d{2})\d+?\1\d*

This will match the examples above plus this one:

12345501355789

It's the two sets of double 5 in the above example that are matched.

[Update] Having just seen your extra requirement of matching a string that contains two different double digits, try this:

\d*?(\d)\1\d*?(\d)\2\d*

This will match strings like the following:

12342202345567
12342202342267

Note that the 22 and the 55 cause the first string to match, and the two 22s cause the second string to match.
