I have this string
my $line = "MZEFSRGGRMEAZFE*MQZEFFMAEZF*";
and I want to find every substring that starts with M and ends with *, and add each one to an array. For the string above, that would give me 6 elements in my array.
I have this code
foreach ( $line =~ m/M.*?\*/g ) {
    push @ORF, $_;
}
but it only gives me two elements in my array since it ignores overlapping strings.
Is there any way to get all matches? I tried googling but could not find an answer.
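A minimal, self-contained reproduction of the behaviour described above may help. This is only a sketch: the array name @orf and the print statements are added for illustration, and the pattern is the one from the question.

```perl
use strict;
use warnings;

my $line = "MZEFSRGGRMEAZFE*MQZEFFMAEZF*";

# In list context, m//g returns every non-overlapping match. After each
# match the engine resumes scanning at the end of that match, so any
# candidate that starts inside an earlier match is never tried.
my @orf = $line =~ m/M.*?\*/g;

print scalar(@orf), " matches\n";
print "$_\n" for @orf;
```

Run as is, this reports two matches, MZEFSRGGRMEAZFE* and MQZEFFMAEZF*, rather than the six that are wanted.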
2 solutions
#1
4
You can use code within a regex, (?{ ... }), together with backtracking control verbs such as (*FAIL), for a little magic:
#!/usr/bin/env perl
use strict;
use warnings;
my $line = "MZEFSRGGRMEAZFE*MQZEFFMAEZF*";
local our @match;
$line =~ m/(M.*\*)(?{ push @match, $1 })(*FAIL)/;
use Data::Dump;
dd @match;
Outputs:
(
"MZEFSRGGRMEAZFE*MQZEFFMAEZF*",
"MZEFSRGGRMEAZFE*",
"MEAZFE*MQZEFFMAEZF*",
"MEAZFE*",
"MQZEFFMAEZF*",
"MAEZF*",
)
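To show why this yields every candidate, here is the same pattern spread out with the /x modifier and commented. It is a sketch of how the pieces appear to cooperate, not a different technique. The plain print loop replaces Data::Dump only to avoid a non-core module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $line = "MZEFSRGGRMEAZFE*MQZEFFMAEZF*";
our @match;    # package variable, visible inside the embedded code block

$line =~ m{
    ( M .* \* )               # greedy: from an M to the last reachable star
    (?{ push @match, $1 })    # record the current candidate
    (*FAIL)                   # reject it, forcing the engine to backtrack
}x;

print "$_\n" for @match;
```

Because (*FAIL) never lets the overall match succeed, the engine keeps backtracking to shorter endings and then moves on to later starting positions, and the embedded code block runs once for every candidate it visits along the way.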
#2
1
I don't believe it's possible to create a single regex pattern that will match all such substrings, because you're asking for both a greedy and a non-greedy match at the same time, plus everything in between.
I suggest you store all possible start and end positions of these substrings, then use a double loop to combine every start position with every end position that follows it.
This program demonstrates the approach:
use strict;
use warnings 'all';
use feature 'say';

my $line = 'MZEFSRGGRMEAZFE*MQZEFFMAEZF*';
my @orf;

{
    my (@s, @e);

    push @s, $-[0] while $line =~ /M/g;
    push @e, $+[0] while $line =~ /\*/g;

    for my $s ( @s ) {
        for my $e ( @e ) {
            push @orf, substr $line, $s, $e-$s if $e > $s;
        }
    }
}

say for @orf;
Output:
MZEFSRGGRMEAZFE*
MZEFSRGGRMEAZFE*MQZEFFMAEZF*
MEAZFE*
MEAZFE*MQZEFFMAEZF*
MQZEFFMAEZF*
MAEZF*
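The same double loop can be wrapped in a subroutine if it is needed for more than one string. This is a sketch only: the name spans_between and its parameters are hypothetical, not part of the answer above. It assumes the start and end patterns cannot overlap each other, as is the case for M and *.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): returns every substring
# of $text that begins at a match of $start and ends at a later match of $end.
sub spans_between {
    my ($text, $start, $end) = @_;
    my (@s, @e);

    push @s, $-[0] while $text =~ /$start/g;    # offsets where a start begins
    push @e, $+[0] while $text =~ /$end/g;      # offsets just past each end

    my @found;
    for my $s (@s) {
        for my $e (@e) {
            # Keep only the pairs where the end comes after the start.
            push @found, substr($text, $s, $e - $s) if $e > $s;
        }
    }
    return @found;
}

my @orf = spans_between('MZEFSRGGRMEAZFE*MQZEFFMAEZF*', qr/M/, qr/\*/);
print "$_\n" for @orf;
```

The results come back grouped by start position, shortest first, which is the same order as the output listed above.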