I would like to retrieve the elements of @amplicon_exon that contain a string similar to an element of @failedamplicons. Each element in @failedamplicons is unique and can match only a single element of @amplicon_exon. I've tried two nested for loops, but I get repeated values. Is there a better way to find and retrieve the matching values from the two arrays?
Example of @failedamplicons:
OCP1_FGFR3_8.87
OCP1_AR_14.89
Example of @amplicon_exon:
TEST_Focus_ERBB2_2:22:ERBB2:GENE_ID=ERBB2;PURPOSE=CNV,Hotspot;CNV_ID=ERBB2;CNV_HS=1
OCP1_FGFR3_8:intron:FGFR3:GENE_ID=FGFR3;PURPOSE=CNV;CNV_ID=FGFR3;CNV_HS=1
OCP1_CDK6_14:intron:CDK6:GENE_ID=CDK6;PURPOSE=CNV;CNV_ID=CDK6;CNV_HS=1
Here is the code with the two for loops:
my $i = 0;
my $j = 0;
for ( $i = 0; $i < @amplicon_exon; $i++ ) {
    for ( $j = 0; $j < @failedamplicons; $j++ ) {
        my $fail_amp = ( split /\./, $failedamplicons[$j] )[0];
        #print "the failed amp before match is $fail_amp\n";
        if ( index( $amplicon_exon[$i], $fail_amp ) != -1 ) {
            #print "the amplicon exon that matches $amplicon_exon[$i] and sample is $sample_id\n";
            print "the failed amp that matches $fail_amp and sample is $sample_id\n";
            my @parts = split /:/, $amplicon_exon[$i];
            my $exon_amp = $parts[1];
            next unless $parts[3] =~ /Hotspot/;    # includes only Hotspot amplicons
            my $gene_res = $parts[2];
            my $depth = ( split /\./, $failedamplicons[$j] )[1];
            my @total_amps = (
                $run_name, $sample_id, $gene_res, $depth, $fail_amp, $run_date, $matrix_status
            );
            my $lines = join "\t", @total_amps;
            push( @finallines, $lines );
        }
    }
}
1 Solution
#1
split and grep are your friends, as is the idiomatic approach to iterating over a list. Simply iterate over the first array and extract just the part you want to match on (use split to split the element on the . character, then take only the first field). Then use grep with a regex to look for that part in the second array, anchored at the beginning of each element and running up to the first : separator:
for my $elem (@failedamplicons){
    my $to_match = (split /\./, $elem)[0];
    if (my ($matched) = grep {$_ =~ /^\Q$to_match:/} @amplicon_exon){
        print "$matched\n";
    }
}
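
If @amplicon_exon is large, a hash lookup may be worth considering instead of running grep once per failed amplicon. The following is only a sketch, assuming the text before the first : in each @amplicon_exon element is the amplicon ID and that these IDs are unique; the names %exon_for and @matches are illustrative and do not come from the question. Because the lookup is an exact match on the ID, it also avoids the partial matches that index() can produce when one ID happens to be a prefix of another.

```perl
use strict;
use warnings;

# Sample data copied from the question.
my @failedamplicons = ('OCP1_FGFR3_8.87', 'OCP1_AR_14.89');
my @amplicon_exon = (
    'TEST_Focus_ERBB2_2:22:ERBB2:GENE_ID=ERBB2;PURPOSE=CNV,Hotspot;CNV_ID=ERBB2;CNV_HS=1',
    'OCP1_FGFR3_8:intron:FGFR3:GENE_ID=FGFR3;PURPOSE=CNV;CNV_ID=FGFR3;CNV_HS=1',
    'OCP1_CDK6_14:intron:CDK6:GENE_ID=CDK6;PURPOSE=CNV;CNV_ID=CDK6;CNV_HS=1',
);

# Index @amplicon_exon by the text before the first ':'
# (assumed here to be the amplicon ID).
my %exon_for;
for my $entry (@amplicon_exon) {
    my ($id) = split /:/, $entry, 2;
    $exon_for{$id} //= $entry;    # keep the first entry if an ID repeats
}

# Look up each failed amplicon by its exact ID.
my @matches;
for my $elem (@failedamplicons) {
    my ($id, $depth) = split /\./, $elem, 2;
    next unless exists $exon_for{$id};
    push @matches, join "\t", $id, $depth, $exon_for{$id};
}

print "$_\n" for @matches;
```

With the sample data, only OCP1_FGFR3_8 has a counterpart in @amplicon_exon, so a single tab-separated line is printed. The Hotspot filter and the extra output columns from the question could be applied inside the second loop in the same way as in the original code.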