I have a text file in the following format:
211B1 CUSTOMER|UPDATE|
211B2 CUSTOMER|UPDATE|
211B3 CUSTOMER|UPDATE|
211B4 CUSTOMER|UPDATE|
211B5 CUSTOMER|UPDATE|
567FR CUSTOMER|DELETE|
647GI CUSTOMER|DELETE|
I want a script that processes the text file and reports the following:
- "UPDATE" for column CUSTOMER found for Acct's: 211B1,211B2,211B3,211B4,211B5
- "DELETE" for column CUSTOMER found for Acct's: 5675FR,6470GI
I can script simple solutions, but this seems a little complex to me, and I would appreciate assistance or guidance.
5 Answers
#1
collate.pl
#!/usr/bin/perl
use strict;
my %actions;
while (<>) {
    my ($key, $fld, $action) = /^(\w+) (.+?)\|(.+?)\|/ or die "Failed on line $.!";
    push @{$actions{$action}{$fld}}, $key;
}
foreach my $action (keys %actions) {
    foreach my $fld (keys %{$actions{$action}}) {
        print "\"$action\" for column $fld found for Acct's: " . join(",", @{$actions{$action}{$fld}}), "\n";
    }
}
Use like so:
perl collate.pl < input.txt > output.txt
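One caveat with the loop above: Perl does not return hash keys in any fixed order, so the report lines may come out in a different order on each run. A minimal tweak (just a sketch of the same report loop, nothing the data requires) is to sort the keys:
foreach my $action (sort keys %actions) {
    foreach my $fld (sort keys %{$actions{$action}}) {
        # same print as above, now in a stable, alphabetical order
        print "\"$action\" for column $fld found for Acct's: " . join(",", @{$actions{$action}{$fld}}), "\n";
    }
}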
#2
With awk:
echo '211B1 CUSTOMER|UPDATE|
211B2 CUSTOMER|UPDATE|
211B3 CUSTOMER|UPDATE|
211B4 CUSTOMER|UPDATE|
211B5 CUSTOMER|UPDATE|
567FR CUSTOMER|DELETE|
647GI CUSTOMER|DELETE|' | awk -F '[ |]' '
BEGIN {
    upd = ""; del = ""
} {
    if ($3 == "UPDATE") { upd = upd " " $1 }
    if ($3 == "DELETE") { del = del " " $1 }
} END {
    print "Updates:" upd; print "Deletes:" del
}'
produces:
Updates: 211B1 211B2 211B3 211B4 211B5
Deletes: 567FR 647GI
It basically just breaks each line into three fields (with the -F option) and maintains a list of updates and a list of deletes, appending to one or the other depending on the "command". The BEGIN and END blocks run before and after all line processing, so they handle the initialization and the final output.
I'd put it into a script to make it easier; I left it as a command-line invocation here simply because that's how I usually debug my awk scripts.
#3
#!/usr/bin/perl
use strict;
use warnings;
my %data;
while ( my $line = <DATA> ) {
    next unless $line =~ /\S/;
    my ( $acct, $col, $action ) = split /\s|\|/, $line;
    push @{ $data{$action}->{$col} }, $acct;
}
for my $action ( keys %data ) {
    for my $col ( keys %{ $data{$action} } ) {
        print qq{"$action" for column $col found for acct's: },
            join( q{,}, @{ $data{$action}->{$col} } ), "\n";
    }
}
__DATA__
211B1 CUSTOMER|UPDATE|
211B2 CUSTOMER|UPDATE|
211B3 CUSTOMER|UPDATE|
211B4 CUSTOMER|UPDATE|
211B5 CUSTOMER|UPDATE|
567FR CUSTOMER|DELETE|
647GI CUSTOMER|DELETE|
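This version reads its sample records from the __DATA__ section. To point it at a real file instead, one option (a minimal sketch, assuming the file name is passed on the command line) is to read from the diamond operator:
# Sketch: same loop as above, but reading from files named on the
# command line (or STDIN), e.g.  perl script.pl input.txt
while ( my $line = <> ) {
    next unless $line =~ /\S/;
    my ( $acct, $col, $action ) = split /\s|\|/, $line;
    push @{ $data{$action}->{$col} }, $acct;
}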
#4
Another awk version, though it reverses the order of the account values and leaves an extra "," at the end of each line:
BEGIN { FS = "[ |]" }
{
    key = $3 " for column " $2
    MAP[key] = $1 "," MAP[key]
}
END {
    for (item in MAP) {
        print item " found for Acct's: " MAP[item]
    }
}
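If the reversed order and the trailing comma matter, a small variant (a sketch, not the poster's original) appends instead of prepending and only adds a comma between values:
BEGIN { FS = "[ |]" }
{
    key = $3 " for column " $2
    # append in input order; add a comma only when something is already stored
    if (key in MAP)
        MAP[key] = MAP[key] "," $1
    else
        MAP[key] = $1
}
END {
    for (item in MAP)
        print item " found for Acct's: " MAP[item]
}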
#5
Based on your question, you could do this:
perl -i.bak -pe'if(/^211B[1-5]/){s/CUSTOMER/UPDATE/}elsif(/^(5675FR|6470GI)/){s/CUSTOMER/DELETE/}' filename
Though I notice now that the last two account numbers differ in the example, and also that the second column already has those values...