Advance New Year wishes to all.
I have an error log file whose contents follow a pattern of parameter, result, and stderr (stderr can span multiple lines).
$ cat error_log
<parameter>:test_tot_count
<result>:1
<stderr>:Expected "test_tot_count=2" and the actual value is 3
test_tot_count = 3
<parameter>:test_one_count
<result>:0
<stderr>:Expected "test_one_count=2" and the actual value is 0
test_one_count = 0
<parameter>:test_two_count
<result>:4
<stderr>:Expected "test_two_count=2" and the actual value is 4
test_two_count = 4
...
I need to write a function in Perl to store each parameter, result, and stderr in an array or hash table.
This is our own internally defined structure. I wrote the Perl function like this. Is there a better way of doing this using regular expressions?
my $err_msg = "";
while (<ERR_LOG>)
{
    if (/<parameter>:/)
    {
        s/<parameter>://;
        push @parameter, $_;
    }
    elsif (/<result>:/)
    {
        s/<result>://;
        push @result, $_;
    }
    elsif (/<stderr>:/)
    {
        if (length($err_msg) > 0)
        {
            push @stderr, $err_msg;
        }
        s/<stderr>://;
        $err_msg = $_;
    }
    else
    {
        $err_msg .= $_;
    }
}
if (length($err_msg) > 0)
{
    push @stderr, $err_msg;
}
3 solutions
#1
If you're using Perl 5.10 or later, you can do something very similar to what you have now but with a much nicer layout by using the given/when structure:
use 5.010;
while (<ERR_LOG>) {
    chomp;
    given ($_) {
        when ( m{^<parameter>: (.*)}x ) { push @parameter, $1 }
        when ( m{^<result>:    (.*)}x ) { push @result,    $1 }
        when ( m{^<stderr>:    (.*)}x ) { push @stderr,    $1 }
        default                         { $stderr[-1] .= "\n$_" }
    }
}
It's worth noting that for the default case here, rather than keeping a separate $err_msg variable, I'm simply pushing onto @stderr when I see a stderr tag, and appending to the last item of the @stderr array if I see a continuation line. I'm adding a newline when I see continuation lines, since I assume you want them preserved.
Although the above code looks quite elegant, I'm not all that fond of keeping three separate arrays: it will cause you headaches if they ever get out of sync, and if you want to add more fields in the future you'll end up with lots of variables to keep track of. I'd suggest storing each record inside a hash, and then keeping an array of records:
use 5.010;
my @records;
my $prev_key;
while (<ERR_LOG>) {
    chomp;
    given ($_) {
        when ( m{^<parameter> }x )   { push(@records, {}); continue; }
        when ( m{^<(\w+)>: (.*)}x )  { $records[-1]{$1} = $2; $prev_key = $1; }
        default                      { $records[-1]{$prev_key} .= "\n$_"; }
    }
}
Here we're pushing a new record onto the array when we see a <parameter> field, adding an entry to the current hash whenever we see a key/value pair, and appending to the last field we added if we see a continuation line. The end result of @records looks like this:
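(
    {
        parameter => 'test_tot_count',
        result    => 1,
        stderr    => qq{Expected "test_tot_count=2" and the actual value is 3\ntest_tot_count = 3},
    },
    {
        parameter => 'test_one_count',
        result    => 0,
        stderr    => qq{Expected "test_one_count=2" and the actual value is 0\ntest_one_count = 0},
    },
    {
        parameter => 'test_two_count',
        result    => 4,
        stderr    => qq{Expected "test_two_count=2" and the actual value is 4\ntest_two_count = 4},
    }
)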
Now you can pass just a single data structure around which contains all of your records, and you can add more fields in the future (even multi-line ones) and they'll be correctly handled.
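As a rough illustration of passing that single structure around, here is a minimal sketch. The helper name index_by_parameter is made up for this example, and the sample records are simply the ones from the log in the question; it assumes each record has a parameter key.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a reference to the array of record hashes and
# returns a hash reference keyed by each record's parameter name.
sub index_by_parameter {
    my ($records) = @_;
    my %by_name = map { ( $_->{parameter} => $_ ) } @$records;
    return \%by_name;
}

# Sample data mirroring the structure shown above.
my @records = (
    {
        parameter => 'test_one_count',
        result    => 0,
        stderr    => qq{Expected "test_one_count=2" and the actual value is 0\ntest_one_count = 0},
    },
    {
        parameter => 'test_two_count',
        result    => 4,
        stderr    => qq{Expected "test_two_count=2" and the actual value is 4\ntest_two_count = 4},
    },
);

# Only one reference needs to be handed to the helper.
my $by_name = index_by_parameter( \@records );
print "$by_name->{test_two_count}{result}\n";    # prints 4
```

Because the lookup hash points at the same record hashes, any field added to a record later is visible through it as well.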
If you're not using Perl 5.10, then this may be a good excuse to upgrade. If not, you can translate the given/when structures into more traditional if/elsif/else structures, but they lose much of their beauty in the conversion.
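For reference, the if/elsif translation of the record-based version might look roughly like the sketch below. It is also a reasonable fallback on newer Perls, where given/when has been marked experimental since 5.18. The sub name parse_error_log and the use of a lexical filehandle are choices made for this example, not part of the original code, and skipping text that appears before the first tag is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: reads tagged lines from a filehandle and returns a
# reference to an array of record hashes.
sub parse_error_log {
    my ($fh) = @_;
    my ( @records, $prev_key );

    while ( my $line = <$fh> ) {
        chomp $line;

        if ( $line =~ m{^<(\w+)>:(.*)} ) {
            my ( $key, $value ) = ( $1, $2 );

            # A <parameter> tag starts a new record.
            push @records, {} if $key eq 'parameter' || !@records;
            $records[-1]{$key} = $value;
            $prev_key = $key;
        }
        elsif ( defined $prev_key ) {
            # Continuation line: append to the most recently seen field.
            $records[-1]{$prev_key} .= "\n$line";
        }
        # Assumption: untagged lines before the first tag are skipped.
    }

    return \@records;
}

# Usage with an in-memory filehandle holding the first record from the log.
my $log = <<'END';
<parameter>:test_tot_count
<result>:1
<stderr>:Expected "test_tot_count=2" and the actual value is 3
test_tot_count = 3
END

open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
my $records = parse_error_log($fh);
close $fh;

print "$records->[0]{parameter}\n";    # prints test_tot_count
```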
Paul
#2
The main thing that jumps out for refactoring is the repetition in the matching, stripping, and storing. Something like this (untested) code is more concise:
my ( $err_msg, %data );
while (<ERR_LOG>) {
    # s/// returns the number of substitutions, not the captures,
    # so the tag name has to be read from $1 afterwards
    if ( s/^<(parameter|result|stderr)>:// ) {
        my $key = $1;
        if ( $key eq 'stderr' ) {
            push @{ $data{$key} }, $err_msg if defined $err_msg;
            $err_msg = $_;
        }
        else { push @{ $data{$key} }, $_ }
    }
    else { $err_msg .= $_ }
}
# grab the last err_msg out of the hopper
push @{ $data{stderr} }, $err_msg if defined $err_msg;
... but it may be harder to understand six months from now... 8^)
#3
Looks nice. =) An improvement is probably to anchor those tags at the beginning of the line:
if (/^<parameter>:/)
It'll make the script a bit more robust.
You can also avoid the stripping of the tag if you catch what's after it and use only that part:
if (/^<parameter>:(.*)/s)
{
    push @parameter, $1;
}
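To see why the anchor matters, here is a small sketch comparing the two patterns. The third input line is a made-up continuation line that happens to mention a tag in the middle of its text, and collect_parameters is a helper invented for this example. Unlike the snippet above, the patterns here omit /s, so the trailing newline is not captured.

```perl
use strict;
use warnings;

# Hypothetical input: the last line is stderr continuation text that
# contains "<parameter>:" somewhere other than the start of the line.
my @lines = (
    "<parameter>:test_one_count\n",
    "<stderr>:Lookup failed\n",
    "see <parameter>:test_two_count for details\n",
);

# Hypothetical helper: returns the captured text of every line that
# matches the given pattern.
sub collect_parameters {
    my ( $regex, @input ) = @_;
    my @parameter;
    for my $line (@input) {
        push @parameter, $1 if $line =~ $regex;
    }
    return @parameter;
}

my @unanchored = collect_parameters( qr/<parameter>:(.*)/,  @lines );
my @anchored   = collect_parameters( qr/^<parameter>:(.*)/, @lines );

# The unanchored pattern also picks up the continuation line.
print scalar(@unanchored), " unanchored, ", scalar(@anchored), " anchored\n";
# prints "2 unanchored, 1 anchored"
```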