I have some files containing the data below.
Sample file 1:
sitename1,2009-07-19,"A1",11975,17.23
sitename1,2009-07-19,"A2",11,0.02
sitename1,2009-07-20,"A1",2000,17.23
sitename1,2009-07-20,"A2",538,0.02
I want to pivot the values in column 4 against columns 2 and 3, so that each date becomes its own column, as shown below.
Required output:
Site,Type,2009-07-19,2009-07-20
sitename1,"A1",11975,2000
sitename1,"A2",11,538
Here is what I have tried so far:
#! /usr/bin/perl -w
use strict;
use warnings;
my $column_header=["Site,Type"];
my $position={};
my $last_position=0;
my $current_event=[];
my $events=[];
while (<STDIN>) {
    my ($site,$date,$type,$value,$percent) = split /[,\n]/, $_;
    my $event_key = $date;
    if (not defined $position->{$event_key}) {
        $last_position+=1;
        $position->{$event_key}=$last_position;
        push @$column_header,$event_key;
    }
    my $pos = $position->{$event_key};
    if (defined $current_event->[$pos]) {
        dumpEvent();
    }
    if (not defined $current_event->[0]) {
        $current_event->[0]="$site,$type";
    }
    $current_event->[$pos]=$value;
}
dumpEvent();
my $order = [];
for (my $scan=0; $scan<scalar(@$column_header); $scan++) {
    push @$order,$scan;
}
printLine($column_header);
map { printLine($_) } @$events;
sub printLine {
    my $record=shift;
    my @result=();
    foreach my $offset (@$order) {
        if (defined $record->[$offset]) {
            push @result,$record->[$offset];
        } else {
            push @result,"";
        }
    }
    print join(",",@result)."\n";
}
sub dumpEvent {
    return unless defined $current_event->[0];
    push @$events,$current_event;
    $current_event=[];
}
The output I am getting is below:
Site,Type,2009-07-19,2009-07-20
sitename1,"A1",11975,
sitename1,"A2",11,
sitename1,"A1",,14620
sitename1,"A2",,538
2 solutions
#1
The following code produces the expected result and makes "some" sense. I don't know if it makes real sense.
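use strict;
use warnings;
my %dates;      # every date seen, used for the header row
my %SiteType;   # "site,type" => values in the order they were read
while (<DATA>) {
    chomp;
    my ($site, $date, $type, $value, $percent) = split /,/;
    $dates{$date} = 1;
    # Values line up with the date columns only if every site/type pair
    # has a row for every date and the rows arrive in date order.
    push @{ $SiteType{"$site,$type"} }, $value;
}
print 'Site,Type,', join(',', sort keys %dates), "\n";
foreach (sort keys %SiteType) {
    print $_, ',', join(',', @{ $SiteType{$_} }), "\n";
}
__DATA__
sitename1,2009-07-19,"A1",11975,17.23
sitename1,2009-07-19,"A2",11,0.02
sitename1,2009-07-20,"A1",2000,17.23
sitename1,2009-07-20,"A2",538,0.02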
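The code in this answer pushes values in the order they are read, so a site/type pair with no row for some date would shift its later values one column to the left. Here is a minimal sketch of one way around that: store each value under its date and look the dates up when printing. It assumes Perl 5.10 or later (for the // operator) and fields without embedded commas. The names pivot_rows and %value_for are made up for this illustration and are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: pivot lines of the form site,date,type,value,percent
# into one output line per site/type pair, with one column per date.
sub pivot_rows {
    my @lines = @_;
    my (%dates, %value_for);
    for my $line (@lines) {
        chomp(my $copy = $line);
        next unless length $copy;
        # Naive split: assumes no field contains an embedded comma.
        my ($site, $date, $type, $value) = split /,/, $copy;
        $dates{$date} = 1;
        $value_for{"$site,$type"}{$date} = $value;
    }
    # ISO-style dates (YYYY-MM-DD) sort chronologically as plain strings.
    my @sorted_dates = sort keys %dates;
    my @out = (join ',', 'Site', 'Type', @sorted_dates);
    for my $key (sort keys %value_for) {
        # Look every date up explicitly; a missing combination becomes ''.
        my @cells = map { $value_for{$key}{$_} // '' } @sorted_dates;
        push @out, join ',', $key, @cells;
    }
    return @out;
}

my @sample = (
    'sitename1,2009-07-19,"A1",11975,17.23',
    'sitename1,2009-07-19,"A2",11,0.02',
    'sitename1,2009-07-20,"A1",2000,17.23',
    'sitename1,2009-07-20,"A2",538,0.02',
);
print "$_\n" for pivot_rows(@sample);
```

With the sample data from the question this prints the three lines of the required output. For real CSV input with quoted commas, a proper CSV parser would be a safer choice than split.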
#2
If I understand you correctly (and I have to admit I'm only guessing), you have several types of things, each with a value on different dates. So you need a data structure like this hash for each site and type:
my $foo = {
    site  => 'sitename1',
    type  => 'A1',
    dates => [
        {
            date  => '2009-07-19',
            value => 11975,
        },
        {
            date  => '2009-07-20',
            value => 2000,    # A1's value on this date in the sample file
        },
    ],
};
Is that even close?
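If that guess is right, records shaped like $foo could be built from the input lines roughly as follows. This is only a sketch under the same guess: build_records and %record_for are hypothetical names, the split assumes no embedded commas, and the quotes around the type are stripped so the result matches the hash shown in this answer.

```perl
use strict;
use warnings;

# Hypothetical helper: group lines into one record per site/type pair,
# each record shaped like the $foo hash in this answer.
sub build_records {
    my @lines = @_;
    my (%record_for, @records);
    for my $line (@lines) {
        chomp(my $copy = $line);
        next unless length $copy;
        my ($site, $date, $type, $value) = split /,/, $copy;
        $type =~ tr/"//d;    # drop the double quotes around the type
        my $key = "$site\0$type";
        unless ($record_for{$key}) {
            $record_for{$key} = { site => $site, type => $type, dates => [] };
            push @records, $record_for{$key};    # keep first-seen order
        }
        push @{ $record_for{$key}{dates} }, { date => $date, value => $value };
    }
    return @records;
}

my @records = build_records(
    'sitename1,2009-07-19,"A1",11975,17.23',
    'sitename1,2009-07-19,"A2",11,0.02',
    'sitename1,2009-07-20,"A1",2000,17.23',
    'sitename1,2009-07-20,"A2",538,0.02',
);
for my $rec (@records) {
    my @pairs = map { $_->{date} . '=' . $_->{value} } @{ $rec->{dates} };
    print "$rec->{site} $rec->{type}: ", join(', ', @pairs), "\n";
}
```

Keeping the dates as a list preserves input order. If the dates need to be looked up by name when printing columns, a hash keyed by date may be more convenient than the list of date/value pairs.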