I have two files whose first three columns are identical and whose fourth column differs.
File A
a b c 100
e f g 50
h i j 25
File B
a b c 200
e f g 20
h i j 15
How can files A and B be combined to look like file C?
File C
a b c 100 200
e f g 50 20
h i j 25 15
--UPDATE--
I've used the solutions provided by Jotne and Kent, but the output of both scripts has a . (dot) instead of a comma before the last column. It looks like:
a,b,c,100.200
e,f,g,50.20
2 Solutions
#1
2
Here is one awk solution:
awk 'FNR==NR {a[$1,$2,$3]=$4;next} {print $0,a[$1,$2,$3]}' B A > C
cat C
a b c 100 200
e f g 50 20
h i j 25 15
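The update in the question shows comma-separated output, which suggests the real files may be comma-separated rather than space-separated. A minimal sketch under that assumption follows; the file names A.csv, B.csv and C.csv and the temporary directory are hypothetical, and the only change to the command above is setting the input and output field separators to a comma.

```shell
# Hypothetical comma-separated copies of the sample data, in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'a,b,c,100' 'e,f,g,50' 'h,i,j,25' > "$dir/A.csv"
printf '%s\n' 'a,b,c,200' 'e,f,g,20' 'h,i,j,15' > "$dir/B.csv"

# -F, splits each input line on commas; OFS=, puts a comma between printed fields.
# First file (B.csv): store column 4 under a key built from columns 1-3.
# Second file (A.csv): print the whole line followed by the stored value.
awk -F, -v OFS=, 'FNR==NR {a[$1,$2,$3]=$4; next} {print $0, a[$1,$2,$3]}' \
    "$dir/B.csv" "$dir/A.csv" > "$dir/C.csv"

cat "$dir/C.csv"
# a,b,c,100,200
# e,f,g,50,20
# h,i,j,25,15
```

Without -F, awk splits on whitespace, so a comma-separated line is treated as a single field and the three-column key never matches as intended.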
#2
0
If the files had just one column in common, join alone could do it. Let's use it anyway and then parse the output:
$ join <(sort f1) <(sort f2)
a b c 100 b c 200
e f g 50 f g 20
h i j 25 i j 15
This joined based on the first column. Now, let's use cut to get everything but columns 5 and 6:
$ join <(sort f1) <(sort f2) | cut -d' ' -f1-4,7
a b c 100 200
e f g 50 20
h i j 25 15
Note the use of sort: join requires its input files to be sorted on the join field. The sample data happens to be sorted already, so it works without sort, but sorting is included for consistency.
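As a variation on the join and cut pipeline above, join's -o option can pick the output fields directly, which avoids counting columns for cut. This is only a sketch: the scratch directory and the deliberately unsorted copies of f1 and f2 are assumptions made for illustration, and like the pipeline above it still matches on the first column only, so it relies on that column being unique.

```shell
# Hypothetical unsorted copies of the sample files, in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'h i j 25' 'a b c 100' 'e f g 50' > "$dir/f1"
printf '%s\n' 'e f g 20' 'h i j 15' 'a b c 200' > "$dir/f2"

# join needs both inputs sorted on the join field (column 1 by default).
# LC_ALL=C keeps the collation order the same for sort and join.
LC_ALL=C sort "$dir/f1" > "$dir/f1.sorted"
LC_ALL=C sort "$dir/f2" > "$dir/f2.sorted"

# -o lists the fields to print as FILE.FIELD:
# columns 1-4 from the first file, then column 4 from the second.
LC_ALL=C join -o 1.1,1.2,1.3,1.4,2.4 "$dir/f1.sorted" "$dir/f2.sorted" > "$dir/out"

cat "$dir/out"
# a b c 100 200
# e f g 50 20
# h i j 25 15
```

Temporary sorted files are used here instead of process substitution so the sketch also runs in shells that lack the <(...) syntax.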