Get a difference file from two text files by specific patterns

Time: 2023-01-09 15:29:42

I have 2 text files and I need to export "changes" to a new file. That means that the second file's rows are compared to the first file's rows and if a row isn't found there, then it will append it to the new (third) file.

Contents of the first file are:

ABC 123 q1w2sd
DEF 321 sdajkn
GHI 123 jsdnaj
JKL 456 jsd223

The second file contains:

ABC 123 XXXXXX
JKL 456 jsd223
DEF XXX sdajkn
GHI 123 jsdnaj

Notice that the lines starting with ABC and DEF have changed. The JKL line has only changed its place.

The output file should contain:

ABC 123 XXXXXX
DEF XXX sdajkn

How can I do this using 'awk' or 'sed'?

Edit: Lines that are new in the second file should also be counted as changes.

4 solutions

#1


4  

awk 'NR == FNR { f1[$0]; next } !($0 in f1)' file1 file2

With grep: grep -Fvxf file1 file2
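
As a rough end-to-end sketch of the awk one-liner above, the following recreates the sample data from the question and writes the result to a third file. The function name `new_or_changed`, the scratch directory, and the output name `changes.txt` are assumptions made for illustration only.

```shell
#!/bin/sh
# new_or_changed OLD NEW: print the lines of NEW that do not appear,
# as whole lines, anywhere in OLD. (Hypothetical function name.)
new_or_changed() {
    # First file (NR == FNR): remember every complete line.
    # Second file: print lines that were never seen in the first.
    awk 'NR == FNR { seen[$0]; next } !($0 in seen)' "$1" "$2"
}

# Recreate the sample data from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'ABC 123 q1w2sd' 'DEF 321 sdajkn' 'GHI 123 jsdnaj' 'JKL 456 jsd223' > "$workdir/file1"
printf '%s\n' 'ABC 123 XXXXXX' 'JKL 456 jsd223' 'DEF XXX sdajkn' 'GHI 123 jsdnaj' > "$workdir/file2"

# Write the changes to a third file, then show it.
new_or_changed "$workdir/file1" "$workdir/file2" > "$workdir/changes.txt"
cat "$workdir/changes.txt"
```

One thing to keep in mind: if the first file is empty, `NR == FNR` stays true while the second file is read, so nothing is printed.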

#2


3  

Assuming the 1st file is named fileA and the 2nd file is named fileB, you can use awk like this:

awk 'NR==FNR {a[$1];b[$0];next} ($1 in a) && !($0 in b)' file{A,B}

Or simply:

awk 'NR==FNR {a[$1];b[$0];next} ($1 in a) && !($0 in b)' file1 file2
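
The `($1 in a)` test limits the output to lines whose first field already occurs in the first file, so a line with a brand-new first field is not printed. The sketch below illustrates the difference under stated assumptions: the function names and the extra line `MNO 789 newrow` are invented for this example and are not part of the original answer.

```shell
#!/bin/sh
# changed_only OLD NEW: the command from this answer. It reports a line only
# when its first field already exists in OLD. (Hypothetical function name.)
changed_only() {
    awk 'NR==FNR {a[$1];b[$0];next} ($1 in a) && !($0 in b)' "$1" "$2"
}

# changed_or_added OLD NEW: the same idea without the key test, so lines
# with brand-new keys are reported as well. (Hypothetical function name.)
changed_or_added() {
    awk 'NR==FNR {b[$0];next} !($0 in b)' "$1" "$2"
}

demo=$(mktemp -d)
printf '%s\n' 'ABC 123 q1w2sd' 'DEF 321 sdajkn' > "$demo/fileA"
# 'MNO 789 newrow' is an invented line whose key is absent from fileA.
printf '%s\n' 'ABC 123 XXXXXX' 'DEF 321 sdajkn' 'MNO 789 newrow' > "$demo/fileB"

echo "changed_only:"
changed_only "$demo/fileA" "$demo/fileB"
echo "changed_or_added:"
changed_or_added "$demo/fileA" "$demo/fileB"
```

Here `changed_only` prints just the ABC line, while `changed_or_added` also prints the MNO line, which may be closer to what the question's edit asks for.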

#3


2  

Code for GNU sed:

$ sed 's#\(.*\)#/\1/d#' file1 | sed -f - file2
ABC 123 XXXXXX
DEF XXX sdajkn

This also handles lines that are new in file2.

#4


0  

Using comm to find the lines in the 2nd file that are not in the 1st:

$ comm -13 <(sort first) <(sort second)
ABC 123 XXXXXX
DEF XXX sdajkn
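
The `<( ... )` process substitution above is a feature of shells such as bash, and `comm` expects both inputs to be sorted. A possible variant for a plain `sh`, using temporary files instead, might look like the sketch below. The function name `lines_only_in_second` and the temporary file names are assumptions for illustration.

```shell
#!/bin/sh
# lines_only_in_second OLD NEW: print lines found in NEW but not in OLD,
# in sorted order. (Hypothetical function name.)
lines_only_in_second() {
    tmp=$(mktemp -d) || return 1
    # comm expects both inputs sorted in the same collation order.
    LC_ALL=C sort "$1" > "$tmp/old.sorted"
    LC_ALL=C sort "$2" > "$tmp/new.sorted"
    # -1 drops lines unique to OLD, -3 drops lines common to both.
    LC_ALL=C comm -13 "$tmp/old.sorted" "$tmp/new.sorted"
    rm -rf "$tmp"
}

# Recreate the sample data from the question.
work=$(mktemp -d)
printf '%s\n' 'ABC 123 q1w2sd' 'DEF 321 sdajkn' 'GHI 123 jsdnaj' 'JKL 456 jsd223' > "$work/first"
printf '%s\n' 'ABC 123 XXXXXX' 'JKL 456 jsd223' 'DEF XXX sdajkn' 'GHI 123 jsdnaj' > "$work/second"

lines_only_in_second "$work/first" "$work/second"
```

Because of the sorting step, the result comes out in sorted order rather than in the order of the second file, unlike the awk and grep approaches.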
