This question already has an answer here:
- How can I delete duplicate lines in a file in Unix? 8 answers
I want to remove duplicate entries from a text file, e.g.:
kavitha= Tue Feb 20 14:00 19 IST 2012 (duplicate entry)
sree=Tue Jan 20 14:05 19 IST 2012
divya = Tue Jan 20 14:20 19 IST 2012
anusha=Tue Jan 20 14:45 19 IST 2012
kavitha= Tue Feb 20 14:00 19 IST 2012 (duplicate entry)
Is there a way to remove the duplicate entries using a Bash script?
Desired output:
kavitha= Tue Feb 20 14:00 19 IST 2012
sree=Tue Jan 20 14:05 19 IST 2012
divya = Tue Jan 20 14:20 19 IST 2012
anusha=Tue Jan 20 14:45 19 IST 2012
4 solutions
#1
296
You can sort and then uniq in one step (the output is sorted, so the original line order is not preserved):
$ sort -u input.txt
Or use awk, which keeps the first occurrence of each line and preserves the original order:
$ awk '!a[$0]++' input.txt
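As a rough illustration of what the awk one-liner does, here is an expanded sketch run against the question's sample lines. The function name dedup_keep_order and the temporary sample file are made up for this example, and it assumes the "(duplicate entry)" notes are not literally present in the file.

```shell
# Expanded form of awk '!a[$0]++': print a line only the first time it is seen.
# dedup_keep_order is a hypothetical helper name used for illustration.
dedup_keep_order() {
  awk '{ if (seen[$0] == 0) print; seen[$0]++ }' "$1"
}

# Sample data copied from the question, written to a temporary file.
sample="$(mktemp)"
printf '%s\n' \
  'kavitha= Tue Feb 20 14:00 19 IST 2012' \
  'sree=Tue Jan 20 14:05 19 IST 2012' \
  'divya = Tue Jan 20 14:20 19 IST 2012' \
  'anusha=Tue Jan 20 14:45 19 IST 2012' \
  'kavitha= Tue Feb 20 14:00 19 IST 2012' > "$sample"

dedup_keep_order "$sample"
rm -f "$sample"
```

The array is keyed on the whole line ($0), so lines that differ only in spacing are still treated as distinct.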
#2
8
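This deletes duplicate, consecutive lines from a file (emulating "uniq").
The first line in each set of duplicate lines is kept; the rest are deleted. Because only adjacent lines are compared, duplicates that are not next to each other are left in place unless the input is sorted first.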
sed '$!N; /^\(.*\)\n\1$/!P; D'
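A small sketch of how the sed script behaves on unsorted versus sorted input. The helper name dedup_adjacent and the four-line sample are invented for illustration; the sed script itself is the one given above.

```shell
# Wrap the sed script so it can read from stdin.
# dedup_adjacent is a hypothetical helper name.
dedup_adjacent() {
  sed '$!N; /^\(.*\)\n\1$/!P; D'
}

sample="$(mktemp)"
printf '%s\n' b a b c > "$sample"

# Unsorted: the two "b" lines are not adjacent, so both are kept.
dedup_adjacent < "$sample"

# Sorted first: the duplicates become adjacent and one is dropped.
sort "$sample" | dedup_adjacent

rm -f "$sample"
```

Sorting first changes the line order, so this combination suits cases where order does not matter.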
#3
2
A Perl one-liner similar to @kev's awk solution:
perl -ne 'print if ! $a{$_}++' input
This variation removes trailing whitespace before comparing:
perl -lne 's/\s*$//; print if ! $a{$_}++' input
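For comparison, a hedged awk sketch of the same idea: strip trailing blanks, then keep first occurrences. This is an awk substitute rather than the Perl shown above; the function name dedup_ignore_trailing_ws is hypothetical, and only spaces and tabs are treated as whitespace here.

```shell
# Remove trailing spaces/tabs from each line, then print it only if the
# trimmed text has not been seen before.
# dedup_ignore_trailing_ws is a hypothetical helper name.
dedup_ignore_trailing_ws() {
  awk '{ sub(/[ \t]+$/, "") } !seen[$0]++' "$1"
}

sample="$(mktemp)"
# The first two lines differ only by trailing spaces.
printf '%s\n' 'sree=Tue Jan 20  ' 'sree=Tue Jan 20' 'divya = Tue Jan 20' > "$sample"

dedup_ignore_trailing_ws "$sample"
rm -f "$sample"
```

Note that the printed lines are the trimmed versions, which matches the Perl variation's behavior of printing after the substitution.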
This variation edits the file in place:
perl -i -ne 'print if ! $a{$_}++' input
This variation edits the file in place and saves a backup as input.bak:
perl -i.bak -ne 'print if ! $a{$_}++' input
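If Perl is not available, roughly the same effect can be sketched in Bash with awk and a temporary file. This is an assumption-laden stand-in, not what perl -i.bak does internally; the function name dedup_in_place is hypothetical, and the .bak suffix simply mirrors the Perl example.

```shell
# Deduplicate a file "in place" while keeping a backup copy as FILE.bak.
# dedup_in_place is a hypothetical helper name.
dedup_in_place() {
  local file="$1" tmp status
  tmp="$(mktemp)" || return 1
  # Back up first, deduplicate from the backup, then overwrite the original.
  cp -- "$file" "$file.bak" &&
    awk '!seen[$0]++' "$file.bak" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f -- "$tmp"
  return "$status"
}

sample="$(mktemp)"
printf '%s\n' one two one > "$sample"
dedup_in_place "$sample"
cat "$sample"
rm -f "$sample" "$sample.bak"
```

Writing back with cat rather than mv keeps the original file's permissions, at the cost of not being atomic.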
#4
0
This might work for you:
cat -n file.txt |
sort -u -k2,7 |
sort -n |
sed 's/.*\t/ /;s/\([0-9]\{4\}\).*/\1/'
Or this:
awk '{line=substr($0,1,match($0,/[0-9][0-9][0-9][0-9]/)+3);sub(/^/," ",line);if(!dup[line]++)print line}' file.txt
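Both commands above appear to cut each line off after the first four-digit run (the year), so that a trailing note does not affect the comparison. A simpler sketch of that idea, assuming the "(duplicate entry)" text really is present in the file and always appears at the end of the line; the function name strip_note_and_dedup is hypothetical.

```shell
# Drop a trailing " (duplicate entry)" note, then keep first occurrences.
# strip_note_and_dedup is a hypothetical helper name.
strip_note_and_dedup() {
  sed 's/ (duplicate entry)$//' "$1" | awk '!seen[$0]++'
}

sample="$(mktemp)"
printf '%s\n' \
  'kavitha= Tue Feb 20 14:00 19 IST 2012 (duplicate entry)' \
  'sree=Tue Jan 20 14:05 19 IST 2012' \
  'divya = Tue Jan 20 14:20 19 IST 2012' \
  'anusha=Tue Jan 20 14:45 19 IST 2012' \
  'kavitha= Tue Feb 20 14:00 19 IST 2012 (duplicate entry)' > "$sample"

strip_note_and_dedup "$sample"
rm -f "$sample"
```

Unlike the year-based truncation, this leaves lines without the note untouched and does not add a leading space.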