I have a file with the following contents:
- 2 equal files of size 288903252
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 277436598
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
I want to delete those "- X equal files of size ..." lines that are not followed by any actual file paths (for example, the first and third bullet points), so that the result is:
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
I formed a regex that matches these lines:
(^-.*\n)-
which can be checked in action at the link above. I want to delete the first group, which is essentially the whole line, but I can't figure out how to do the same with grep or sed. Can we do this in a single command?
4 Solutions
#1
2
Using sed
sed '/^-/{N;/\n-/D}' file
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
Portable version for any version of sed
sed -e '/^-/{N' -e '/\
-/D' -e '}' file
If you also want to remove the last line when it starts with -
sed -e '/^-/{$d' -e 'N' -e '/\
-/D' -e '}' file
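For readers unfamiliar with N and D, the first command can be written out as a commented script. This is only a sketch: the sample input is a made-up, shortened stand-in for the file in the question, and the multi-line layout assumes a sed that accepts comments and newlines inside a script (GNU sed and BSD sed both should).

```shell
# Hypothetical shortened sample: two headers without paths, two with paths.
sample='- 2 equal files of size 1
- 2 equal files of size 2
"C:\a.iso"
"H:\a.iso"
- 2 equal files of size 3
- 35 equal files of size 4
"C:\b.exe"'

result=$(printf '%s\n' "$sample" | sed '
/^-/ {
  # Header line: append the next input line to the pattern space.
  N
  # If the appended line is also a header, delete the first line only
  # and restart the cycle on what is left, without reading new input.
  /\n-/D
}')

printf '%s\n' "$result"
```

On this sample only the size 2 and size 4 headers should survive, each followed by its paths.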
#2
1
You can just grep it:
grep -v -B1 '^-' test_file.txt | grep -v '^--$'
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"
How does it work? The first grep selects every line that doesn't start with a -, plus the one line before each of them (-B1). The second grep just removes the group separator (--) that grep prints between non-adjacent groups. Some grep versions support --no-group-separator, so you can do it in one go.
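With GNU grep the two steps can be collapsed into a single invocation. A sketch, assuming GNU grep (--no-group-separator is a GNU extension and may be missing elsewhere); the sample input is a made-up, shortened stand-in for the file in the question.

```shell
# Hypothetical shortened sample input.
sample='- 2 equal files of size 1
- 2 equal files of size 2
"C:\a.iso"
"H:\a.iso"
- 2 equal files of size 3
- 35 equal files of size 4
"C:\b.exe"'

# -v '^-'  selects the path lines (those that do not start with "-"),
# -B1      also prints the single line before each selected line,
# --no-group-separator suppresses the "--" line between groups.
result=$(printf '%s\n' "$sample" | grep -v -B1 --no-group-separator '^-')

printf '%s\n' "$result"
```

Headers that are immediately followed by another header are never the line before a path line, so they are never printed.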
#3
0
Is perl okay?
cat input.txt | perl -pe 'BEGIN{undef $/;} s/^-.*?\n-/-/smg'
The BEGIN block allows the multiline search by essentially telling perl that there is no end-of-line character, so the whole input is read as a single record. Then the s/ part will substitute any part matching your regex with a - (no need for a capturing group).
Oh, and I slightly modified your regex to be non-greedy, with a ?. Otherwise, the search being multiline, it would match from the first - to the last one, and remove almost everything.
Edit: here is a lengthy and informative Q/A about multiline search, which shows it will be difficult with sed.
Edit2: actually quite easy with a modern sed, see @123's answer.
#4
0
sed is for simple substitutions on individual lines, that is all. For anything else you should be using awk. If you are using sed constructs other than s, g, and p (with -n) then you are using constructs that became obsolete in the mid-1970s when awk was invented.
This will work robustly, efficiently, and portably with any awk on any UNIX box:
$ awk '!/^-/{print p $0; p=""; next} {p=$0 ORS}' file
- 2 equal files of size 284164096
"C:\E\100p disk util bak\Softwares\OSs\gparted-live-0.26.1-1-i686.iso"
"H:\Softwares\Linux\gparted-live-0.26.1-1-i686.iso"
- 2 equal files of size 161356649
"H:\Softwares\Dev Tools\Eclipse\Windows\eclipse-java-luna-SR1a-win32-x86_64.zip"
- 35 equal files of size 97078976
"C:\Windows\System32\DriverStore\FileRepository\nvacwu.inf_amd64_9934c34dc6ca0c4b\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvamwu.inf_amd64_d4715679184092a8\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvaowu.inf_amd64_785608ed2524cdea\NvCplSetupInt.exe"
"C:\Windows\System32\DriverStore\FileRepository\nvblwu.inf_amd64_31f54e2d1ba058d5\NvCplSetupInt.exe"