Say I have this file, data.txt:
a=0,b=3,c=5
a=2,b=0,c=4
a=3,b=6,c=7
I want to use grep to extract two columns corresponding to the values of a and c:
0 5
2 4
3 7
I know how to extract each column separately:
grep -oP 'a=\K([0-9]+)' data.txt
0
2
3
And:
grep -oP 'c=\K([0-9]+)' data.txt
5
4
7
But I can't figure out how to extract the two groups together. I tried the following, which didn't work:
grep -oP 'a=\K([0-9]+),.+c=\K([0-9]+)' data.txt
5
4
7
3 answers
#1
2
I am also curious whether grep can do this. \K discards everything matched so far from the reported match, so it cannot be used to return two groups from one expression: grep -o prints only what follows the last \K. Hence, it has to be done differently.
In the meantime, I would use sed:
sed -r 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/' file
It captures the digits after a= and c= on every line that starts with a= and has nothing after the digits that follow c=, then replaces the line with the two captured values separated by a space.
For your input, it returns:
0 5
2 4
3 7
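One thing to watch with the sed command above: without -n, sed also prints the lines that do not match the pattern, unchanged. Here is a minimal sketch of a variant that prints only the lines where the substitution succeeded. It assumes a sed that accepts -E (GNU and BSD sed both do), and the function name extract_a_c is made up for illustration:

```shell
# Hypothetical helper: print "A C" for each line shaped like a=<digits>...c=<digits>.
# -n suppresses automatic printing; the trailing p prints only substituted lines.
extract_a_c() {
  sed -nE 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/p' "$@"
}

# The middle line does not match the pattern and is dropped from the output.
printf '%s\n' 'a=0,b=3,c=5' 'not a data line' 'a=3,b=6,c=7' | extract_a_c
```

With no file argument the function reads standard input, so it can also be called as extract_a_c data.txt.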
#2
4
You could try the grep command below. Note, however, that grep -o prints each match on its own line, so you won't get the format shown in the question.
$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file
0
5
2
4
3
7
To get that format, pipe the output of grep to paste or a similar command.
$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file | paste -d' ' - -
0 5
2 4
3 7
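The paste step pairs matches purely by position in the stream, so it relies on every line producing exactly one a match and one c match. A small sketch of what happens otherwise, assuming a grep built with -P support (GNU grep with PCRE); the function name pair_a_c is made up for illustration:

```shell
# Hypothetical helper wrapping the pipeline from this answer.
pair_a_c() {
  grep -oP 'a=\K[0-9]+|c=\K[0-9]+' "$@" | paste -d' ' - -
}

# The second input line has no c= field, so grep emits five matches
# (0, 5, 2, 3, 7) and paste pairs the a value of line 2 with the a value of line 3.
printf '%s\n' 'a=0,b=3,c=5' 'a=2,b=0' 'a=3,b=6,c=7' | pair_a_c
```

If the input may have missing fields, a per-line tool such as sed or awk keeps the two values tied to the line they came from.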
#3
0
Use this:
awk -F'[=,]' '{print $2" "$6}' data.txt
I set the field separators to = and , and split each line on them, so the value of a ends up in field 2 and the value of c in field 6.
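The fixed field numbers only work while every line lists a, b, c in that order. If the order can vary, here is a sketch that looks the values up by key instead. The function name extract_by_key and the array names are made up for illustration, and it assumes each field has the form key=value:

```shell
# Hypothetical helper: split each line on commas, then each field on "=",
# and print the values stored under the keys a and c.
extract_by_key() {
  awk -F, '{
    split("", val)                 # clear values left over from the previous line
    for (i = 1; i <= NF; i++) {
      n = split($i, kv, "=")       # kv[1] is the key, kv[2] is the value
      if (n == 2) val[kv[1]] = kv[2]
    }
    print val["a"], val["c"]
  }' "$@"
}

# Field order differs between the two lines; the lookup is by name, not position.
printf '%s\n' 'c=5,a=0,b=3' 'a=2,b=0,c=4' | extract_by_key
```

A key that is missing from a line prints as an empty string, so the output still has one row per input line.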