How to use grep to extract multiple groups

Date: 2021-01-07 19:17:10

Say I have this file data.txt:

a=0,b=3,c=5
a=2,b=0,c=4
a=3,b=6,c=7

I want to use grep to extract 2 columns corresponding to the values of a and c:

0 5
2 4
3 7

I know how to extract each column separately:

grep -oP 'a=\K([0-9]+)' data.txt
0
2
3

And:

grep -oP 'c=\K([0-9]+)' data.txt
5
4
7

But I can't figure out how to extract both groups at once. I tried the following, which didn't work (it prints only the values of c):

grep -oP 'a=\K([0-9]+),.+c=\K([0-9]+)' data.txt
5
4
7

3 solutions

#1


2  

I am also curious whether grep can do this on its own. \K discards everything matched so far from the reported match, so you cannot use it twice in the same expression to keep two pieces: only the text after the last \K is shown, which is why you just get the last group. Hence, it has to be done differently.
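
To see the effect of \K concretely, here is a minimal sketch that runs three variants of the pattern against the first sample line. It assumes GNU grep built with PCRE support (the -P option); the temporary file and the variable names are only illustrative.

```shell
# Assumes GNU grep with -P (PCRE) support; all names here are illustrative.
sample=$(mktemp)
printf '%s\n' 'a=0,b=3,c=5' > "$sample"

# No \K: -o prints the whole matched text.
no_k=$(grep -oP 'a=[0-9]+,.+c=[0-9]+' "$sample")

# One \K: everything matched before it is dropped from the output.
one_k=$(grep -oP 'a=\K[0-9]+,.+c=[0-9]+' "$sample")

# Two \K: only the text after the last \K is kept.
two_k=$(grep -oP 'a=\K[0-9]+,.+c=\K[0-9]+' "$sample")

printf '%s\n' "$no_k" "$one_k" "$two_k"
rm -f "$sample"
```

The three printed lines get progressively shorter, ending with only the value of c, which matches what the question observed.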

In the meantime, I would use sed:

sed -r 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/' file

It captures the digits after a= and c= on every line that starts with a= and has nothing after the digits that follow c=. Lines that do not match are printed unchanged.

For your input, it returns:

0 5
2 4
3 7
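
If the file may also contain lines that do not fit the pattern, plain s/// leaves them in the output untouched. A possible variant, sketched here under the assumption that your sed accepts -E for extended regular expressions (GNU and BSD sed do), combines -n with the p flag so that only substituted lines are printed. The extra input line and the variable names are made up for the demonstration.

```shell
# Sketch: print only the lines on which the substitution succeeded.
# The 'no pairs here' line is invented sample input.
sample=$(mktemp)
printf '%s\n' 'a=0,b=3,c=5' 'no pairs here' 'a=3,b=6,c=7' > "$sample"

# -n suppresses automatic printing; the trailing p prints substituted lines.
pairs=$(sed -nE 's/^a=([0-9]+).*c=([0-9]+)$/\1 \2/p' "$sample")

printf '%s\n' "$pairs"
rm -f "$sample"
```

Here the middle line is silently dropped instead of being echoed back.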

#2


4  

You could try the grep command below. Note, however, that grep prints each match on a separate line, so you won't get the format you asked for in the question.

$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file
0
5
2
4
3
7

To get that format, pipe the output of grep to paste or a similar command.

$ grep -oP 'a=\K([0-9]+)|c=\K([0-9]+)' file | paste -d' ' - -
0 5
2 4
3 7
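
The paste step works because each - operand reads the next line from standard input, so two of them consume the stream in pairs. The sketch below feeds paste the same six values directly with printf (standing in for the grep output) to show the pairing in isolation, and then a shorter stream to show a caveat: if some input line lacks a= or c=, every later pair shifts. The variable names are illustrative.

```shell
# Sketch: how `paste -d' ' - -` pairs up consecutive lines of its input.
paired=$(printf '%s\n' 0 5 2 4 3 7 | paste -d' ' - -)
printf '%s\n' "$paired"

# Caveat: drop one value (as if one line had no c=) and the pairs shift,
# so 2 ends up next to 3 instead of next to its own c value.
shifted=$(printf '%s\n' 0 5 2 3 7 | paste -d' ' - - | sed -n '2p')
printf '%s\n' "$shifted"
```

For input where a key can be missing, a per-line tool such as sed or awk is the safer choice.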

#3


0  

Use this:

awk -F'[=,]' '{print $2" "$6}' data.txt

I set the field separators to = and , so each line is split on both. The value of a is then field 2 and the value of c is field 6.
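
Picking fields by position only works while the keys always appear in the same order. As a rough sketch of an alternative, assuming every field has the form key=value, awk can split on commas, build a key-to-value map per line, and print the entries by name. The second input line (with the keys reordered) and the array names v and kv are invented for the example.

```shell
# Sketch: look values up by key name instead of by field position.
# The reordered second line is invented to show order does not matter.
by_key=$(printf '%s\n' 'a=0,b=3,c=5' 'c=4,a=2,b=0' | awk -F, '{
  split("", v)                 # portable way to clear the array per line
  for (i = 1; i <= NF; i++) {
    split($i, kv, "=")         # kv[1] is the key, kv[2] is the value
    v[kv[1]] = kv[2]
  }
  print v["a"], v["c"]
}')
printf '%s\n' "$by_key"
```

Both lines yield the a value followed by the c value, regardless of where the keys sit in the line.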
