Regular expression - exact match of part of a string in awk

Time: 2022-09-13 11:15:09

I have a file in which one column contains strings made up of comma-separated items, for example:

a123456, a54321, a12312

I need to find the lines that contain a specific item in the comma-separated list. For example, I want to find all lines that contain exactly a12345.

I tried to use the following:

awk ' $1~/a12345/ {print}'

but this also prints the line containing:

a123456, a54321, a12312

presumably because the regex matches the first six characters of a123456.

My question is: how can I write a regex that prints only the lines containing an exact match?
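
The behaviour described in the question can be reproduced with a small sample file. This is only an illustrative sketch: the temporary file and its second line are assumptions, not part of the original post.

```shell
# Build a two-line sample file (the contents of line 2 are hypothetical).
sample=$(mktemp)
printf '%s\n' 'a123456, a54321, a12312' 'a12345, a99999' > "$sample"

# Unanchored regex: a12345 is found anywhere in the first field,
# including inside a123456, so both lines are selected.
unanchored=$(awk '$1 ~ /a12345/' "$sample")
printf '%s\n' "$unanchored"
```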

3 answers

#1


0  

Try grep's whole-word matching, as below:

grep -w a123456 myfile.txt

If the match must be at the start of the line, use something like:

grep -Ew '^a123456' myfile.txt
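
As a rough check of how -w behaves, the sketch below runs it against a hypothetical three-line file and searches for a12345, the value from the question. Note that -w treats any character other than a letter, digit or underscore as a word boundary, so a field such as a12345.foo still matches.

```shell
# Hypothetical sample data; only the first line comes from the question.
sample=$(mktemp)
printf '%s\n' \
  'a123456, a54321, a12312' \
  'a12345, a54321' \
  'a12345.foo, bar, baz' > "$sample"

# -w requires the match to be bounded by non-word characters
# (anything other than letters, digits and underscore) or by the line edges.
word_hits=$(grep -w 'a12345' "$sample")
printf '%s\n' "$word_hits"
```

The first line is skipped because a12345 is followed by the digit 6 there; the other two lines are printed.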

#2


1  

$ awk '/(^|[^[:alnum:]])a12345([^[:alnum:]]|$)/' file
$ awk '/(^|[^[:alnum:]])a123456([^[:alnum:]]|$)/' file
a123456, a54321, a12312

With GNU awk you can use the word-boundary operators:

$ awk '/\<a12345\>/' file
$ awk '/\<a123456\>/' file
a123456, a54321, a12312
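
If the value to look for changes between runs, the portable bracket-expression pattern can be built from a variable. The following is a minimal sketch, assuming the value contains no regex metacharacters; the function name find_word and the sample data are made up for illustration.

```shell
# find_word ID FILE: print the lines of FILE that contain ID delimited by
# non-alphanumeric characters or by the start/end of the line.
# Assumption: ID contains no regex metacharacters.
find_word() {
  awk -v id="$1" '
    BEGIN { re = "(^|[^[:alnum:]])" id "([^[:alnum:]]|$)" }
    $0 ~ re
  ' "$2"
}

# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' \
  'a123456, a54321, a12312' \
  'a12345, a54321' \
  'b1, a12345' > "$sample"

find_word a12345 "$sample"
```

Unlike the \< and \> operators, this pattern also treats an underscore as a delimiter, because an underscore is not in [:alnum:].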

#3


0  

With awk:

awk -F ',\\s*' '$1 == "a12345"' filename

This splits each line on commas (optionally followed by whitespace) and selects only the lines whose first field is exactly "a12345". Note that \s is a GNU awk extension; [[:space:]] is the portable equivalent. Unlike a word-boundary match, this also rejects a field in which "a12345" is followed by a character that counts as a word boundary, which is to say that

a12345.foo, bar, baz

is filtered out.

If more than one field needs to be tested, you have to loop over all the fields:

awk -F ',\\s*' 'function check() { for(i = 1; i <= NF; ++i) { if($i == "a12345") return 1; } return 0 } check()' filename
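
The same field-by-field test can be wrapped in a small shell function so that the item is passed in rather than hard-coded. This is a sketch under a few assumptions: the name has_item and the sample data are hypothetical, [[:space:]] replaces \s for portability, and whitespace before the first field or after the last one is not trimmed.

```shell
# has_item ITEM FILE: print the lines of FILE in which some
# comma-separated field is exactly ITEM.
has_item() {
  awk -F ',[[:space:]]*' -v want="$1" '
    {
      for (i = 1; i <= NF; i++) {
        # Appending "" forces a string comparison, even for numeric-looking items.
        if (($i "") == (want "")) { print; next }
      }
    }
  ' "$2"
}

# Hypothetical sample data.
sample=$(mktemp)
printf '%s\n' \
  'a123456, a54321, a12312' \
  'a12345.foo, bar, baz' \
  'x1, a12345, y2' > "$sample"

has_item a12345 "$sample"
```

Only the third sample line is printed: the first contains a12345 merely as a prefix of a123456, and the second has a12345.foo as its field, which is not an exact match.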
