Search for a pattern in a text string, then extract the matched pattern

I am trying to match and then extract a pattern from a text string. I need to extract any pattern that matches the following in the text string:

10289 20244

Text File:

KBOS 032354Z 19012KT 10SM FEW060 SCT200 BKN320 24/17 A3009 RMK AO2 SLP187 CB DSNT NW T02440172 10289 20244 53009

I am trying to achieve this using the following bash code:

Bash Code:

cat text_file | grep -Eow '\s10[0-9].*\s' | head -n 4 | awk '{print $1}'

The above code attempts to find a five-digit group that begins with 10 followed by three more digits. After matching this pattern, it prints the rest of the text string in order to capture the second five-digit group, which begins with 20.

I need a better, more reliable way to accomplish this because currently, this code fails. The numeric groups I need are separated by a space. I have attempted to account for this by inserting \s into the grep portion of the code.

3 Solutions

#1

grep solution:

grep -Eow '10[0-9]{3}\b.*\b20[0-9]{3}' text_file

The output:

10289 20244

[0-9]{3} - matches exactly 3 digits

\b - word boundary

-o - prints only the matched part of the line; -w requires the match to form whole words
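
If the two groups are always adjacent and separated only by whitespace, a tighter pattern can drop the greedy .* (which would otherwise stretch to the last 20xxx group on the line). A minimal sketch, assuming GNU grep (for -o and \b) and the sample line from the question; the function name extract_pair is made up for illustration:

```shell
# Hypothetical helper: print every "10xxx 20xxx" pair found on stdin.
extract_pair() {
  # -E: extended regex, -o: print only the matched text.
  # \b is a GNU extension; [[:space:]]+ allows one or more blanks between the groups.
  grep -Eo '\b10[0-9]{3}[[:space:]]+20[0-9]{3}\b'
}

# Sample line copied from the question.
sample='KBOS 032354Z 19012KT 10SM FEW060 SCT200 BKN320 24/17 A3009 RMK AO2 SLP187 CB DSNT NW T02440172 10289 20244 53009'
printf '%s\n' "$sample" | extract_pair
```

To run it against the file instead, something like extract_pair < text_file should work.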

#2

awk '{print $(NF-2),$(NF-1)}' text_file

10289 20244

Prints the next-to-last field and the one before it, so it assumes the two groups always sit in those positions.

#3

awk '$17 ~ /^10[0-9]{3}$/ && $18 ~ /^20[0-9]{3}$/ { print $17, $18 }' text_file

This will check field 17 for "10xxx" and field 18 for "20xxx", and when BOTH match, print them.
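
Fixed field numbers only work while every report has the same number of fields before the two groups. A possible variation, sketched here under the assumption that the groups are always neighbouring fields, scans each pair of adjacent fields instead. The digit classes are spelled out rather than written as {3}, since some older awk implementations may not support interval expressions by default. The name find_pair is hypothetical:

```shell
# Hypothetical helper: scan neighbouring fields instead of relying on fixed columns.
find_pair() {
  awk '{
    # Compare each field with its right-hand neighbour.
    for (i = 1; i < NF; i++)
      if ($i ~ /^10[0-9][0-9][0-9]$/ && $(i+1) ~ /^20[0-9][0-9][0-9]$/)
        print $i, $(i+1)
  }'
}

# Sample line copied from the question.
sample='KBOS 032354Z 19012KT 10SM FEW060 SCT200 BKN320 24/17 A3009 RMK AO2 SLP187 CB DSNT NW T02440172 10289 20244 53009'
printf '%s\n' "$sample" | find_pair
```

This prints one line per matching pair, so a line without the two groups produces no output.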
