I have a file, test.txt, which contains some formatted phone numbers. I'm trying to use grep to find the lines containing a phone number.
It seems that grep -e "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt doesn't work and gives no results, but grep -E "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt works. So I wonder what the difference is between these two options.
According to man grep:
-E, --extended-regexp Interpret pattern as an extended regular expression (i.e. force grep to behave as egrep).
-e pattern, --regexp=pattern Specify a pattern used during the search of the input: an input line is selected if it matches any of the specified patterns. This option is most useful when multiple -e options are used to specify multiple patterns, or when a pattern begins with a dash (`-').
But I don't quite understand it. What is an extended regex?
3 Answers
#1
6
As you mentioned, grep -E is for extended regular expressions, whereas without it grep uses basic regular expressions. The -e option does not select either syntax.
EDIT: As Jonathan pointed out below, grep -e "specifies that the following argument is (one of) the regular expression(s) to be matched."
From the man page:
Basic vs Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
Traditional egrep did not support the { meta-character, and some egrep implementations support \{ instead, so portable scripts should avoid { in grep -E patterns and should use [{] to match a literal {.
GNU grep -E attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the command grep -E '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX.2 allows this behavior as an extension, but portable scripts should avoid it.
But man pages are pretty terse, so for further info, check out this link:
http://www.regular-expressions.info/posix.html
The part of the man page about the { meta-character, though, speaks specifically to the difference you are seeing.
grep -e "[0-9]{3}-[0-9]{3}-[0-9]{4}"
won't work because, in a basic regular expression, the { character is not treated the way you expect. Whereas
grep -E "[0-9]{3}-[0-9]{3}-[0-9]{4}"
does, because that is the extended grep version (the egrep version, for example).
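To make the difference concrete, here is a minimal sketch that runs the same phone-number pattern three ways and counts the matching lines. The sample lines and the temporary file are made up for illustration (the question's test.txt is not shown), so treat this as a demonstration under those assumptions rather than a reproduction of the original data.

```shell
# Hypothetical sample data standing in for test.txt.
tmp=$(mktemp)
printf '%s\n' 'call 555-123-4567 today' 'no number here' 'fax: 800-555-0199' > "$tmp"

# BRE (the default): { and } are ordinary characters, so nothing in the sample matches.
# grep exits with status 1 when there is no match, hence the "|| true".
bre_plain=$(grep -c -e '[0-9]{3}-[0-9]{3}-[0-9]{4}' "$tmp" || true)

# BRE with backslashed braces: \{n\} is the interval operator.
bre_escaped=$(grep -c -e '[0-9]\{3\}-[0-9]\{3\}-[0-9]\{4\}' "$tmp" || true)

# ERE (-E): unescaped braces are the interval operator.
ere=$(grep -c -E '[0-9]{3}-[0-9]{3}-[0-9]{4}' "$tmp" || true)

echo "BRE plain: $bre_plain, BRE escaped: $bre_escaped, ERE: $ere"
rm -f "$tmp"
```

With this sample the unescaped pattern finds no lines in basic mode, while the backslashed basic pattern and the -E pattern each find the two lines that contain a phone number.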
#2
3
Here is a simple test:
$ cat file
apple is a fruit
so is orange
but onion is not
$ grep -e 'but' -e 'fruit' file #Allows you to pass multiple patterns explicitly
apple is a fruit
but onion is not
$ grep -E 'is (a|not)' file #Allows you to use extended regular expressions like ?, +, | etc
apple is a fruit
but onion is not
#3
1
The -e option to grep simply says that the following argument is the regular expression. Thus:
grep -e 'some.*thing' -r -l .
looks for some followed by thing on a line in all the files in the current directory and all its sub-directories. The same could be achieved by:
grep -r -l 'some.*thing' .
(On Linux, the situation is confused by the behaviour of GNU getopt() which, unless you set POSIXLY_CORRECT in the environment, permutes options, so you could also run:
grep 'some.*thing' -r -l .
and get the same result. Under POSIX and other systems not using GNU getopt(), options need to precede arguments, and the grep would look for a file called -r and another called -l.)
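As a rough check of the two orderings that keep every option ahead of the file operand, the sketch below builds a small throwaway directory tree and compares the file lists. The directory layout and file contents are invented for illustration, and -r is assumed to be available (it is in GNU and BSD grep, but it is not required by POSIX).

```shell
# Hypothetical directory tree for illustration.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf '%s\n' 'something here' > "$dir/a.txt"
printf '%s\n' 'some other thing' > "$dir/sub/b.txt"
printf '%s\n' 'nothing relevant' > "$dir/c.txt"

# Pattern supplied with -e, followed by further options.
with_e=$(cd "$dir" && grep -e 'some.*thing' -r -l . | sort)

# Options first, pattern as the first non-option argument.
without_e=$(cd "$dir" && grep -r -l 'some.*thing' . | sort)

printf '%s\n' "$with_e"
rm -rf "$dir"
```

Both commands should list the same two files, since in each case every option precedes the directory operand.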
The -E option changes the regular expressions from 'basic' to 'extended'. It can be used with -e:
grep -e "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt
grep -E -e "[0-9]{3}-[0-9]{3}-[0-9]{4}" test.txt
The ERE option means the same regular expressions, more or less, as used to be recognized by the egrep command, which is no longer a part of POSIX (having been replaced by grep -E, and fgrep by grep -F).
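As a small sketch of how the two options combine, the following uses -E together with several -e patterns, and then uses -e to pass a pattern that begins with a dash. The file contents are invented for illustration.

```shell
# Hypothetical sample file.
tmp=$(mktemp)
printf '%s\n' 'apple is a fruit' 'so is orange' 'but onion is not' '-v flag' > "$tmp"

# -E selects extended syntax; each -e adds one pattern,
# and a line is selected if it matches any of them.
multi=$(grep -E -e 'is (a|not)' -e '^so' "$tmp")

# -e marks the next argument as a pattern even though it starts with a dash;
# without -e, grep would parse -v as an option.
dash=$(grep -e '-v' "$tmp")

printf '%s\n' "$multi"
printf '%s\n' "$dash"
rm -f "$tmp"
```

The first command selects the three fruit-and-vegetable lines, and the second selects only the line that starts with -v.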