Regex lookahead for "not followed by" in grep

Time: 2022-06-09 08:55:57

I am attempting to grep for all instances of Ui\. not followed by Line, or even just by the letter L.

What is the proper way to write a regex for finding all instances of a particular string NOT followed by another string?

Using lookaheads

grep "Ui\.(?!L)" *
bash: !L: event not found


grep "Ui\.(?!(Line))" *
nothing

6 solutions

#1


109  

Negative lookahead, which is what you're after, requires a more powerful tool than the standard grep. You need a PCRE-enabled grep.

If you have GNU grep, the current version supports options -P or --perl-regexp and you can then use the regex you wanted.

If you don't have (a sufficiently recent version of) GNU grep, then consider getting ack.
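
As a rough sketch of the -P route, the following runs both lookaheads from the question against a small made-up sample file. It assumes a GNU grep built with PCRE support; the file contents are invented for illustration.

```shell
# Sample input (hypothetical, for illustration only).
sample=$(mktemp)
printf '%s\n' 'Ui.Line(1)' 'Ui.Label' 'Ui.Text' 'Ui.' > "$sample"

# -P asks GNU grep for Perl-compatible regexes (needs a PCRE-enabled build).
# Single quotes stop the shell from interpreting anything inside the pattern.
not_l=$(grep -P 'Ui\.(?!L)' "$sample")        # Ui. not followed by the letter L
not_line=$(grep -P 'Ui\.(?!Line)' "$sample")  # Ui. not followed by the word Line

printf 'not followed by L:\n%s\n' "$not_l"
printf 'not followed by Line:\n%s\n' "$not_line"
rm -f "$sample"
```

With this sample, the first pattern should keep Ui.Text and the bare Ui., while the second should also keep Ui.Label.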

#2


32  

The answer to part of your problem is here, and ack would behave the same way: Ack & negative lookahead giving errors

You are using double-quotes for grep, which permits bash to "interpret ! as history expand command."

You need to wrap your pattern in SINGLE-QUOTES: grep 'Ui\.(?!L)' *

However, see @JonathanLeffler's answer to address the issues with negative lookaheads in standard grep!
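
To illustrate the quoting point, here is a minimal sketch showing that a single-quoted pattern reaches the command unchanged. The variable name is only for illustration. History expansion is an interactive-shell feature, so the original error would normally not appear inside a script.

```shell
# Inside single quotes the shell performs no expansion at all,
# so the exclamation mark stays part of the pattern.
pattern='Ui\.(?!L)'
printf '%s\n' "$pattern"

# In an interactive bash session, history expansion can also be
# switched off entirely with: set +H
```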

#3


6  

You probably can't perform standard negative lookaheads using grep, but you can usually get equivalent behaviour with the "inverse" switch -v. With it, you write a regex for the complement of what you want to match and pipe the output through two greps.

For the regex in question you might do something like:

grep 'Ui\.' * | grep -v 'Ui\.L'
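
A small sketch of the two-grep pipeline on made-up input may help show where it differs from a true lookahead. The sample lines are hypothetical. Because -v filters whole lines, a line containing both an allowed and a disallowed occurrence is dropped.

```shell
# Sample input (hypothetical): the third line has both Ui.Text and Ui.Line.
sample=$(mktemp)
printf '%s\n' 'a = Ui.Line' 'b = Ui.Text' 'c = Ui.Text + Ui.Line' > "$sample"

# First grep keeps lines mentioning Ui. ;
# second grep drops every line that contains Ui.L anywhere.
two_greps=$(grep 'Ui\.' "$sample" | grep -v 'Ui\.L')
printf '%s\n' "$two_greps"
rm -f "$sample"
```

Only the b line should survive here. A lookahead-based search would also report the c line, so the two approaches agree only when each line holds at most one occurrence.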

#4


3  

If you need to use a regex implementation that doesn't support negative lookaheads and you don't mind matching extra character(s)*, then you can use negated character classes [^L], alternation |, and the end of string anchor $.

In your case grep 'Ui\.\([^L]\|$\)' * does the job.

  • Ui\. matches the string you're interested in

  • \([^L]\|$\) matches any single character other than L or it matches the end of the line: [^L] or $.
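
The same expression can be sketched in extended-regex form (-E), where the group and the alternation need no backslashes. The sample file is made up, and -o (print only the matched text) is assumed to be available, as it is in GNU and BSD grep.

```shell
# Sample input (hypothetical).
sample=$(mktemp)
printf '%s\n' 'Ui.Line' 'Ui.Text' 'Ui.' > "$sample"

# Lines where Ui. is followed by a non-L character or by the end of the line.
lines=$(grep -E 'Ui\.([^L]|$)' "$sample")
# -o shows the matched text itself, including the extra character consumed.
matched=$(grep -oE 'Ui\.([^L]|$)' "$sample")

printf 'matching lines:\n%s\n' "$lines"
printf 'matched text:\n%s\n' "$matched"
rm -f "$sample"
```

The matched text for Ui.Text should come out as Ui.T, which is the "extra character" the answer warns about. A lookahead would match only Ui. there.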

If you want to exclude more than just one character, then you just need to throw more alternation and negation at it. To find a not followed by bc:

grep 'a\(\([^b]\|$\)\|\(b\([^c]\|$\)\)\)' *

That is either (a followed by a character other than b, or by the end of the line: a, then [^b] or $) or (a followed by b, which is in turn followed by a character other than c or by the end of the line: a, then b, then [^c] or $).
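
As a quick check of the "a not followed by bc" expression, here is a sketch using its extended-regex (-E) spelling on a handful of made-up lines.

```shell
# Sample input (hypothetical): only abc has its a followed by bc.
sample=$(mktemp)
printf '%s\n' 'abc' 'abd' 'ab' 'ax' 'a' 'xbc' > "$sample"

# a, then (non-b or end of line), or a, then b, then (non-c or end of line).
not_bc=$(grep -E 'a(([^b]|$)|(b([^c]|$)))' "$sample")
printf '%s\n' "$not_bc"
rm -f "$sample"
```

This should print abd, ab, ax and a. The abc line is rejected, and xbc contains no a at all.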

This kind of expression gets pretty unwieldy and error-prone even for a short string. You could write something to generate the expressions for you, but it'd probably be easier to just use a regex implementation that supports negative lookaheads.

*If your implementation supports non-capturing groups then you can avoid capturing extra characters.

#5


1  

I think this link can help you, first to understand how the regex works and second, how to build your regex: http://www.regular-expressions.info/tutorialcnt.html

#6


0  

If your grep doesn't support -P or --perl-regexp and you can install a PCRE-enabled grep such as pcregrep, then, unlike GNU grep, it needs no command-line option to accept Perl-compatible regular expressions. You just run

pcregrep 'Ui\.(?!Line)' *

You don't need another nested group for "Line" as in your example "Ui\.(?!(Line))"; the outer group is sufficient, as shown above.

Let me give you another example of a negative lookahead assertion. Suppose you have a list of lines returned by "ipset", each showing a packet count in the middle of the line, and you don't want the lines with zero packets. You just run:

ipset list | pcregrep "packets(?! 0 )"

If you like Perl-compatible regular expressions and have perl, but don't have pcregrep and your grep doesn't support --perl-regexp, you can use one-line Perl scripts that work the same way as grep:

perl -e 'while (<>) { print if /Ui\.(?!Line)/; }'

Perl reads stdin the same way grep does, e.g.

ipset list | perl -e "while (<>) {if (/packets(?! 0 )/){print;};}"
