
Time: 2021-10-14 20:47:00

I have two strings a and b here:


irb(main):022:0> a
=> "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:44,848 pid  10101 tid 139953357145856 INFO     env      Using"

irb(main):023:0> b
=> "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:45,712 pid  10101 tid 139953357145856 ERROR     env      Using"

I want to write a regex that can ignore a and match b.


In string a, ':error' is followed by 'INFO'.


In the second string b, ':error' is followed by 'ERROR'


I have tried this



But the regex will return match for both a and b


The use of match is a must because I am trying to pass the regex to a sensu script (https://github.com/sensu/sensu-community-plugins/blob/master/plugins/logging/check-log.rb#L189)


2 Answers



The preceding .* should be placed inside of the lookahead assertion ...



Rubular. Also, I would consider using word boundaries.
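The pattern from this answer did not survive in the text above, so the following is only a sketch of the idea it describes, not the answerer's exact regex. It assumes a hypothetical "misplaced" pattern with `.*` outside the lookahead, and a hypothetical fixed pattern with `.*` inside the lookahead plus `\b` word boundaries around INFO.

```ruby
a = "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:44,848 pid  10101 tid 139953357145856 INFO     env      Using"
b = "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:45,712 pid  10101 tid 139953357145856 ERROR     env      Using"

# Hypothetical misplaced version: `.*` consumes the rest of the line first,
# so the lookahead is tested at the very end, where "INFO" never follows.
misplaced = 'error.*(?!INFO)'

# Hypothetical fixed version: `.*` is inside the negative lookahead, so the
# assertion scans everything after "error". \b stops INFO from matching
# inside a longer word.
fixed = 'error(?!.*\bINFO\b)'

puts a.match(misplaced) ? "misplaced: a matches" : "misplaced: a is ignored"
puts a.match(fixed)     ? "fixed: a matches"     : "fixed: a is ignored"
puts b.match(fixed)     ? "fixed: b matches"     : "fixed: b is ignored"
```

Both patterns are passed as Strings because String#match converts a String argument to a Regexp, which is how a pattern handed to a script on the command line would typically be used.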




You can match 'error' twice instead.
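As a quick check against the two sample strings from the question, here is a minimal sketch that uses the pattern ".*error.*ERROR.*" quoted in the irb session of this answer. The variable names are only for illustration.

```ruby
a = "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:44,848 pid  10101 tid 139953357145856 INFO     env      Using"
b = "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:45,712 pid  10101 tid 139953357145856 ERROR     env      Using"

# Require a lowercase "error" followed later by an uppercase "ERROR".
# Ruby regexps are case-sensitive by default, so the two are distinct.
pattern = ".*error.*ERROR.*"

a_matched = !a.match(pattern).nil?
b_matched = !b.match(pattern).nil?

puts "a matched: #{a_matched}"
puts "b matched: #{b_matched}"
```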




As Cary Swoveland pointed out, this will also match INFO log entries that contain the string "ERROR", as shown below:


irb(main):035:0> "error INFO ERROR".match(".*error.*ERROR.*")
=> #<MatchData "error INFO ERROR">

irb(main):036:0> "error ERROR INFO".match(".*error.*ERROR.*") # <-- HERE
=> #<MatchData "error ERROR INFO">

irb(main):037:0> "error INFO Praesent quis nisl posuere.".match(".*error.*ERROR.*")
=> nil

Your initial regexp has a similar problem: it skips error entries that contain the string INFO, as shown below:


irb(main):048:0> "error INFO ERROR".match(".*error(?!.*INFO).*")
=> nil

irb(main):049:0> "error ERROR INFO".match(".*error(?!.*INFO).*")
=> nil

irb(main):050:0> "error INFO Praesent quis nisl posuere.".match(".*error(?!.*INFO).*")
=> nil

To avoid skipping valid entries or matching the wrong ones, I would rely on more parts of that string.


For that, taking your two initial samples, I would rely on the numeric field that follows the timestamp and immediately precedes the log level. Check it out:


irb(main):055:0> "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:44,848 pid  10101 tid 139953357145856 INFO     env      Using ERROR".match(".*error(?!.*[0-9] INFO).*")
=> nil

irb(main):056:0> "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:44,848 pid  10101 tid 139953357145856 INFO     env      Using ERROR".match(".*error(?!.*[0-9] INFO).*")
=> nil

irb(main):057:0> "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:45,712 pid  10101 tid 139953357145856 ERROR     env      Using INFO".match(".*error(?!.*[0-9] INFO).*")
=> #<MatchData "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:45,712 pid  10101 tid 139953357145856 ERROR     env      Using INFO">


So, my final version would be: ".*error(?!.*[0-9] INFO).*".
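To see how this final pattern behaves when applied line by line, here is a minimal sketch. It assumes the pattern arrives as a String and is tested with String#match on each line. The log_lines array is a hypothetical excerpt built from the two samples, and the filtering loop is an illustration, not the code of the sensu plugin.

```ruby
# Final pattern from the answer, kept as a String on the assumption that a
# check script would receive it as a command-line argument.
pattern = ".*error(?!.*[0-9] INFO).*"

# Hypothetical log excerpt: one INFO entry that mentions ERROR in its message,
# and one ERROR entry that mentions INFO in its message.
log_lines = [
  "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:44,848 pid  10101 tid 139953357145856 INFO     env      Using ERROR",
  "[:error] [pid 10101:tid 139953357145856] 2015-03-15 20:33:45,712 pid  10101 tid 139953357145856 ERROR     env      Using INFO",
]

# Keep only the lines the pattern matches. The lookahead rejects a line only
# when "INFO" directly follows a digit and a space, i.e. in the level column.
matched = log_lines.select { |line| line.match(pattern) }

puts "#{matched.size} matching line(s)"
matched.each { |line| puts line }
```

The pattern still depends on the level being preceded by a digit and exactly one space, so a change in the log layout would require adjusting it.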



