Matching a regex from right to left?

Time: 2022-06-05 03:30:26

Is there any way of matching a regex from right to left? What I'm looking for is a regex that gets

MODULE WAS INSERTED              EVENT
LOST SIGNAL ON E1/T1 LINK        OFF
CRC ERROR                        EVENT
CLK IS DIFF FROM MASTER CLK SRC  OFF

from this input

CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40
CLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58
CLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59
CLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32

If I could do the matching from right to left, I could write something like: everything to the left of (EVENT|OFF), up to the second run of more than one space [ ]+

The best I managed today is to get everything from (022) to EVENT with the regex

CLI MUX trap received: \([0-9]+\)[ ]+(.*[  ]+(EVENT|OFF))

But that is not really what I wanted :)

Edit: What language is it for? It's actually a config string for a filter we have, but my guess is that it uses the standard GNU C regex library.

Edit 2: I like the answers about cutting by length, but Amarghosh's answer was probably closer to what I was looking for. I don't really know why I didn't think of just cutting on length, like:

^.{56}(.{39}).*$

Super thanks for the quick answers...

6 solutions

#1


2  

If tokens are guaranteed to be separated by more than one space, and words within the string before EVENT|OFF are guaranteed to be separated by just one space, then (and only then) you can look for single-space-separated words followed by spaces followed by EVENT or OFF:

var s = "CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40"
        + "\nCLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58"
        + "\nCLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59"
        + "\nCLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32"

var r = /\([0-9]+\).+?((?:[^ ]+ )* +(?:EVENT|OFF))/g;
var m;
while((m = r.exec(s)) != null)
  console.log(m[1]);

Output:

MODULE WAS INSERTED              EVENT
LOST SIGNAL ON E1/T1 LINK        OFF
CRC ERROR                        EVENT
CLK IS DIFF FROM MASTER CLK SRC  OFF

Regex: /\([0-9]+\).+?((?:[^ ]+ )* +(?:EVENT|OFF))/g

\([0-9]+\)       #digits in parentheses followed by  
.+?              #some characters - minimum required (non-greedy)  
(                #start capturing 
(?:[^ ]+ )*      #non-space characters separated by a space  
` +`             #more spaces (separating string and event/off - 
                 #backticks added for emphasis), followed by
(?:EVENT|OFF)    #EVENT or OFF
)                #stop capturing

#2


16  

In .NET you could use the RightToLeft option:

Regex RE = new Regex(Pattern, RegexOptions.RightToLeft);
Match theMatch = RE.Match(Source);
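
JavaScript, which the other answers here use, has no counterpart to RegexOptions.RightToLeft as far as I know. A rough substitute, sketched below, is to anchor the pattern at the end of the line with `$`, so the trailing date and time have to line up first and the message is found relative to them. The function name `extractMessage`, and the assumption that every line ends with a status word followed by a date and a time, are mine and not from the question.

```javascript
// Sketch: emulate "match from the right" by anchoring at the end of the line.
// Assumes each line ends with "<status>  <date> <time>" and that columns are
// separated by at least two spaces, as in the sample input.
function extractMessage(line) {
  // (?:\S+ )*\S+   message words joined by single spaces
  //  {2,}          the column gap (two or more spaces)
  // (?:EVENT|OFF)  the status word
  //  +\S+ \S+$     date and time, which must reach the end of the line
  var re = /((?:\S+ )*\S+ {2,}(?:EVENT|OFF)) +\S+ \S+$/;
  var m = re.exec(line);
  return m ? m[1] : null;
}

console.log(extractMessage(
  'CLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59'
));
```

Without the `m` flag, `$` only matches at the very end of the string, so this version expects one line per call and no trailing whitespace.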

#3


3  

With regex, you could simply replace this:

^.{56}|.{19}$

with the empty string.
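
In JavaScript that replacement could look like the sketch below. The `g` flag matters here: without it only the first alternative (the 56-character prefix) would be removed. The helper `trimColumns` and the `padEnd`-built sample line are illustrative assumptions; the offsets 56 and 19 are the ones from this answer.

```javascript
// Sketch: strip the fixed-width prefix and the trailing timestamp.
// Assumes the message column starts at offset 56 and the line ends
// with a 19-character "dd-mm-yyyy hh:mm:ss" stamp.
function trimColumns(line) {
  // "g" is required so that both alternatives get replaced.
  return line.replace(/^.{56}|.{19}$/g, '');
}

// Build a sample line with explicit padding so the column offsets are exact.
var sampleLine =
  'CLI MUX trap received: (094) IO-2  ML-1E1  EX1'.padEnd(56) +
  'CRC ERROR'.padEnd(33) +
  'EVENT'.padEnd(8) +
  '04-06-2010 09:58:59';

// Brackets make the kept trailing padding visible.
console.log('[' + trimColumns(sampleLine) + ']');
```

The result keeps the padding after EVENT, just like the substring approach; trimming it afterwards is a separate step.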

But really, you only need to cut out the string from "position 56" to "string-length - 19" with a substring function. That's easier and much faster than regex.

Here's an example in JavaScript; other languages work more or less the same:

var lines = [
  'CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40',
  'CLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58',
  'CLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59',
  'CLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32'
];
for (var i=0; i<lines.length; i++) {
  alert( lines[i].substring(56, lines[i].length-19) );
}

#4


1  

Does the input file fit nicely into fixed width tabular text like this? Because if it does, then the simplest solution is to just take the right substring of each line, from column 56 to column 94.

In Unix, you can use the cut command:

cut -c56-94 yourfile

In Java, you can write something like this:

String[] lines = {
    "CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40",
    "CLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58",
    "CLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59",
    "CLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32",
};
for (String line : lines) {
    System.out.println(line.substring(56, 94));
}

This prints:

MODULE WAS INSERTED              EVENT
LOST SIGNAL ON E1/T1 LINK        OFF  
CRC ERROR                        EVENT
CLK IS DIFF FROM MASTER CLK SRC  OFF  

A regex solution

This is most likely not necessary, but something like this works (as seen on ideone.com):

line.replaceAll(".*  \\b(.+  .+)   \\S+ \\S+", "$1")

As you can see, it's not very readable, and you have to know your regex to really understand what's going on.

Essentially you match this to each line:

.*  \b(.+  .+)   \S+ \S+

And you replace it with whatever group 1 matched. This relies on the usage of two consecutive spaces exclusively for separating the columns in this table.

#5


0  

How about

.{56}(.*(EVENT|OFF))
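
A quick way to try this pattern in JavaScript is sketched below. It assumes, as in the fixed-width answers, that the message column starts at offset 56; the sample line is built with `padEnd` so that assumption holds, and `statusMessage` is a made-up helper name. Because `.*` is greedy, the capture runs up to the last EVENT or OFF on the line.

```javascript
// Sketch: skip the first 56 characters, then capture up to EVENT or OFF.
function statusMessage(line) {
  var m = /.{56}(.*(EVENT|OFF))/.exec(line);
  return m ? m[1] : null;
}

// Sample line with explicit padding so the message starts at offset 56.
var offLine =
  'CLI MUX trap received: (090) IO-2  ML-1E1  EX1'.padEnd(56) +
  'LOST SIGNAL ON E1/T1 LINK'.padEnd(33) +
  'OFF'.padEnd(8) +
  '04-06-2010 09:58:59';

console.log(statusMessage(offLine));
```

Unlike the plain substring approach, the capture stops at the status word, so no trailing padding is included.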

#6


0  

Can you do field-oriented processing, rather than a regex? In awk/sh, this would look like:

< $datafile awk '{ print $(NF-3), $(NF-2) }' | column

which seems rather cleaner than specifying a regex.
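
For comparison, here is a rough JavaScript sketch of the same field-oriented idea. Note that awk splits on any whitespace by default, so `$(NF-3)` is only the last word of the message (for example INSERTED); the sketch instead splits on runs of two or more spaces, so the whole message stays in one field. That column separator, and the function name `messageAndStatus`, are assumptions based on the sample input.

```javascript
// Sketch: split each line into columns on runs of two or more spaces,
// then pick the message and status columns counting from the right.
function messageAndStatus(line) {
  var fields = line.trim().split(/ {2,}/);
  if (fields.length < 3) return null;
  // Counting from the right: timestamp, status, message.
  return {
    message: fields[fields.length - 3],
    status: fields[fields.length - 2]
  };
}

var parsed = messageAndStatus(
  'CLI MUX trap received: (009)      CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32'
);
console.log(parsed.message + ' | ' + parsed.status);
// prints "CLK IS DIFF FROM MASTER CLK SRC | OFF"
```

The date and time are separated by a single space, so they end up together in the last field, and the message and status are always the two fields before it.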
