How to split a very long regular expression in Python

I have a regular expression which is very long:

 vpa_pattern = '(VAP) ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): (.*)'

My code to match the groups is as follows:

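import re

class ReExpr:
    def __init__(self):
        self.string = None
        self.rematch = None

    def search(self, regexp, string):
        # Keep the match object so group() can be called afterwards.
        self.string = string
        self.rematch = re.search(regexp, self.string)
        return bool(self.rematch)

    def group(self, i):
        return self.rematch.group(i)

m = ReExpr()

# `line` is the input line being parsed (defined elsewhere).
if m.search(vpa_pattern, line):
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))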

I tried to split the regular expression pattern across multiple lines in the following way:

vpa_pattern = '(VAP) \
    ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):\
    (.*)'

I even tried:

 vpa_pattern = re.compile(('(VAP) \
    ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):\
    (.*)'))

Neither approach works. The pattern has a space between the groups, after each closing parenthesis and before the next opening one, and I suspect those spaces are not handled correctly once I split the pattern across multiple lines.

3 Solutions

#1

Look at the re.X (re.VERBOSE) flag. It allows comments and ignores whitespace in the pattern, except whitespace inside a character class or preceded by a backslash.

a = re.compile(r"""\d +  # the integral part
               \.    # the decimal point
               \d *  # some fractional digits""", re.X)
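Applied to the pattern from the question, a sketch might look like the following. Because re.X drops unescaped whitespace, the literal spaces between the groups are written as a backslash followed by a space. The sample `line` is invented for illustration and is not taken from the question.

```python
import re

# Verbose version of the pattern from the question. Under re.X, unescaped
# whitespace is ignored, so each literal space is written as "\ ".
vpa_pattern = re.compile(r"""
    (VAP)\                        # group 1: the literal tag, then a space
    ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:
     [0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2})  # group 2: hex address
    :\                            # colon and space after the address
    (.*)                          # group 3: rest of the line
    """, re.X)

# Hypothetical input line, made up for this example.
line = "VAP 00:1A:2b:3C:4d:5E: station associated"

m = vpa_pattern.search(line)
if m:
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))
```

Writing the space as `[ ]` instead of `\ ` should work equally well and is harder to miss when reading the pattern.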

#2

Python concatenates adjacent string literals, so a string can be written in several parts if they are enclosed in parentheses:

>>> text = ("alfa" "beta"
... "gama")
>>> text
'alfabetagama'

or in your code:

text = ("alfa" "beta"
        "gama" "delta"
        "omega")
print(text)

will print

alfabetagamadeltaomega
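For the pattern in the question, this could look like the sketch below. It also shows, as far as I can tell, why the backslash continuation fails: the indentation of the continued line becomes part of the string. The sample `line` and the names `broken` and `original` are only for illustration.

```python
import re

# A backslash continuation inside a string literal keeps the indentation
# of the next line, so extra spaces end up in the pattern.
broken = '(VAP) \
    (.*)'
print(repr(broken))

# Adjacent string literals inside parentheses are joined by the compiler,
# so no newlines or indentation leak into the pattern.
vpa_pattern = (
    r'(VAP) '                                            # group 1 and a space
    r'([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:'
    r'[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): '   # group 2, then ": "
    r'(.*)'                                              # group 3
)

# The joined string equals the original single-line pattern.
original = '(VAP) ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): (.*)'
print(vpa_pattern == original)

# Hypothetical input line, made up for this example.
line = "VAP 00:1A:2b:3C:4d:5E: station associated"
m = re.search(vpa_pattern, line)
if m:
    print(m.group(2))
```

Each fragment can carry its own comment, which gives some of the readability of re.X without changing how whitespace is treated.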

#3

It's actually quite simple. You already use the {} notation, so use it again. Instead of:

'([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):'

which is just [0-9A-Fa-f]{2}: repeated 6 times, you can use:

'([0-9A-Fa-f]{2}:){6}'

We can simplify it even further by using \d to represent digits:

'([\dA-Fa-f]{2}:){6}'

NOTE: Depending on which re function you use, you can pass in re.IGNORECASE (or re.I) and simplify that chunk down to [\da-f]{2}:

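One caveat: a repeated capturing group only keeps its last repetition, so to still get the whole address in group 2, put the repetition in a non-capturing group (?:...) inside the capturing one. So your final regex is:

r'(VAP) ((?:[\dA-Fa-f]{2}:){5}[\dA-Fa-f]{2}): (.*)'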
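A quick way to check what each group captures when the {} repetition is used is sketched below. The sample `line` and the names `short` and `full` are made up for this example. Both patterns use re.IGNORECASE as suggested in the note above.

```python
import re

# Hypothetical input line, made up for this example.
line = "VAP 00:1A:2b:3C:4d:5E: station associated"

# Repeating a capturing group: group 2 keeps only the final repetition.
short = re.compile(r'(VAP) ([\da-f]{2}:){6} (.*)', re.IGNORECASE)
m1 = short.search(line)
print(m1.group(2))   # 5E:

# Repeating a non-capturing group inside one capturing group keeps the
# whole address in group 2, as in the original long pattern.
full = re.compile(r'(VAP) ((?:[\da-f]{2}:){5}[\da-f]{2}): (.*)', re.IGNORECASE)
m2 = full.search(line)
print(m2.group(2))   # 00:1A:2b:3C:4d:5E
print(m2.group(3))   # station associated
```

Both patterns match the same lines. They differ only in what group 2 returns, which matters if the code reads the address from m.group(2).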
