Regular expression syntax for "match nothing"?

Time: 2022-08-03 20:24:39

I have a Python template engine that makes heavy use of regular expressions. It builds patterns by concatenation, like this:

re.compile( regexp1 + "|" + regexp2 + "*|" + regexp3 + "+" )

I can modify the individual substrings (regexp1, regexp2, etc.).

Is there a small, lightweight expression that matches nothing, which I can use in a template where I don't want any matches? Unfortunately, a '+' or '*' is sometimes appended to the regexp atom, so I can't use an empty string: that raises a "nothing to repeat" error.
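
A minimal sketch of the problem, assuming the concatenation shown above is wrapped in a helper (the name `build` is hypothetical, not from the original engine). With an empty string for regexp2, the appended '*' has nothing to bind to and compilation fails:

```python
import re

def build(regexp1, regexp2, regexp3):
    # Same concatenation as in the question; `build` is an illustrative name.
    return re.compile(regexp1 + "|" + regexp2 + "*|" + regexp3 + "+")

try:
    # regexp2 is empty, so the pattern becomes 'a|*|c+' and '*' has no atom.
    build("a", "", "c")
except re.error as exc:
    print("compile failed:", exc)

# With a real atom in every slot the same helper compiles fine.
print(build("a", "b", "c").fullmatch("bbb") is not None)
```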

7 solutions

#1


79  

This shouldn't match anything:

re.compile('$^')

So if you replace regexp1, regexp2 and regexp3 with '$^', it will be impossible to find a match in a non-empty string, unless you are using multi-line mode. Note that an empty subject string still matches, since its start and end coincide.


After some tests I found a better solution:

re.compile('a^')

It is impossible to match, and it will fail earlier than the previous solution. You can replace a with any other character (other than a newline, if multi-line mode is on) and it will still be impossible to match.
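
A quick way to compare the two candidates is to run them over a handful of edge-case strings. This is only a sketch: the sample list and the helper name `ever_matches` are made up for illustration, and it tests the bare atoms without any appended quantifier.

```python
import re

# Sample subject strings, including the edge cases that trip up '$^'.
SAMPLES = ["", "a", "abc", "a\nb", "a\n", "\n"]

def ever_matches(pattern, flags=0):
    """Return the sample strings in which `pattern` finds a match."""
    compiled = re.compile(pattern, flags)
    return [s for s in SAMPLES if compiled.search(s) is not None]

# '$^' can still match where an end position and a start position coincide,
# for example in the empty string.
print(ever_matches("$^"))
print(ever_matches("$^", re.MULTILINE))

# 'a^' needs the character 'a' immediately followed by a start-of-string
# (or start-of-line) position, which cannot happen since 'a' is not a newline.
print(ever_matches("a^"))
print(ever_matches("a^", re.MULTILINE))
```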

#2


24  

(?!) should always fail to match. It is a zero-width negative look-ahead: if what is in the parentheses matches, the whole match fails. Since it has nothing in it, and an empty pattern matches at every position, it will fail the match for any input (including the empty string).
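
As a rough illustration (the constant name `NEVER` and the sample strings are assumptions for this sketch), the empty look-ahead can be checked on its own and as a dead branch inside an alternation:

```python
import re

NEVER = "(?!)"  # empty negative look-ahead: fails at every position

never = re.compile(NEVER, re.MULTILINE | re.DOTALL)
for subject in ["", "a", "line1\nline2"]:
    # search() tries every position, so None means no position matched.
    assert never.search(subject) is None

# Dropped into an alternation, the branch simply never fires, so only
# the other alternative contributes matches.
combined = re.compile(NEVER + "|" + "b+")
print(combined.findall("abba"))
```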

#3


15  

To match only an empty string, even in multi-line mode, you can use \A\Z, so:

re.compile(r'\A\Z|\A\Z*|\A\Z+')

The difference is that \A and \Z match only at the start and end of the string, whilst ^ and $ can also match at the start and end of lines, so $^|$^*|$^+ could potentially match a string containing newlines (if the multi-line flag is enabled).

And to fail to match anything (even an empty string), simply attempt to find content before the start of the string, e.g.:

re.compile(r'.\A|.\A*|.\A+')

Since no characters can come before \A (by definition), this will always fail to match.
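
The difference between the two atoms can be sketched as follows. The helper name `matching_subjects` and the subject list are illustrative assumptions. Only the bare atoms are tested, because whether a quantifier may directly follow an anchor such as \Z or \A varies between regex engines and versions.

```python
import re

SUBJECTS = ["", "a", "a\nb", "\n"]

def matching_subjects(pattern, flags=0):
    """Return the subjects in which `pattern` finds a match."""
    compiled = re.compile(pattern, flags)
    return [s for s in SUBJECTS if compiled.search(s) is not None]

# \A\Z requires start and end of the string at the same position,
# so only the empty string qualifies, even in multi-line mode.
print(matching_subjects(r"\A\Z", re.MULTILINE))

# .\A requires a character before the start of the string,
# so nothing qualifies, even when '.' is allowed to match newlines.
print(matching_subjects(r".\A", re.MULTILINE | re.DOTALL))
```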

#4


3  

"()"

matches the empty string and only the empty string. (An empty match succeeds at every position, so this is not a pattern that fails to match.)

#5


2  

Maybe '.{0}'?
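
For comparison with the patterns that never match, here is a small check of what '()' and '.{0}' do in Python. It is a sketch only, and the helper name `first_span` is hypothetical. Both succeed with a zero-length match rather than failing:

```python
import re

def first_span(pattern, subject):
    """Return the span of the first match, or None if there is no match."""
    match = re.search(pattern, subject)
    return None if match is None else match.span()

for pattern in ["()", ".{0}"]:
    # Each pattern matches the empty string at the very first position.
    print(pattern, first_span(pattern, "abc"), first_span(pattern, ""))
```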

#6


1  

You could use
\z..
This is the absolute end of the string, followed by any two characters. (In Python's re module the absolute end-of-string anchor is written \Z.)

If + or * is tacked on the end, this still works, refusing to match anything.
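
A sketch of that claim in Python, assuming the \Z spelling of the end-of-string anchor; the names `ATOM`, `SAMPLES` and `never_matches` are illustrative. The appended quantifier binds only to the last '.', so at least one character is still required after the end of the string:

```python
import re

ATOM = r"\Z.."  # end of string followed by two characters
SAMPLES = ["", "a", "ab", "abc", "a\nb\n"]

def never_matches(atom, suffix):
    """True if atom + suffix compiles and matches none of the samples."""
    pattern = re.compile(atom + suffix, re.DOTALL)
    return all(pattern.search(s) is None for s in SAMPLES)

for suffix in ["", "*", "+"]:
    # '\Z..*' still needs one character after the end, '\Z..+' needs two.
    print(repr(ATOM + suffix), never_matches(ATOM, suffix))
```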

#7


0  

Or, use a list comprehension to remove the unused regexp entries and join to put the rest together. Something like:

re.compile('|'.join([x for x in [regexp1, regexp2, ...] if x is not None]))

Be sure to add some comments next to that line of code though :-)
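
A runnable sketch of this idea, with assumptions stated up front: the function name `build_pattern` is hypothetical, unused slots are marked with None, the '*' and '+' suffixes from the question are left out, and the fallback to '(?!)' when every slot is empty is an addition borrowed from answer #2, since joining an empty list would give a pattern that matches the empty string everywhere.

```python
import re

def build_pattern(parts):
    """Join the parts that are not None into a single alternation."""
    active = [p for p in parts if p is not None]
    if not active:
        # No active parts: return a pattern that can never match.
        return re.compile("(?!)")
    return re.compile("|".join(active))

# The None entry is skipped instead of being replaced by a dummy atom.
pattern = build_pattern([r"\d+", None, r"[a-c]+"])
print(pattern.findall("12 abc xyz"))

# With nothing active, no subject string produces a match.
print(build_pattern([None, None]).search("anything"))
```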
