Python re.search with an optional group

I am trying to find a substring with this pattern: .*(_\d+)?

Example:

abc_4
abc_345
abc

Just one regular string, optionally followed by "_" and at least one digit.

But when I use:

re.search(r"(.*)(_\d+)?", str).group(1)

it always returns the entire string.

3 solutions

#1

The problem is that * is greedy: it tries to match the longest possible string, as long as the rest of the regexp can still match. Since everything after .* is optional, .* can gobble up the _ and the digits too, because the rest of the regexp is then free to match an empty string.

Change .* to [^_]* so that it can't match the underscore before the number.

([^_]*)(_\d+)?
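
To see the difference, here is a minimal sketch that runs the question's pattern and the negated-class pattern over the sample strings. The names GREEDY, FIXED and split_name are hypothetical, chosen only for this illustration.

```python
import re

GREEDY = re.compile(r"(.*)(_\d+)?")     # pattern from the question
FIXED = re.compile(r"([^_]*)(_\d+)?")   # negated class: the prefix cannot contain "_"


def split_name(pattern, text):
    """Return (prefix, suffix); suffix is None when the optional group did not match."""
    # Both patterns can match the empty string, so search() never returns None here.
    m = pattern.search(text)
    return m.group(1), m.group(2)


for sample in ("abc_4", "abc_345", "abc"):
    print(sample, split_name(GREEDY, sample), split_name(FIXED, sample))
```

One caveat: because the prefix may not contain any underscore, an input such as a_b_4 is cut at the first _, so this only fits names whose sole underscore is the one before the number.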

#2

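Instead of (.*) use [^_]*? to stop at the first _ character. Note that the lazy ? only helps when the pattern is anchored with $; with an unanchored re.search, the lazy form can match an empty string, and plain [^_]* is enough.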

#3

You have made the _nnn part optional (?), so the greedy .* always matches the whole string. Make it non-greedy, and anchor the pattern at the end with $ so that the lazy .*? cannot settle for an empty match:

(.*?)(_\d+)?$
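
A quick check of the lazy version. One caveat worth testing: when nothing follows it but an optional group, a lazy .*? is satisfied by the empty string, so the sketch below compares the pattern with and without a trailing $ anchor. The names UNANCHORED, ANCHORED and prefix are hypothetical.

```python
import re

UNANCHORED = re.compile(r"(.*?)(_\d+)?")    # lazy, nothing forces it to consume input
ANCHORED = re.compile(r"(.*?)(_\d+)?$")     # "$" forces the match to reach the end


def prefix(pattern, text):
    """Return group 1, the part before the optional _digits suffix."""
    return pattern.search(text).group(1)


for sample in ("abc_4", "abc_345", "abc", "a_b_4"):
    print(sample, repr(prefix(UNANCHORED, sample)), repr(prefix(ANCHORED, sample)))
```

Unlike the negated-class approach, the anchored lazy pattern also copes with underscores inside the prefix, since only a trailing _digits run is split off.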
