Learning Python: re, part 5: *, +, ?, *?, +?, ??

*

Causes the resulting RE to match 0 or more repetitions of the preceding RE, as many repetitions as are possible. ab* will match ‘a’, ‘ab’, or ‘a’ followed by any number of ‘b’s.

The repetition count allowed by * is any natural number: 0, 1, 2, and so on.

+

Causes the resulting RE to match 1 or more repetitions of the preceding RE. ab+ will match ‘a’ followed by any non-zero number of ‘b’s; it will not match just ‘a’.

The repetition count allowed by + is any positive integer: 1, 2, 3, and so on. Zero is not allowed.

?

Causes the resulting RE to match 0 or 1 repetitions of the preceding RE. ab? will match either ‘a’ or ‘ab’.

The repetition count allowed by ? is the set {0, 1}.
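As a quick check of the three quantifiers, here is a minimal sketch that tests a few candidate strings against ab*, ab+ and ab?. The helper name matches and the candidate list are made up for illustration; re.fullmatch is used so that a string counts only when the whole string fits the pattern.

```python
import re

def matches(pattern, candidates):
    """Return the candidates that the pattern matches in full (hypothetical helper)."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

candidates = ["a", "ab", "abb", "abbb"]

star = matches(r"ab*", candidates)  # 'a' followed by zero or more 'b'
plus = matches(r"ab+", candidates)  # 'a' followed by one or more 'b'
opt = matches(r"ab?", candidates)   # 'a' followed by zero or one 'b'

print(star)  # ['a', 'ab', 'abb', 'abbb']
print(plus)  # ['ab', 'abb', 'abbb']
print(opt)   # ['a', 'ab']
```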

*?, +?, ??

The '*', '+', and '?' qualifiers are all greedy; they match as much text as possible. Sometimes this behaviour isn’t desired; if the RE <.*> is matched against '<a> b <c>', it will match the entire string, and not just '<a>'. Adding ? after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using the RE <.*?> will match only '<a>'.

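In other words, the qualifiers described above are all greedy: they consume as much text as they can. That is not always what you want. For example, matching the RE <.*> against '<a> b <c>' matches the whole string rather than just '<a>'. Adding ? after a qualifier switches it to non-greedy (minimal) matching, which consumes as few characters as possible while still letting the pattern succeed. With <.*?>, only '<a>' is matched.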
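The '<a> b <c>' case can be tried directly. This is a small sketch; the variable names are only for illustration. The re.findall call is an extra, showing that the non-greedy form picks up each tag separately.

```python
import re

text = "<a> b <c>"

# Greedy: .* runs to the end, then backtracks only as far as the last '>'.
greedy = re.match(r"<.*>", text).group()

# Non-greedy: .*? stops at the first '>' that lets the pattern succeed.
lazy = re.match(r"<.*?>", text).group()

# With findall, the non-greedy pattern returns each tag on its own.
all_tags = re.findall(r"<.*?>", text)

print(greedy)    # <a> b <c>
print(lazy)      # <a>
print(all_tags)  # ['<a>', '<c>']
```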

Example 1

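import re

# Two-line test string; note that it contains no 'a' at all.
string1 = """hello this is the end,
no ,this is no the end.
"""
# re.match only tries to match at the very start of the string.
print("stringr",re.match("a*.*",string1))  # a* may match zero times
print("stringr",re.match("a+.+",string1))  # a+ needs at least one 'a'
print("stringr",re.match("a?.+",string1))  # a? may match zero times
# With re.DOTALL, '.' also matches the newline character.
print("stringr",re.match(".*end",string1,re.DOTALL))   # greedy
print("stringr",re.match(".*?end",string1,re.DOTALL))  # non-greedy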

Output

stringr <re.Match object; span=(0, 22), match='hello this is the end,'>
stringr None
stringr <re.Match object; span=(0, 22), match='hello this is the end,'>
stringr <re.Match object; span=(0, 45), match='hello this is the end,\nno ,this is no the end'>
stringr <re.Match object; span=(0, 21), match='hello this is the end'>

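Comparing each pattern with its output shows the differences.

In the first statement, a* allows 'a' to occur zero times, so the match succeeds; .* then consumes the rest of the first line (without re.DOTALL, '.' does not match a newline).

In the second statement, a+ requires at least one 'a'. re.match only matches at the start of the string, and the string starts with 'h', so the result is None.

In the third statement, a? allows 'a' to occur zero times or once, so the match succeeds.

The fourth statement does not use the ? modifier, so .* is greedy and the match extends to the last 'end' in the string.

The fifth statement adds the ? modifier, so .*? is non-greedy and the match stops at the first 'end'.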
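The fourth and fifth statements rely on re.DOTALL to let '.' cross the newline. As a side check, the sketch below runs the greedy pattern with and without that flag on the same string. The variable names are illustrative.

```python
import re

string1 = "hello this is the end,\nno ,this is no the end.\n"

# Without re.DOTALL, '.' stops at the newline, so even the greedy
# pattern cannot reach the 'end' on the second line.
without_dotall = re.match(r".*end", string1)

# With re.DOTALL, '.' also matches '\n', so greedy .* reaches the last 'end'.
with_dotall = re.match(r".*end", string1, re.DOTALL)

print(without_dotall.span())  # (0, 21)
print(with_dotall.span())     # (0, 45)
```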

The same greedy and non-greedy behaviour also applies to the following qualifiers.

{m}

Specifies that exactly m copies of the previous RE should be matched; fewer matches cause the entire RE not to match. For example, a{6} will match exactly six 'a' characters, but not five.
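A short sketch of the a{6} example from the description above. The third call is an extra: it shows that when the input has more than six 'a' characters, re.match still succeeds but consumes only six of them.

```python
import re

six = re.fullmatch(r"a{6}", "a" * 6)   # exactly six 'a': matches
five = re.fullmatch(r"a{6}", "a" * 5)  # only five 'a': no match
prefix = re.match(r"a{6}", "a" * 7)    # matches the first six of seven

print(six is not None)  # True
print(five)             # None
print(prefix.span())    # (0, 6)
```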

{m,n}

Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as many repetitions as possible. For example, a{3,5} will match from 3 to 5 'a' characters. Omitting m specifies a lower bound of zero, and omitting n specifies an infinite upper bound. As an example, a{4,}b will match 'aaaab' or a thousand 'a' characters followed by a 'b', but not 'aaab'. The comma may not be omitted or the modifier would be confused with the previously described form.

{m,n}?

Causes the resulting RE to match from m to n repetitions of the preceding RE, attempting to match as few repetitions as possible. This is the non-greedy version of the previous qualifier. For example, on the 6-character string 'aaaaaa', a{3,5} will match 5 'a' characters, while a{3,5}? will only match 3 characters.

Here too, the trailing ? marks the match as non-greedy.
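The examples from the two descriptions above can be checked with a few lines. This is a minimal sketch; the variable names are illustrative.

```python
import re

text = "aaaaaa"

greedy = re.match(r"a{3,5}", text).group()  # takes as many as allowed
lazy = re.match(r"a{3,5}?", text).group()   # takes as few as allowed

# a{4,}b: at least four 'a' characters, no upper bound, then a 'b'.
four = re.fullmatch(r"a{4,}b", "aaaab") is not None
thousand = re.fullmatch(r"a{4,}b", "a" * 1000 + "b") is not None
three = re.fullmatch(r"a{4,}b", "aaab") is not None

print(greedy)    # aaaaa
print(lazy)      # aaa
print(four)      # True
print(thousand)  # True
print(three)     # False
```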
