Learning Python re, Part 8: the | (OR) operator

The official documentation describes | as follows:

A|B, where A and B can be arbitrary REs, creates a regular expression that will match either A or B. An arbitrary number of REs can be separated by the '|' in this way. This can be used inside groups (see below) as well. As the target string is scanned, REs separated by '|' are tried from left to right. When one pattern completely matches, that branch is accepted. This means that once A matches, B will not be tested further, even if it would produce a longer overall match. In other words, the '|' operator is never greedy. To match a literal '|', use \|, or enclose it inside a character class, as in [|].

In short: an expression of the form A|B matches with either A or B, and any number of alternatives can be chained, as in A|B|C. The alternation can also be placed inside a group, as in (A|B|C), and the text matched by that group can be retrieved later with group(). While the target string is scanned, the alternatives are tried from left to right and the first one that matches is accepted. Once A matches, B is not tried, even if B would match a longer substring, so the '|' operator itself is never greedy. To match a literal '|' character, escape it with a backslash (\|) or put it in a character class ([|]).
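The left-to-right rule is easiest to see with two alternatives where one is a prefix of the other. The following is a minimal sketch; the patterns and sample strings are made up for illustration and are not from the original post.

```python
import re

# Alternatives are tried left to right; the first one that matches wins,
# even though the second alternative would match more of the string.
first = re.match(r"ab|abcd", "abcd")
print(first.group(0))   # ab

# Putting the longer alternative first changes the result.
second = re.match(r"abcd|ab", "abcd")
print(second.group(0))  # abcd

# Two ways to match a literal '|': escape it, or use a character class.
parts = re.split(r"\|", "a|b|c")
pipes = re.findall(r"[|]", "a|b|c")
print(parts)  # ['a', 'b', 'c']
print(pipes)  # ['|', '|']
```

When one alternative is a prefix of another, listing the longer one first is usually the safer ordering.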

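import re

string1 = "<div>aaa</div><div>bbb</div>"
print(len(string1), "string1")  # length of the target string
# Each group accepts either a run of digits or a run of lowercase letters.
# The raw string (r"...") keeps \d from being read as a string escape.
rs = re.match(r"<div>(\d+|[a-z]+)</div><div>(\d+|[a-z]+)</div>", string1)
print(rs.group(0))  # the whole match
print(rs.group(1))  # first parenthesized group
print(rs.group(2))  # second parenthesized group

Output:

28 string1
<div>aaa</div><div>bbb</div>
aaa
bbb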

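group(0) is the entire matched text (not its length); the capture groups themselves are numbered starting from 1 (this will be covered in the next post).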
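As a quick check of the group numbering, the sketch below uses a shortened, single-div version of the pattern (an illustrative variant, not the exact pattern from the example) and compares group(0), group(1), and the length of the match.

```python
import re

m = re.match(r"<div>(\d+|[a-z]+)</div>", "<div>aaa</div>")

print(m.group(0))               # <div>aaa</div>  (the whole match)
print(m.group() == m.group(0))  # True: group() defaults to group 0
print(m.group(1))               # aaa  (first capture group)
print(m.groups())               # ('aaa',)  (all capture groups, from 1)
print(len(m.group(0)))          # 14  (the length must be computed separately)
print(m.span())                 # (0, 14)
```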

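By default, | separates everything to its left from everything to its right in the pattern. To limit its scope, wrap the alternatives in parentheses.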
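The difference in scope can be sketched with a pair of patterns. The names `loose` and `tight` and the sample text are illustrative choices, not part of the original post.

```python
import re

# Without parentheses, '|' splits the whole pattern: "gray" OR "grey cat".
loose = re.compile(r"gray|grey cat")
# With parentheses, only the grouped part alternates:
# ("gray" or "grey") followed by " cat".
tight = re.compile(r"(gray|grey) cat")

print(loose.fullmatch("gray cat"))           # None
print(loose.match("gray cat").group(0))      # gray
print(tight.fullmatch("gray cat").group(0))  # gray cat
print(tight.fullmatch("gray cat").group(1))  # gray
```

If the parentheses are needed only for scoping and not for capturing, a non-capturing group such as (?:gray|grey) limits the alternation in the same way without adding a numbered group.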

