Using Regular Expressions in Python (Part 2)

This part covers the compile() function and the group() method.

1. re.compile()

The compile function compiles a regular expression and returns a regular expression (Pattern) object. The compiled object can then be used to match strings.

Syntax:
>>> help(re.compile)

Help on function compile in module re:

compile(pattern, flags=0)

    Compile a regular expression pattern, returning a pattern object.
>>>

pattern: a regular expression given as a string
flags: optional; selects the matching mode, for example ignoring case or multi-line mode
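To make the flags parameter more concrete, here is a minimal sketch using two standard flags from the re module, re.IGNORECASE and re.MULTILINE. The variable names and sample strings are made up for illustration.

```python
import re

# re.IGNORECASE: letters match regardless of case.
ignore_case = re.compile(r'python', re.IGNORECASE)
case_hits = ignore_case.findall('Python, PYTHON and python')
print(case_hits)

# re.MULTILINE: ^ and $ match at the start and end of every line,
# not only at the start and end of the whole string.
first_words = re.compile(r'^\w+', re.MULTILINE)
line_hits = first_words.findall('one two\nthree four')
print(line_hits)

# Several flags can be combined with the | operator.
both = re.compile(r'^a\w*', re.IGNORECASE | re.MULTILINE)
both_hits = both.findall('Apple pie\nbanana\nalone')
print(both_hits)
```

Without re.MULTILINE, the second pattern would only find the first word of the whole string.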

Example:

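>>> test_pattern = re.compile(r'\d{2}')   # compile a regular expression and assign it to a variable
>>> m = test_pattern.match('12bc34')   # match a string directly with the compiled pattern object
>>> m
<_sre.SRE_Match object; span=(0, 2), match='12'>

>>> test_pattern = re.compile(r'a\w+')  # create a pattern object (an "a" followed by one or more word characters; here it finds the words that start with a)
>>> m = test_pattern.findall('apple,blue,alone,shot,attack') # use findall() to collect every substring that satisfies the pattern
>>> m
['apple', 'alone', 'attack']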
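A compiled pattern can be reused on as many strings as needed. The sketch below, with made-up sample strings, applies one compiled pattern to several inputs and shows how match() (which only succeeds at the beginning of the string) differs from search() (which scans the whole string).

```python
import re

two_digits = re.compile(r'\d{2}')   # compiled once, reused below

samples = ['12bc34', 'ab12', 'x9']  # illustrative inputs

results = []
for text in samples:
    at_start = two_digits.match(text) is not None    # must match at position 0
    anywhere = two_digits.search(text) is not None   # may match at any position
    results.append((text, at_start, anywhere))
    print(text, at_start, anywhere)

# The module-level function produces the same match as the compiled object.
same = re.match(r'\d{2}', '12bc34').group() == two_digits.match('12bc34').group()
print(same)
```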

2. group() and groups()


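After matching with match() or search(), call the group() method on the returned match object to get the matched text. group() can also extract the substrings captured by groups (in a regular expression, parentheses () define a group).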

Example:

>>> pattern = re.compile(r'^(\d{3})-(\d{3,8})$')  # match a string that starts with 3 digits, then a -, then 3 to 8 digits
>>> m = pattern.match('020-1234567')
>>> m
<_sre.SRE_Match object; span=(0, 11), match='020-1234567'>
>>> m.group()   # show the whole matched string
'020-1234567'
>>> m.group(0)  # also shows the whole matched string
'020-1234567'
>>> m.group(1)   # extract the substring captured by group 1
'020'
>>> m.group(2)   # extract the substring captured by group 2
'1234567'
>>> m.group(3)   # there is no group 3, so this raises an error: no such group
Traceback (most recent call last):
  File "<pyshell#73>", line 1, in <module>
    m.group(3)
IndexError: no such group
>>> m.groups()   # return all captured groups as a tuple
('020', '1234567')
>>>
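One detail worth keeping in mind: match() and search() return None when nothing matches, so calling group() on the result directly would raise AttributeError. The following is a minimal sketch that wraps the pattern above in a small helper; the names PHONE and split_phone are hypothetical and chosen only for this illustration.

```python
import re

# Same pattern as in the example: 3 digits, a hyphen, then 3 to 8 digits.
PHONE = re.compile(r'^(\d{3})-(\d{3,8})$')

def split_phone(text):
    """Return (area_code, number), or None if text does not match."""
    m = PHONE.match(text)
    if m is None:              # no match: there is no match object to query
        return None
    return m.group(1, 2)       # several group numbers at once give a tuple

ok = split_phone('020-1234567')
bad = split_phone('020 1234567')
print(ok)
print(bad)
```

Passing several group numbers to group() returns the same tuple that groups() gives for this pattern.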

2018-06-07 22:43:46

秒客网
