Using the Python re module (Part 1)

The regular expression syntax is summarized in the table below:

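Syntax	Meaning	Notes
"."	Any single character (except a newline by default)
"^"	Start of the string	'^hello' matches 'helloworld' but not 'aaaahellobbb'
"$"	End of the string	Same idea: 'world$' matches 'helloworld' but not 'worldhello'
"*"	Zero or more repetitions of the preceding element (greedy)	<.*> matches all of <title>chinaunix</title>
"+"	One or more repetitions of the preceding element (greedy)	Same idea as above
"?"	Zero or one repetition of the preceding element (greedy)	Same idea as above
*?,+?,??	Non-greedy versions of the three above: match as little as possible	<.*?> matches only <title>
{m,n}	Repeat the preceding element m to n times; {m} is also allowed	a{6} matches six a's; a{2,4} matches two to four a's
{m,n}?	Repeat the preceding element m to n times, taking as few as possible	In 'aaaaaa', a{2,4}? matches only two a's
"\\"	Escapes a special character or introduces a special sequence
[]	A character set	[0-9], [a-z], [A-Z], [^0]
"|"	Alternation (or)	A|B matches either A or B
(...)	Matches the expression inside the parentheses and captures it as a group
(?#...)	A comment; ignored
(?=...)	Matches if ... matches next, but doesn't consume the string.	'hello(?=test)' matches hello in hellotest
(?!...)	Matches if ... doesn't match next.	'hello(?!test)' matches hello only if it is not followed by test
(?<=...)	Matches if preceded by ... (must be fixed length).	'(?<=hello)test' matches test in hellotest
(?<!...)	Matches if not preceded by ... (must be fixed length).	'(?<!hello)test' does not match test in hellotest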
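
The greedy, non-greedy, and lookaround rows of the table are easiest to check by running them. The following is a minimal sketch using the table's own sample strings; the variable names are only illustrative, and the code is written for Python 3 (the transcripts in this article use Python 2).

```python
import re

html = "<title>chinaunix</title>"

# Greedy: .* takes as much as it can, so the match runs to the last '>'.
greedy = re.search(r"<.*>", html).group()       # '<title>chinaunix</title>'

# Non-greedy: .*? stops at the first '>' that lets the pattern succeed.
lazy = re.search(r"<.*?>", html).group()        # '<title>'

# {m,n} prefers the most repetitions, {m,n}? the fewest.
most = re.search(r"a{2,4}", "aaaaaa").group()     # 'aaaa'
fewest = re.search(r"a{2,4}?", "aaaaaa").group()  # 'aa'

# Lookahead: "hello" matches only when "test" follows; "test" is not consumed.
ahead = re.search(r"hello(?=test)", "hellotest").group()   # 'hello'

# Negative lookahead: fails here because "test" does follow "hello".
not_ahead = re.search(r"hello(?!test)", "hellotest")       # None

# Lookbehind: "test" matches only when "hello" comes right before it.
behind = re.search(r"(?<=hello)test", "hellotest").group() # 'test'

# Negative lookbehind: fails here because "hello" does precede "test".
not_behind = re.search(r"(?<!hello)test", "hellotest")     # None

print(greedy, lazy, most, fewest, ahead, not_ahead, behind, not_behind)
```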

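Special sequence	Meaning
\A	Matches only at the start of the string
\Z	Matches only at the end of the string
\b	Matches the empty string at the start or end of a word (a word boundary)
\B	Matches the empty string when it is not at the start or end of a word
\d	Equivalent to [0-9]
\D	Equivalent to [^0-9]
\s	Matches any whitespace character: [ \t\n\r\f\v]
\S	Matches any non-whitespace character: [^ \t\n\r\f\v]
\w	Matches any letter, digit, or underscore: [a-zA-Z0-9_]
\W	Matches any character that is not a letter, digit, or underscore: [^a-zA-Z0-9_]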
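
As a rough illustration of the special sequences above, the sketch below applies a few of them to short ASCII strings. The sample strings and variable names are made up for this example, and the code assumes Python 3.

```python
import re

# \d+ collects runs of digits.
digits = re.findall(r"\d+", "order 66 shipped to cat_7")   # ['66', '7']

# \w+ collects runs of letters, digits, and underscores.
words = re.findall(r"\w+", "cat_7 is-ok")       # ['cat_7', 'is', 'ok']

# \b requires a word boundary on both sides, so only the standalone
# "cat" words match; "concat" and "cat_7" are skipped.
whole_cat = re.findall(r"\bcat\b", "cat concat cat_7 cat")  # ['cat', 'cat']

# \B requires that there is NO boundary, so only the "cat" inside "concat" matches.
inner_cat = re.findall(r"\Bcat", "cat concat")  # ['cat']

# \s+ splits on any run of whitespace (space, tab, newline, ...).
fields = re.split(r"\s+", "a b\tc\nd")          # ['a', 'b', 'c', 'd']

# \A and \Z anchor to the very start and very end of the string.
starts = re.search(r"\Ahello", "hello world") is not None   # True
ends = re.search(r"hello\Z", "hello world") is not None     # False

print(digits, words, whole_cat, inner_cat, fields, starts, ends)
```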

re.match()
Attempts a match only at the start of the string.
Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import re
>>> re.match("a","abcdefg")
<_sre.SRE_Match object at 0x0000000002D74988> # a match object means the match succeeded
>>> print re.match("a","cabacdefg")
None # no match at the start of the string, so None is returned
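
Since re.match() returns either a match object or None, code normally tests the result before using it. The sketch below shows one way to do that with the same two strings; the variable names are illustrative, and the code is written for Python 3 rather than the Python 2 session above.

```python
import re

hit = re.match("a", "abcdefg")
if hit is not None:
    matched_text = hit.group()   # the text that matched: 'a'
    position = hit.span()        # (start, end) indices: (0, 1)
else:
    matched_text = None
    position = None

# "cabacdefg" contains "a", but not at index 0, so match() returns None.
miss = re.match("a", "cabacdefg")

print(matched_text, position, miss)
```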

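re.search()
Scans through the string and matches at the first position where the pattern fits.
>>> re.search("a","bcdefag")
<_sre.SRE_Match object at 0x0000000002D74988> # a match object means the match succeeded
>>> print re.search("k","cabcdefg")
None # no match anywhere in the string, so None is returned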
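
To make the difference between the two functions concrete, the following sketch runs both against the same string. It assumes Python 3, and the variable names are only for illustration.

```python
import re

text = "cabacdefg"

# match() only looks at the start of the string, where there is a "c".
at_start = re.match("a", text)          # None

# search() keeps scanning and finds the first "a" at index 1.
anywhere = re.search("a", text)
first_index = anywhere.start()          # 1

# search() with \A behaves like match(): it is pinned to the start.
anchored = re.search(r"\Aa", text)      # None

# A character that never occurs gives None from search() as well.
absent = re.search("k", text)           # None

print(at_start, first_index, anchored, absent)
```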

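re.compile() # compiles a regular expression into a pattern object
>>> a1 = re.compile("a")
>>> print a1.search("bcdefag")
<_sre.SRE_Match object at 0x0000000002D749F0>
This is equivalent to re.search("a","bcdefag").
A compiled pattern object can be reused many times, which avoids repeating the compilation step when the same expression is applied again and again.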
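
A typical use of a compiled pattern is to build it once and apply it to many strings. The sketch below is a small example under that assumption; the pattern, the sample lines, and the variable names are invented for illustration, and the code targets Python 3.

```python
import re

# Compile once, outside the loop.
number = re.compile(r"\d+")

lines = ["12a32", "bc43", "jf"]

# Reuse the same pattern object for every line.
first_numbers = []
for line in lines:
    m = number.search(line)
    first_numbers.append(m.group() if m else None)

# Pattern objects offer the same operations as the module-level functions.
all_numbers = number.findall("12a32bc43jf3")

print(first_numbers)   # ['12', '43', None]
print(all_numbers)     # ['12', '32', '43', '3']
```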

re.split(pattern, string, maxsplit=0) # splits a string
>>> re.split('w', 'howareyou') # split 'howareyou' on lowercase w and return the pieces as a list
['ho', 'areyou']
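
Because the separator is itself a regular expression, re.split() can handle cases that str.split() cannot. The following sketch shows a character-set separator, the maxsplit argument, and a capturing group; the sample strings and names are made up for this example, and the code assumes Python 3.

```python
import re

# The example from the text: split on a literal "w".
simple = re.split("w", "howareyou")                 # ['ho', 'areyou']

# Split on a comma or semicolon followed by optional whitespace.
parts = re.split(r"[,;]\s*", "a, b;c,  d")          # ['a', 'b', 'c', 'd']

# maxsplit=1 stops after the first split; the rest is left untouched.
limited = re.split(r"[,;]\s*", "a, b;c,  d", maxsplit=1)   # ['a', 'b;c,  d']

# With a capturing group, the separators are kept in the result.
with_seps = re.split(r"(,)", "a,b")                 # ['a', ',', 'b']

print(simple, parts, limited, with_seps)
```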

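re.findall(pattern,string,flags=0)
Finds every non-overlapping substring that matches the pattern and returns them as a list, in left-to-right order. If nothing matches, an empty list is returned.
>>> re.findall(r"\d+","12a32bc43jf3") # \d matches a digit, + means one or more
['12', '32', '43', '3']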
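
One detail worth knowing is that the shape of the returned list depends on the capturing groups in the pattern. The sketch below reuses the string from the example above; the second pattern and the variable names are illustrative, and the code assumes Python 3.

```python
import re

text = "12a32bc43jf3"

# No groups: a list of the full matches.
numbers = re.findall(r"\d+", text)              # ['12', '32', '43', '3']

# Two groups: a list of tuples, one tuple per match.
# The trailing "3" has no letters after it, so it is not included.
pairs = re.findall(r"(\d+)([a-z]+)", text)      # [('12', 'a'), ('32', 'bc'), ('43', 'jf')]

# No match at all: an empty list, not None.
nothing = re.findall(r"x", text)                # []

print(numbers, pairs, nothing)
```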
