Chapter 8. Python: Regular Expressions

Date: 2022-06-01 20:31:32

    This chapter looks at how to use regular expressions in Python through the re module.

 

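1. re.match: the prototype is re.match(pattern, string, flags=0). It matches only at the beginning of the string and returns None if the pattern does not match there.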

# Example 1:
import re

text = "china,Hello"
m = re.match("china", text)  # succeeds because text starts with "china"

if m is not None:
    print(m.group())
else:
    print('not match')
# Output => china

    
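# Example 2:
pattn = ".end"  # "." matches any single character except a newline
m = re.match(pattn, " end")  # here "." matches the leading space
if m is not None:
    print('result:' + m.group())
else:
    print('not match')
# Output => result: end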
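
To make the "beginning of the string only" behaviour concrete, here is a minimal sketch. The sample string is made up for illustration. It shows re.match returning None when the pattern occurs later in the string, and the optional flags argument used with re.IGNORECASE.

```python
import re

text = "Hello china"

# re.match only tries position 0, so this fails even though
# "china" occurs later in the string.
at_start = re.match("china", text)
print(at_start)  # None

# The optional flags argument changes how matching is done;
# re.IGNORECASE makes the comparison case-insensitive.
ignore_case = re.match("hello", text, re.IGNORECASE)
print(ignore_case.group())  # Hello
```

When the pattern may appear anywhere in the string, re.search is usually the better fit.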


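2. re.search: re.search(pattern, string, flags=0) scans the string from left to right and returns the first match found anywhere in it, or None if there is no match.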

import re

text = "Hello china"
m = re.search("china", text)  # "china" is found even though it is not at the start
if m is not None:
    print('result:' + m.group())
else:
    print('not found')

# Output => result:china
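
The match object returned by re.search carries more than the matched text. The following sketch, using a made-up sample string and an illustrative pattern with two capturing groups, shows how to read the position of the match and the captured groups.

```python
import re

text = "Hello china, hello china"
m = re.search(r"(ch)(ina)", text)

if m is not None:
    print(m.group())   # the whole match: china
    print(m.span())    # (start, end) of the first match: (6, 11)
    print(m.groups())  # the captured groups: ('ch', 'ina')
```

Only the first occurrence is reported, even though "china" appears twice in the sample string.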

 

3. Search and replace: re.sub() and re.subn()

import re

print(re.sub("B", "I", "CHBNA"))   # Output => CHINA
print(re.subn("B", "I", "CHBNA"))  # Output => ('CHINA', 1); the 1 is the number of replacements made
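
re.sub accepts more than a fixed replacement string. As a rough sketch with made-up input values, the example below limits the number of replacements with count, reuses captured groups through backreferences, and computes the replacement with a function.

```python
import re

text = "2022-06-01"

# count limits how many replacements are made (0, the default, means all).
first_only = re.sub("-", "/", text, count=1)

# \1, \2 and \3 in the replacement refer to the captured groups.
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)

# A function can compute the replacement from each match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "a1 b20")

print(first_only)  # 2022/06-01
print(reordered)   # 01/06/2022
print(doubled)     # a2 b40
```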

 

4. Splitting: re.split()

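import re

print(re.split(",", "a,b,c"))             # Output => ['a', 'b', 'c']
print("a,b,c".split(","))                 # Output => ['a', 'b', 'c'] (str.split is enough for a fixed separator)
print(re.split(r"\s\w{2}", "1 aa2 bb3"))  # Output => ['1', '2', '3'] (splits on a whitespace character plus two word characters)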
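
re.split becomes useful when the separator varies. The sketch below, with made-up input strings, splits on either of two separators, limits the number of splits with maxsplit, and keeps the separators in the result by wrapping them in a capturing group.

```python
import re

# Split on a comma or semicolon followed by optional whitespace.
parts = re.split(r"[,;]\s*", "a, b;c,  d")

# maxsplit limits the number of splits; the remainder stays in the last item.
limited = re.split(r"[,;]\s*", "a, b;c,  d", maxsplit=1)

# A capturing group keeps the separators in the result list.
with_seps = re.split(r"([,;])", "a,b;c")

print(parts)      # ['a', 'b', 'c', 'd']
print(limited)    # ['a', 'b;c,  d']
print(with_seps)  # ['a', ',', 'b', ';', 'c']
```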


5. Putting it together

import re

# Matches a string that starts with "a" or "b", followed by 3 or 4 word
# characters (\w: letters, digits, or underscore), then one or more digits.
pattn = r"^[ab]\w{3,4}\d+"
li = ["aabb123", "bbaba456", "bcdefghi789", "abce"]

for s in li:
    m = re.search(pattn, s)

    if m is not None:
        print('result:' + m.group())
    else:
        print('not match')

'''
Output:
result:aabb123
result:bbaba456
not match
not match
'''
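
One detail of the pattern above is worth checking: \w matches digits and underscores as well as letters. The sketch below compares the original pattern with a hypothetical letters-only variant (the name strict and the sample "a1_2345" are illustrative and not part of the original example). It also uses re.compile, which is convenient when the same pattern is applied to many strings.

```python
import re

loose = re.compile(r"^[ab]\w{3,4}\d+")         # pattern from the example above
strict = re.compile(r"^[ab][a-zA-Z]{3,4}\d+")  # hypothetical letters-only variant

samples = ["aabb123", "a1_2345"]

# For each sample, record whether each pattern matches.
results = [
    (s, loose.search(s) is not None, strict.search(s) is not None)
    for s in samples
]

for row in results:
    print(row)
# ('aabb123', True, True)
# ('a1_2345', True, False)
```

If the intent really is "3 or 4 letters", an explicit character class such as [a-zA-Z] states it more precisely than \w.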