Lookahead, lookbehind, and negative lookahead in Python regular expressions, explained

Preface

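In many cases, the things to be matched either appear together or not at all. Brackets such as 《》 or < > are an example: half of a pair is never used on its own. A regular expression can enforce the same consistency: either both angle brackets are present, or neither is. This requires another piece of regular expression syntax, the lookahead assertion (?=pattern). It checks whether pattern matches the text that follows the current position, and if it does not, the overall match fails at that position. The assertion consumes no input characters; it only looks. (The corresponding lookbehind form, (?<=pattern), checks the text before the current position instead.)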
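The "looks but does not consume" behavior is easier to see in isolation. The following is a minimal sketch, not taken from the original article; the pattern and the sample strings foobar and foobaz are made up for illustration.

```python
import re

# Illustrative pattern: match "foo" only when "bar" follows immediately.
pattern = re.compile(r'foo(?=bar)')

m = pattern.search('foobar')
matched_text = m.group()   # the lookahead text is not part of the match
match_end = m.end()        # the match ends where "bar" begins

no_match = pattern.search('foobaz')  # "bar" does not follow, so no match

print(matched_text, match_end, no_match)
```

Because the assertion is zero-width, the text it inspected is still available to whatever comes next in the pattern.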

A complete example follows:

    # Python 3.6
    # 蔡军生
    # http://blog.csdn.net/caimouse/article/details/51749579
    import re

    # A raw string keeps backslashes such as \w and \s intact.
    address = re.compile(
        r'''
        # A name is made up of letters, and may include "."
        # for title abbreviations and middle initials.
        ((?P<name>
           ([\w.,]+\s+)*[\w.,]+
         )
         \s+
        ) # name is no longer optional

        # LOOKAHEAD
        # Email addresses are wrapped in angle brackets, but only
        # if both are present or neither is.
        (?= (<.*>$)         # remainder wrapped in angle brackets
            |
            ([^<].*[^>]$)   # remainder *not* wrapped in angle brackets
        )

        <? # optional opening angle bracket

        # The address itself: username@domain.tld
        (?P<email>
          [\w\d.+-]+       # username
          @
          ([\w\d.]+\.)+    # domain name prefix
          (com|org|edu)    # limit the allowed top-level domains
        )

        >? # optional closing angle bracket
        ''',
        re.VERBOSE)

    candidates = [
        'First Last <first.last@example.com>',
        'No Brackets first.last@example.com',
        'Open Bracket <first.last@example.com',
        'Close Bracket first.last@example.com>',
    ]

    for candidate in candidates:
        print('Candidate:', candidate)
        match = address.search(candidate)
        if match:
            print(' Name :', match.groupdict()['name'])
            print(' Email:', match.groupdict()['email'])
        else:
            print(' No match')

The output is:

    Candidate: First Last <first.last@example.com>
     Name : First Last
     Email: first.last@example.com
    Candidate: No Brackets first.last@example.com
     Name : No Brackets
     Email: first.last@example.com
    Candidate: Open Bracket <first.last@example.com
     No match
    Candidate: Close Bracket first.last@example.com>
     No match

Using the negative lookahead pattern in Python regular expressions

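The previous section covered the lookahead pattern (?=pattern). The equals sign there means the text ahead must match the pattern. Lookahead also has a negated form, which asserts that the text ahead does not match. For example, suppose you need to recognize the email address noreply@example.com. Such an address usually does not expect a reply, so we want to identify it and discard it. For this, use the negative lookahead pattern, whose syntax is (?!pattern); the exclamation mark means "not". Given the string noreply@example.com, the assertion checks whether the text at the current position starts with noreply@, and if it does, the match is rejected.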
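As a small illustration of the idea, here is a sketch that filters a list with a negative lookahead. The address list and the simplified \S+@\S+ pattern are assumptions for this sketch only; they are not the article's pattern and are far too loose for real address validation.

```python
import re

# Hypothetical sample data, for illustration only.
addresses = ['alice@example.com', 'noreply@example.com', 'noreply2@example.org']

# (?!noreply@) rejects any string that starts with "noreply@".
# It consumes nothing, so \S+ still starts matching at position 0.
keep = re.compile(r'(?!noreply@)\S+@\S+$')

kept = [a for a in addresses if keep.match(a)]
print(kept)
```

Note that noreply2@example.org is kept: the assertion only rejects text that begins with exactly noreply@.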

A complete example follows:

    # Python 3.6
    # 蔡军生
    # http://blog.csdn.net/caimouse/article/details/51749579
    import re

    # A raw string keeps backslashes such as \w and \d intact.
    address = re.compile(
        r'''
        ^

        # An address: username@domain.tld

        # Ignore noreply addresses
        (?!noreply@.*$)

        [\w\d.+-]+       # username
        @
        ([\w\d.]+\.)+    # domain name prefix
        (com|org|edu)    # limit the allowed top-level domains

        $
        ''',
        re.VERBOSE)

    candidates = [
        'first.last@example.com',
        'noreply@example.com',
    ]

    for candidate in candidates:
        print('Candidate:', candidate)
        match = address.search(candidate)
        if match:
            print(' Match:', candidate[match.start():match.end()])
        else:
            print(' No match')

The output is:

    Candidate: first.last@example.com
     Match: first.last@example.com
    Candidate: noreply@example.com
     No match
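Lookbehind works the same way in the other direction: (?<=pattern) requires that pattern matches the text immediately before the current position, and (?<!pattern) requires that it does not. Python's re module only accepts lookbehind patterns that match strings of a fixed length. The sketch below is not from the original article; the sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample text, for illustration only.
text = 'price: $42, count: 17, refund: $5'

# (?<=\$) requires a "$" immediately before the digits, without including it.
after_dollar = re.findall(r'(?<=\$)\d+', text)

# (?<!\$) requires that the preceding character is not "$";
# \b keeps the match from starting in the middle of a number.
not_after_dollar = re.findall(r'\b(?<!\$)\d+', text)

print(after_dollar, not_after_dollar)
```

As with lookahead, the "$" is checked but never becomes part of the matched text.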


Original article: http://blog.csdn.net/caimouse/article/details/78472377
