1.6 Exercises
Regular Expressions
1-1 [bh][aiu]t
1-2 \w+ \w+
1-3 \w+,\s\w+
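1-4 [A-Za-z_]\w*
Definition of a valid Python identifier:
1. Python identifiers are case-sensitive.
2. An identifier starts with a letter or an underscore and may contain letters, underscores, and digits.
3. Identifiers that start with an underscore have special meaning.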
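Rule 2 above can be checked with a short sketch. The pattern `[A-Za-z_]\w*`, the helper name `is_identifier`, and the sample names are my own illustration (assuming ASCII-only names), not part of the exercise. The built-in `str.isidentifier()` is printed alongside for comparison.

```python
import re

# Hypothetical helper: a letter or underscore first,
# then any mix of letters, digits, and underscores.
IDENT = re.compile(r'[A-Za-z_]\w*', re.ASCII)

def is_identifier(name):
    """Return True if the whole string looks like an ASCII Python identifier."""
    return IDENT.fullmatch(name) is not None

for name in ('x', '_private', 'total2', '2fast', 'a-b'):
    # Compare the regex result with Python's own check.
    print(name, is_identifier(name), name.isidentifier())
```

Note that neither check rejects reserved words such as `for`; `keyword.iskeyword()` covers that case.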
1-5 \d+(\s\w+)+
1-6 (1) ^www\..+\.com/?$ (2) ^www\..+\.\w{3}/?$
1-7 [\+-]?\d+
1-8 [\+-]?\d+[lL] (Note: Python 3 merged int and long, so the 123L notation no longer exists.)
1-9 [\+-]?\d*\.\d* (Note: * is used here because .1 means 0.1 and 1. means 1.0.)
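Using `*` on both sides of the dot has a side effect that is easy to check: because both digit runs are optional, a lone `.` also matches. The sketch below compares the 1-9 pattern with a stricter variant that requires at least one digit. The stricter pattern and the sample strings are my own assumptions, not part of the exercise.

```python
import re

# Pattern from answer 1-9: both digit runs are optional.
LOOSE = re.compile(r'[\+-]?\d*\.\d*')
# Assumed stricter variant: digits before the dot, or digits after it.
STRICT = re.compile(r'[+-]?(?:\d+\.\d*|\.\d+)')

samples = ['.1', '1.', '-3.14', '.', '42']
loose = {s: bool(LOOSE.fullmatch(s)) for s in samples}
strict = {s: bool(STRICT.fullmatch(s)) for s in samples}

for s in samples:
    print(repr(s), loose[s], strict[s])
```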
1-10 ([\+-]?\d*(\.\d*)?){2}i?
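1-11 \w+[\w.-]*\w+@\w+\.\w{2,3}
E-mail format: username + @ + mail domain (.com/.org/.cn ...)
A valid e-mail address: the username can be chosen freely. It consists of letters, digits, dots, hyphens, or underscores, and may only start and end with a letter or a digit.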
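The username rules above can be tried out with a small sketch. The pattern here is my own reading of those rules, not a general e-mail validator: the first and last characters of the username are restricted to letters and digits, and the hyphen is placed last inside the character class so that it is taken literally rather than as a range. The helper name `looks_valid` and the sample addresses are hypothetical.

```python
import re

# Assumed pattern: username starts and ends with a letter or digit;
# dots, hyphens, and underscores are allowed in between.
EMAIL = re.compile(r'[A-Za-z0-9](?:[\w.-]*[A-Za-z0-9])?@\w+\.\w{2,3}')

def looks_valid(address):
    """Return True if the whole string fits the simplified rules."""
    return EMAIL.fullmatch(address) is not None

for address in ('john.doe@example.com', 'a@b.cn',
                '_john@example.com', 'john-@example.com'):
    print(address, looks_valid(address))
```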
1-12 https?://[\w_\.-]*\w+/? (I don't know which rules define a "valid URL", so I can't say whether this answer matches every case.)
1-13 \s'(.+)'
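A quick sketch of how the 1-13 pattern can be applied to the string form of a type object. In Python 3 that string looks like `<class 'int'>`, and the pattern only relies on the space and the quotes. The helper name `type_name` is my own.

```python
import re

def type_name(obj):
    """Extract the quoted name from str(type(obj)), e.g. "<class 'int'>"."""
    match = re.search(r"\s'(.+)'", str(type(obj)))
    return match.group(1) if match else None

print(type_name(0))      # name extracted for an int
print(type_name(1.5))    # name extracted for a float
print(type_name('abc'))  # name extracted for a str
```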
1-14 1[0-2]
1-15
(1)\d{4}-(\d+)-(\d+)(?:-(\d{4}))?
(2)
import re

def check(card):
    # Valid layouts: 4-6-5 (15 digits) or 4-4-4-4 (16 digits).
    match = re.fullmatch(r'\d{4}-(\d+)-(\d+)(?:-(\d{4}))?', card)
    if match is None:
        return False
    second, third, fourth = match.groups()
    if fourth is None:
        return len(second) == 6 and len(third) == 5
    return len(second) == 4 and len(third) == 4
Using gendata.py
1-16
from random import choice, randrange
from string import ascii_lowercase as lc
from time import ctime

tlds = ('com', 'edu', 'net', 'org', 'gov')

with open('reddata.txt', 'w') as f:
    for i in range(randrange(5, 11)):
        # 2**32 keeps the timestamp in a range ctime() accepts; sys.maxsize is too large on 64-bit builds.
        dtint = randrange(2**32)
        dtstr = ctime(dtint)
        llen = randrange(4, 8)
        login = ''.join(choice(lc) for j in range(llen))
        dlen = randrange(llen, 13)
        dom = ''.join(choice(lc) for j in range(dlen))
        f.write('%s::%s@%s.%s::%d-%d-%d\n' % (dtstr, login, dom, choice(tlds), dtint, llen, dlen))
1-17
import re

def Count(filename):
    # Count how often each weekday abbreviation appears in the file.
    weeks = {}
    with open(filename) as f:
        contexts = f.read()
    result = re.findall(r'(\w{3})\s\w{3}', contexts)
    for i in result:
        weeks[i] = weeks.get(i, 0) + 1
    return weeks

weeks = Count('reddata.txt')
print(weeks)
1-18 I have read this exercise many times and still cannot work out what it is asking.
1-19 \w{3}\s.*?\s\d{4}
1-20 \w+@\w+\.\w{3}
1-21 \s(\w{3})\s
1-22 \s(\d{4})
1-23 \d{2}:\d{2}:\d{2}
1-24 (\w+)@(\w+\.\w{3})
1-25 (\w+)@(\w+)\.(\w{3})
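To see what the patterns from 1-19 through 1-25 pull out, they can be run against one sample line in the gendata.py output format. The line below is made up for illustration; real output is random.

```python
import re

# Made-up sample line in the timestamp::email::int-len-len format.
line = 'Thu Jul 22 19:21:19 2004::izsp@dicqdhytvhv.edu::1090549279-4-11'

timestamp = re.search(r'\w{3}\s.*?\s\d{4}', line).group()       # 1-19
email = re.search(r'\w+@\w+\.\w{3}', line).group()              # 1-20
month = re.search(r'\s(\w{3})\s', line).group(1)                # 1-21
year = re.search(r'\s(\d{4})', line).group(1)                   # 1-22
clock = re.search(r'\d{2}:\d{2}:\d{2}', line).group()           # 1-23
login_domain = re.search(r'(\w+)@(\w+\.\w{3})', line).groups()  # 1-24
parts = re.search(r'(\w+)@(\w+)\.(\w{3})', line).groups()       # 1-25

for value in (timestamp, email, month, year, clock, login_domain, parts):
    print(value)
```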
1-26
import re

def Replace(filename, mail):
    # Print the file contents with every e-mail address replaced by `mail`.
    with open(filename) as f:
        contexts = f.read()
    replace = re.sub(r'\w*@\w*\.\w{3}', mail, contexts)
    print(replace)

Replace('reddata.txt', 'mymail@qq.com')
1-27
import re

def print_time(filename):
    with open(filename) as f:
        contexts = f.read()
    # \s+ and \d{1,2} also cover single-digit days, which ctime() pads with an extra space.
    dates = re.findall(r'\s(\w{3})\s+(\d{1,2}).*?(\d{4})', contexts)
    for d in dates:
        print(', '.join(d))

print_time('reddata.txt')
Processing Telephone Numbers
1-28 (\d{3}-)?\d{3}-\d{4}
1-29 (\()?(\d{3})?(?(1)\) |-?)\d{3}-\d{4} Simplified: ((\d{3}-)|(\(\d{3}\) ))?\d{3}-\d{4}
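The conditional and the simplified forms for 1-29 are not quite equivalent, which a short comparison makes visible. This sketch assumes the 3-3-4 digit layout of numbers such as 800-555-1212; the sample strings are my own.

```python
import re

# Conditional form: if "(" matched, require ") ", otherwise allow an optional hyphen.
CONDITIONAL = re.compile(r'(\()?(\d{3})?(?(1)\) |-?)\d{3}-\d{4}')
# Simplified form: the area code is either "ddd-" or "(ddd) ".
SIMPLE = re.compile(r'((\d{3}-)|(\(\d{3}\) ))?\d{3}-\d{4}')

samples = ['800-555-1212', '555-1212', '(800) 555-1212',
           '(800)555-1212', '800 555-1212', '800555-1212']
conditional = {s: bool(CONDITIONAL.fullmatch(s)) for s in samples}
simple = {s: bool(SIMPLE.fullmatch(s)) for s in samples}

for s in samples:
    print(s, conditional[s], simple[s])
```

The two agree on every sample except `800555-1212`: the conditional form accepts it because its hyphen after the area code is optional, while the simplified form requires that hyphen.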
Regular Expression Applications
1-30, 1-31, 1-32 These are advanced problems. With my limited knowledge, I can't solve them yet.