I have three strings:
s1 = "A blah blah blah." # match A
s2 = "Blah blah blah. A blah blah." # match A
s3 = "Blah blah blah A." # don't match 'A'
I'm trying to write a regular expression that will match the occurrences of A
in the first two strings, but not the third: i.e., I want to match an occurrence of A
at the beginning of a line or sentence but not elsewhere.
I've tried the following regexes:
regex = "(^|(. ))A"
regex = "[^(. )]A"
Using re.search(), the first of these matches all three A's; the second matches none of them.
I'm using Python 3.5.
2 Answers
#1
2
You almost had it. "(^|\. )A" works. You have to escape the dot, because an unescaped dot means "any character" in a regex.
>>> s1 = "A blah blah blah." # match A
>>> s2 = "Blah blah blah. A blah blah." # match A
>>> s3 = "Blah blah blah A." # don't match 'A'
>>> import re
>>> re.search(r"(^|\. )A", s1)
<_sre.SRE_Match object; span=(0, 1), match='A'>
>>> re.search(r"(^|\. )A", s2)
<_sre.SRE_Match object; span=(14, 17), match='. A'>
>>> re.search(r"(^|\. )A", s3)
If you want it to work with more punctuation, you can use a character class. Inside a character class the dot is literal, so you don't have to escape it.
>>> re.search("(^|[.!?] )A", 'Good? Ay.')
<_sre.SRE_Match object; span=(4, 7), match='? A'>
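
To see why the escape matters, here is a small sketch that runs the two attempts from the question and the escaped pattern over the same three strings. The variable names are only for illustration; the patterns are the ones quoted above.

```python
import re

strings = [
    "A blah blah blah.",             # A starts the line
    "Blah blah blah. A blah blah.",  # A starts a sentence
    "Blah blah blah A.",             # A is mid-sentence
]

# Unescaped dot: "(. )" is any character followed by a space,
# so "h A" in the third string matches as well.
unescaped = re.compile(r"(^|(. ))A")

# "[^(. )]" is a negated character class: one character that is not
# "(", ".", " " or ")". Every A here is preceded by a space or nothing.
negated_class = re.compile(r"[^(. )]A")

# Escaped dot: only a literal period plus space, or the string start.
escaped = re.compile(r"(^|\. )A")

unescaped_hits = [bool(unescaped.search(s)) for s in strings]
negated_hits = [bool(negated_class.search(s)) for s in strings]
escaped_hits = [bool(escaped.search(s)) for s in strings]

print(unescaped_hits)  # [True, True, True]
print(negated_hits)    # [False, False, False]
print(escaped_hits)    # [True, True, False]
```

This reproduces what the question reported: the first attempt matches all three strings, the second matches none, and the escaped version matches only the first two.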
#2
0
EDIT: You could do the following:
>>> import re
>>> s1 = "A blah blah blah."
>>> s2 = "Blah blah blah. A blah blah."
>>> s3 = "Blah blah blah A."
>>> re.findall(r'(?:^\s*|[?!.]\s+)(A)',s1)
['A']
>>> re.findall(r'(?:^\s*|[?!.]\s+)(A)',s2)
['A']
>>> re.findall(r'(?:^\s*|[?!.]\s+)(A)',s3)
[]
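
If you need the position of the A itself rather than the text, a minimal sketch along the same lines is to read the span of the capture group. The helper name sentence_start_spans is made up for this example; the pattern is the one used with findall above.

```python
import re

# Same pattern as above; group 1 is the A, the prefix is non-capturing.
SENTENCE_START_A = re.compile(r"(?:^\s*|[?!.]\s+)(A)")

def sentence_start_spans(text):
    """Return (start, end) offsets of each A that opens the string or a sentence."""
    return [m.span(1) for m in SENTENCE_START_A.finditer(text)]

spans = [
    sentence_start_spans("A blah blah blah."),
    sentence_start_spans("Blah blah blah. A blah blah."),
    sentence_start_spans("Blah blah blah A."),
]
print(spans)  # [[(0, 1)], [(16, 17)], []]
```

Note that without flags, ^ only matches at the start of the whole string. For multi-line input, compiling with re.MULTILINE makes ^ match at the start of every line too.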