I am using Python to match a few words within a sentence and checking the results with unit tests. I want a single regular expression that matches all of these cases and produces the outputs listed below:
firstword = "<p>This is @Timberlake</p>"
outputfirstword = "@Timberlake"
Find the word that starts with the @ symbol.
secondword = "<p>This is @timber.lake</p>"
outputsecondword = "@timber.lake"
A period between words is okay.
thirdword = "This is @Timberlake. Yo!"
outputthirdword = "@Timberlake"
If the period is followed by a space, neither the period nor the space is part of outputthirdword.
fourthword = "This is @Timberlake."
outputfourthword = "@Timberlake"
The final period (.) is not included.
4 Solutions
#1
2
Use this regex:
(?i)@[a-z.]+\b
The overall match is the part you need, so no capturing group is required.
Explanation:
(?i)     # Enable the case-insensitive modifier
@        # Literal @
[a-z.]+  # One or more letters a to z or periods
\b       # End at a word boundary, which drops a trailing period
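The pattern can be checked against the four inputs from the question. This is a minimal sketch: the names MENTION, CASES and find_mention are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from this answer: "@", then letters or periods, ending at a word boundary.
MENTION = re.compile(r"(?i)@[a-z.]+\b")

# The four inputs from the question, paired with the expected outputs.
CASES = {
    "<p>This is @Timberlake</p>": "@Timberlake",
    "<p>This is @timber.lake</p>": "@timber.lake",
    "This is @Timberlake. Yo!": "@Timberlake",
    "This is @Timberlake.": "@Timberlake",
}


def find_mention(text):
    """Return the first @mention in text, or None if there is none."""
    m = MENTION.search(text)
    return m.group(0) if m else None


for text, expected in CASES.items():
    # Prints True for each case when the match equals the expected output.
    print(find_mention(text) == expected, find_mention(text))
```

The trailing period is dropped because the greedy character class backtracks to the last position where \b holds, which is right after the final letter.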
#2
1
@[a-zA-Z]+\b(?:\.[a-zA-Z]+\b)?
You can use this pattern. It allows at most one period-separated part after the first word.
import re

# "@", a run of letters, then optionally one ".letters" part
p = re.compile(r'@[a-zA-Z]+\b(?:\.[a-zA-Z]+\b)?')
test_str = "This is @Timberlake. Yo!\n<p>This is @timber.lake</p>"
print(p.findall(test_str))  # ['@Timberlake', '@timber.lake']
#3
0
One way is to use the following regex and strip the dots from the result:
@[a-zA-Z.]+
For example, with re.search you can do (my_string is the text to search):
re.search(r'@[a-zA-Z.]+', my_string).group(0).strip('.')
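Note that re.search returns None when nothing matches, so calling .group(0) directly on the result raises AttributeError for input without an @ word. A minimal sketch with a guard follows; the function name first_mention is hypothetical, and rstrip is used because only trailing periods can occur after the leading @.

```python
import re


def first_mention(my_string):
    """Return the first @mention without trailing periods, or None."""
    m = re.search(r"@[a-zA-Z.]+", my_string)
    if m is None:
        # No "@word" in the input at all.
        return None
    return m.group(0).rstrip(".")


print(first_mention("This is @Timberlake. Yo!"))
print(first_mention("no mention here"))
```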
Alternatively, you can use the following regex, which doesn't need strip. The period is escaped so that it matches only a literal dot:
@[a-zA-Z]+\.?[a-zA-Z]+
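In a regex, an unescaped . matches any character except a newline, so .? can also consume a space, while \.? accepts only a literal period. A small sketch comparing the two forms; the sample text and the names loose and strict are made up for illustration.

```python
import re

loose = re.compile(r"@[a-zA-Z]+.?[a-zA-Z]+")    # "." matches any character
strict = re.compile(r"@[a-zA-Z]+\.?[a-zA-Z]+")  # "\." matches only a period

text = "Ask @Tim Berlake about it."  # hypothetical input

loose_match = loose.search(text).group(0)
strict_match = strict.search(text).group(0)

print(loose_match)   # the unescaped dot swallows the space
print(strict_match)  # the escaped dot stops at the end of the word
```

Both forms require at least two letters after the @, because each of the two letter runs needs at least one character.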
#4
0
As @Kasra mentions, the regex works nicely, but it does not remove the dot at the end.
Use the regex below; I believe it does what you expect.
@[a-zA-Z.]+[a-zA-Z]+
See the example below. It is not in Python, but the regex should be the same.
$ (echo "<p>This is @Timberlake</p>"; echo "<p>This is @timber.lake</p>"; echo "This is @Timberlake."; echo "<p>This is @tim.ber.lake</p>") | grep -Eo '@[a-zA-Z.]+[a-zA-Z]+'
@Timberlake
@timber.lake
@Timberlake
@tim.ber.lake
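For comparison, a rough Python equivalent of the grep command above, using the same regex and the same four input lines. The variable names are illustrative only.

```python
import re

# Same regex as in the grep example: the match must end with a letter.
pattern = re.compile(r"@[a-zA-Z.]+[a-zA-Z]+")

lines = [
    "<p>This is @Timberlake</p>",
    "<p>This is @timber.lake</p>",
    "This is @Timberlake.",
    "<p>This is @tim.ber.lake</p>",
]

# findall returns whole matches here because the pattern has no groups.
matches = [m for line in lines for m in pattern.findall(line)]
print(matches)
```

Because the pattern needs one character for each of its two parts, a single-letter name such as @a would not match.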