Find all words in a string that start with the $ sign in Python

Time: 2021-11-11 19:24:56

How can I extract all words in a string that start with the $ sign? For example, in the string

This $string is an $example

I want to extract the words $string and $example.

I tried the regex \b[$]\S*, but it only works if I use an ordinary character instead of the dollar sign.

4 solutions

#1


21  

>>> mystring = 'This $string is an $example'
>>> [word for word in mystring.split() if word.startswith('$')]
['$string', '$example']

#2


6  

The problem with your expression is that \b doesn't match between a space and a $, because both are non-word characters. If you remove it, everything works:

import re
z = 'This $string is an $example'
print(re.findall(r'[$]\S*', z))  # ['$string', '$example']

To avoid matching words$like$this, add a lookbehind assertion:

import re
z = 'This $string is an $example and this$not'
print(re.findall(r'(?<=\W)[$]\S*', z))  # ['$string', '$example']
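
A quick check of the word-boundary behaviour described in this answer may help. This is a minimal sketch assuming Python 3; the sample string 'word$tail' is made up for illustration. \b only matches where a word character (\w) sits next to a non-word character, so it never matches between a space and a $.

```python
import re

text = 'This $string is an $example'

# Space and '$' are both non-word characters, so there is no \b between them.
with_boundary = re.findall(r'\b[$]\S*', text)

# Between 'd' and '$' there IS a boundary, so the original pattern matches here.
glued = re.findall(r'\b[$]\S*', 'word$tail')

print(with_boundary)  # []
print(glued)          # ['$tail']
```

This is why the original pattern appeared to work with an ordinary letter in place of the dollar sign: a letter after a space does form a word boundary.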

#3


5  

The \b escape matches at word boundaries, but the $ sign is not a word character, so there is no boundary between whitespace and $. Match on the start of the string or on whitespace instead:

re.compile(r'(?:^|\s)(\$\w+)')

I've used a backslash escape for the dollar here instead of a character class, and the \w+ word character class with a minimum of 1 character to better reflect your intent.

Demo:

>>> import re
>>> dollaredwords = re.compile(r'(?:^|\s)(\$\w+)')
>>> dollaredwords.search('Here is an $example for you!')
<_sre.SRE_Match object at 0x100882a80>
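
The demo returns a match object rather than the word itself. Here is a minimal sketch of pulling the text out, assuming Python 3. Group 1 holds the $word without the leading whitespace, and findall returns group 1 for every match. The second sample sentence is an illustration, not part of the original answer.

```python
import re

dollaredwords = re.compile(r'(?:^|\s)(\$\w+)')

match = dollaredwords.search('Here is an $example for you!')
first = match.group(1) if match else None

# With a single capturing group, findall returns only the captured text.
all_words = dollaredwords.findall('$first and $second, but not this$one')

print(first)      # $example
print(all_words)  # ['$first', '$second']
```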

#4


2  

Several approaches, depending on what you want to define as a 'word' and whether all words are delimited by spaces:

>>> import re
>>> s = 'This $string is an $example $second$example'

>>> re.findall(r'(?<=\s)\$\w+',s)
['$string', '$example', '$second']

>>> re.findall(r'(?<=\s)\$\S+',s)
['$string', '$example', '$second$example']

>>> re.findall(r'\$\w+',s)
['$string', '$example', '$second', '$example']

If you might have a 'word' at the beginning of a line:

>>> re.findall(r'(?:^|\s)(\$\w+)','$string is an $example $second$example')
['$string', '$example', '$second']
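
The whitespace and start-of-line cases can also be folded into a single pattern without a capturing group. The sketch below is an alternative, not one of the approaches listed in this answer: the negative lookbehind (?<!\S) succeeds when the $ is preceded by whitespace or by nothing at all. The helper name find_dollar_words is hypothetical.

```python
import re

# (?<!\S) means "not preceded by a non-whitespace character",
# which holds after whitespace and at the very start of the string.
DOLLAR_WORD = re.compile(r'(?<!\S)\$\w+')

def find_dollar_words(text):
    """Return every $word that starts the string or follows whitespace."""
    return DOLLAR_WORD.findall(text)

print(find_dollar_words('$string is an $example $second$example'))
# ['$string', '$example', '$second']
```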
