I was wondering how you can find words with special characters in a sentence, in all generality.
For example, if we have the following sentence:
I love (#coo$kies) the following cookies: $cookie[1], $cookie[2, @cookie, @cookie%, hot@dog
Put aside the fact that this is not how variables should be used in a string. What is the regex to retrieve #coo$kies, $cookie[1], $cookie[2, @cookie, @cookie%, and hot@dog, but not I, love, ... cookies (or cookies:)?
I used the following regex:
'#(\S+(?!\w+))#'
but the negation doesn't work, and I get every word back ("I", "love"..."cookies:").
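To see why every word comes back, here is a minimal sketch in Python. It assumes the outer # characters in the pattern above are only regex delimiters (PHP style), so they are dropped; the variable names are just for illustration.

```python
import re

sentence = ("I love (#coo$kies) the following cookies: "
            "$cookie[1], $cookie[2, @cookie, @cookie%, hot@dog")

# The pattern from the question, without the outer '#' delimiters (assumption).
attempt = re.compile(r"(\S+(?!\w+))")

# \S+ is greedy, so it always runs up to the next whitespace or the end of
# the string. At that point no word character follows, so the negative
# lookahead (?!\w+) succeeds trivially and never rejects anything.
matches = attempt.findall(sentence)

# Every whitespace-separated token is returned, special characters or not.
print(matches == sentence.split())  # True
```

In other words, the lookahead only checks what comes after the token, not what is inside it, so it cannot filter out plain words.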
2 solutions
#1
1
Try this one:
(?:^| )+((\w*?[^ :,\w]+?\w*?)*)(?: |,|: |$)
You may try it here
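The underlying idea (a word counts as special if it contains at least one character that is neither a delimiter nor a word character) can also be written as two separate steps, which may be easier to check than a single pattern. This is a rough sketch in Python, not the regex above: the names TOKEN, SPECIAL and special_words are hypothetical, and the delimiter set (whitespace, comma, colon, parentheses) is an assumption based on the example sentence.

```python
import re

sentence = ("I love (#coo$kies) the following cookies: "
            "$cookie[1], $cookie[2, @cookie, @cookie%, hot@dog")

# Assumed delimiters: whitespace, comma, colon and parentheses.
TOKEN = re.compile(r"[^\s,:()]+")
# A token is "special" if it has at least one character outside \w.
SPECIAL = re.compile(r"\W")

def special_words(text):
    """Split on the assumed delimiters, keep tokens with a non-word char."""
    return [tok for tok in TOKEN.findall(text) if SPECIAL.search(tok)]

print(special_words(sentence))
# ['#coo$kies', '$cookie[1]', '$cookie[2', '@cookie', '@cookie%', 'hot@dog']
```

If other punctuation should also end a word, it would have to be added to the character class in TOKEN.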
#2
1
There is no way to fetch $cookie[2], as it is not present in the source string.
For the rest, you need to separate the word delimiters [ ,:] from the special characters that are part of a word: [\$\[\]\@\%]. It should be something like this:
((\w*[\$\[\]\@\%]+\w*)+?)[ ,:]*
If you can, add a space to the end of the source string, so that you can use mandatory delimiters without losing the last word:
((\w*[\$\[\]\@\%]+\w*)+?)[ ,:]+
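As a quick check, the two patterns above can be run against the sentence from the question. This is only a sketch in Python (the question looks like PHP, so that is an assumption), and the variable names are made up for the example.

```python
import re

sentence = ("I love (#coo$kies) the following cookies: "
            "$cookie[1], $cookie[2, @cookie, @cookie%, hot@dog")

# First pattern: the trailing delimiter is optional.
optional = re.compile(r"((\w*[\$\[\]\@\%]+\w*)+?)[ ,:]*")
# Second pattern: the trailing delimiter is mandatory.
mandatory = re.compile(r"((\w*[\$\[\]\@\%]+\w*)+?)[ ,:]+")

# With an optional delimiter, the lazy +? stops after one repetition,
# so a word such as $cookie[1] comes back in pieces.
pieces = [m.group(1) for m in optional.finditer(sentence)]

# With a mandatory delimiter, append a space so the last word still matches.
found = [m.group(1) for m in mandatory.finditer(sentence + " ")]
print(found)
# ['$cookie[1]', '$cookie[2', '@cookie', '@cookie%', 'hot@dog']
```

Note that #coo$kies is not returned by the second pattern: # is not in the special-character class and ) is not in the delimiter class, so both would need to be added if that word should be matched as well.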