How to find words with special characters in a sentence string

Time: 2021-11-11 19:25:02

I was wondering how you can find words with special characters in a sentence, in all generality.

For example, suppose we have the following sentence:

I love (#coo$kies) the following cookies: $cookie[1], $cookie[2, @cookie, @cookie%, hot@dog

Set aside the fact that this is not how variables should be used in a string. What is the regex to retrieve #coo$kies, $cookie[1], $cookie[2, @cookie, @cookie%, and hot@dog, but not I, love ... cookies (or cookies:)?

I used the following regex:

'#(\S+(?!\w+))#'

but the negative lookahead doesn't work, and I get every word back ("I", "love", ..., "cookies:").
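
The behaviour described above can be reproduced with a small sketch in Python's re module. Python is an assumption here: the surrounding # delimiters in the pattern suggest PHP, and they are dropped below. The greedy \S+ consumes each whitespace-separated token in full, so the lookahead (?!\w+) is only tested at the end of a token, where the next character is a space or the end of the string, and there it always succeeds.

```python
import re

sentence = ("I love (#coo$kies) the following cookies: "
            "$cookie[1], $cookie[2, @cookie, @cookie%, hot@dog")

# The pattern from the question, without the surrounding '#' delimiters.
pattern = re.compile(r"(\S+(?!\w+))")

# With a single capture group, findall returns the captured strings.
tokens = pattern.findall(sentence)

# The result is the same as a plain whitespace split: every token is returned.
print(tokens == sentence.split())
```

In other words, the pattern never inspects whether a token contains a special character at all.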

2 solutions

#1

Try this one:

(?:^| )+((\w*?[^ :,\w]+?\w*?)*)(?: |,|: |$)

You may try it here
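
As a rough check, the pattern above can be run against the sentence from the question. This is a sketch in Python's re module (an assumption, since the question does not name a language); the words of interest are read from capture group 1, and the names sentence, pattern and words are illustrative only.

```python
import re

sentence = ("I love (#coo$kies) the following cookies: "
            "$cookie[1], $cookie[2, @cookie, @cookie%, hot@dog")

# Pattern from answer #1; capture group 1 holds the matched word.
pattern = re.compile(r"(?:^| )+((\w*?[^ :,\w]+?\w*?)*)(?: |,|: |$)")

# Group 1 may be empty (for example between two consecutive spaces),
# so empty captures are filtered out.
words = [m.group(1) for m in pattern.finditer(sentence) if m.group(1)]

print(words)
```

For this sentence the parentheses stay attached to #coo$kies, because ( and ) are not among the characters excluded by [^ :,\w].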

#2

There is no way to fetch $cookie[2], as it is not present in the source string.

For the rest, you need to separate the word delimiters [ ,:] from the special characters that are part of a word: [\$\[\]\@\%]. It should be something like this:

((\w*[\$\[\]\@\%]+\w*)+?)[ ,:]*
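
A quick way to see what this pattern does with optional delimiters is the sketch below, again in Python's re module (an assumed language; variable names are illustrative). Because [ ,:]* may match nothing, the lazy +? can stop after a single repetition, so a word such as $cookie[1] comes back in pieces.

```python
import re

sentence = ("I love (#coo$kies) the following cookies: "
            "$cookie[1], $cookie[2, @cookie, @cookie%, hot@dog")

# Pattern from answer #2 with optional trailing delimiters.
pattern = re.compile(r"((\w*[\$\[\]\@\%]+\w*)+?)[ ,:]*")

# Capture group 1 holds each matched piece.
pieces = [m.group(1) for m in pattern.finditer(sentence)]

print(pieces)
```

The fragments for this sentence include "$cookie", "[1" and "]" as separate matches, and #coo$kies loses its leading # because # is not in the special-character class.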

If you can, add a space to the end of the source string, so you can use mandatory delimiters without losing the last word:

((\w*[\$\[\]\@\%]+\w*)+?)[ ,:]+
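
The mandatory-delimiter variant can be sketched as follows in Python's re module (an assumed language). The second pattern, extended, is not part of the answer: it is a hypothetical extension that adds # to the special characters and ( and ) to the delimiters, so that #coo$kies is picked up too.

```python
import re

sentence = ("I love (#coo$kies) the following cookies: "
            "$cookie[1], $cookie[2, @cookie, @cookie%, hot@dog")

# Pattern from answer #2 with mandatory trailing delimiters.
pattern = re.compile(r"((\w*[\$\[\]\@\%]+\w*)+?)[ ,:]+")

# Append a space so the last word is also followed by a delimiter.
words = [m.group(1) for m in pattern.finditer(sentence + " ")]

# Without the appended space the last word has no delimiter after it.
words_no_space = [m.group(1) for m in pattern.finditer(sentence)]

# Assumed extension (not in the answer): '#' counts as a special character
# and '(' / ')' count as delimiters.
extended = re.compile(r"((\w*[$\[\]@%#]+\w*)+?)[ ,:()]+")
extended_words = [m.group(1) for m in extended.finditer(sentence + " ")]

print(words)
print(extended_words)
```

With the answer's character classes, (#coo$kies) is skipped entirely, since ) is not a delimiter; the extended classes are one possible way to include it.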
