How do I replace special characters in a string?

Time: 2021-07-25 16:52:08

I have a string with lots of special characters. I want to remove all of them but keep the alphabetical characters.

How can I do this?

7 answers

#1


150  

That depends on what you mean. If you just want to get rid of them, do this (update: apparently you want to keep digits as well, so use the second line in that case):

String alphaOnly = input.replaceAll("[^a-zA-Z]+","");
String alphaAndDigits = input.replaceAll("[^a-zA-Z0-9]+","");

Or the equivalent, using POSIX character classes (which match ASCII only by default in Java):

String alphaOnly = input.replaceAll("[^\\p{Alpha}]+","");
String alphaAndDigits = input.replaceAll("[^\\p{Alpha}\\p{Digit}]+","");

(All of these run faster when called repeatedly if you precompile the regex pattern and store it in a constant, because replaceAll otherwise recompiles the pattern on every call.)
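
As a minimal sketch of that precompilation advice: the class name SpecialCharFilter and the method names below are made up for illustration; only Pattern.compile and Matcher.replaceAll come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the answer.
final class SpecialCharFilter {
    // Compiled once and reused. Pattern instances are immutable and thread-safe.
    private static final Pattern NON_ALPHA = Pattern.compile("[^a-zA-Z]+");
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-zA-Z0-9]+");

    // Keeps only ASCII letters.
    static String alphaOnly(String input) {
        return NON_ALPHA.matcher(input).replaceAll("");
    }

    // Keeps only ASCII letters and digits.
    static String alphaAndDigits(String input) {
        return NON_ALNUM.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "Hello, World! 123";
        System.out.println(alphaOnly(input));      // prints "HelloWorld"
        System.out.println(alphaAndDigits(input)); // prints "HelloWorld123"
    }
}
```

The behavior is the same as the one-line replaceAll calls; the only difference is that the pattern is compiled once instead of on every call.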

Or, with Guava:

private static final CharMatcher ALNUM =
  CharMatcher.inRange('a', 'z').or(CharMatcher.inRange('A', 'Z'))
  .or(CharMatcher.inRange('0', '9')).precomputed();
// ...
String alphaAndDigits = ALNUM.retainFrom(input);

But if you want to turn accented characters into something sensible that's still ASCII, look at these questions:
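
One common way to do this with the JDK alone, offered here as a sketch rather than as what the linked questions recommend, is to decompose the text with java.text.Normalizer and drop the combining marks. The class name AccentStripper and method name toAscii are assumptions made for this example.

```java
import java.text.Normalizer;

// Hypothetical helper class; the names are illustrative.
final class AccentStripper {
    static String toAscii(String input) {
        // NFD splits an accented letter such as U+00E9 into the base letter "e"
        // followed by a combining acute accent (U+0301).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches combining marks; removing them leaves the base letters.
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        // Anything still outside ASCII (letters with no decomposition, symbols) is dropped.
        return withoutMarks.replaceAll("[^\\p{ASCII}]+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file pure ASCII.
        System.out.println(toAscii("Cr\u00e8me br\u00fbl\u00e9e")); // prints "Creme brulee"
    }
}
```

Letters that have no canonical decomposition are simply removed by the last step, so this is lossy for some alphabets.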

#2


56  

I use this:

s = s.replaceAll("\\W", ""); 

It removes all special characters from the string. Here:

\w : A word character, short for [a-zA-Z_0-9]

\W : A non-word character, short for [^\w]

Note that the underscore counts as a word character, so it is kept.
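
Because \w includes the underscore, \W leaves underscores in place. The sketch below shows the difference; the class name WordCharDemo and its method names are invented for this example.

```java
// Hypothetical demo class; the names are illustrative.
final class WordCharDemo {
    // Removes everything except [a-zA-Z_0-9]; underscores survive.
    static String stripNonWord(String s) {
        return s.replaceAll("\\W", "");
    }

    // Adds the underscore to the removed set, leaving only letters and digits.
    static String stripNonWordAndUnderscore(String s) {
        return s.replaceAll("[\\W_]", "");
    }

    public static void main(String[] args) {
        String s = "user_name@example.com";
        System.out.println(stripNonWord(s));              // prints "user_nameexamplecom"
        System.out.println(stripNonWordAndUnderscore(s)); // prints "usernameexamplecom"
    }
}
```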

#3


5  

You can use the following to keep only alphanumeric characters:

String alphanumeric = input.replaceAll("[^a-zA-Z0-9]", "");

And if you want to keep only alphabetical characters, use this:

String alphaOnly = input.replaceAll("[^a-zA-Z]", "");

#4


1  

In C#, using Regex.Replace:

string Output = Regex.Replace(Input, @"[^a-zA-Z0-9&, _]", "");

Here all special characters except the space, comma, ampersand, and underscore are removed. To remove spaces, commas, and ampersands as well, use the following regular expression:

string Output = Regex.Replace(Input, @"[^a-zA-Z0-9_]", "");

Here Input is the string whose characters are to be replaced.

#5


0  

You can use basic regular expressions on strings to find all special characters, or use the Pattern and Matcher classes to search, modify, or delete user-defined strings. This link has some simple, easy-to-understand examples of regular expressions: http://www.vogella.de/articles/JavaRegularExpressions/article.html
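
As a rough illustration of the Pattern and Matcher approach, the sketch below collects every special character in a string instead of deleting it. The class name SpecialCharFinder, the method name findSpecial, and the choice of "anything that is not an ASCII letter or digit" as the definition of special are all assumptions for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative.
final class SpecialCharFinder {
    // Assumption: "special" means anything other than an ASCII letter or digit.
    private static final Pattern SPECIAL = Pattern.compile("[^a-zA-Z0-9]");

    // Returns each special character in the order it appears.
    static List<String> findSpecial(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = SPECIAL.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findSpecial("a+b=c!")); // prints "[+, =, !]"
    }
}
```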

#6


0  

You can look up the Unicode code point of the junk character with the Character Map tool on a Windows PC and write it as a \u escape, e.g. \u00a9 for the copyright symbol. You can then use the string with that particular character: rather than removing the junk character, replace it with the proper Unicode escape.

#7


0  

To keep spaces as well, use this pattern: "[^a-z A-Z 0-9]"
