How do I remove special characters from a string?

Time: 2021-08-13 22:16:32

I want to remove special characters like:

- + ^ . : ,

from a String using Java.

8 answers

#1


205  

That depends on what you define as special characters, but try replaceAll(...):

String result = yourString.replaceAll("[-+.^:,]","");

Note that the ^ character must not be the first one in the list, since you'd then either have to escape it or it would mean "any but these characters".

Another note: the - character needs to be the first or last one in the list; otherwise you'd have to escape it or it would define a range (e.g. :-, would be read as "all characters in the range : to ,", which Java rejects as an illegal range because : sorts after ,).

So, in order to keep consistency and not depend on character positioning, you might want to escape all those characters that have a special meaning in regular expressions (the following list is not complete, so be aware of other characters like (, {, $ etc.):

String result = yourString.replaceAll("[\\-\\+\\.\\^:,]","");
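
As a quick check of the notes above, here is a minimal sketch that runs both character classes against the same input and also tests whether a misplaced - compiles at all. The class name CharClassDemo, the helper names and the sample string are made up for illustration; only replaceAll and the two patterns come from this answer.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class CharClassDemo {
    // Relies on position: '-' is first and '^' is not first.
    static String stripPositional(String s) {
        return s.replaceAll("[-+.^:,]", "");
    }

    // Escapes the characters that could be special inside a character class.
    static String stripEscaped(String s) {
        return s.replaceAll("[\\-\\+\\.\\^:,]", "");
    }

    // Reports whether java.util.regex accepts the given pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String input = "a-b+c^d.e:f,g";
        System.out.println(stripPositional(input)); // abcdefg
        System.out.println(stripEscaped(input));    // abcdefg
        System.out.println(compiles("[:-,]"));      // false: ':' to ',' is a reversed range
    }
}
```

Both variants give the same result here; the escaped one just keeps working if someone later reorders the characters.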


If you want to get rid of all punctuation and symbols, try this regex: [\p{P}\p{S}] (keep in mind that in Java strings you'd have to escape the backslashes: "[\\p{P}\\p{S}]"). The square brackets matter: without them, \p{P}\p{S} would only match a punctuation character immediately followed by a symbol.
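
A small sketch of the punctuation-and-symbols approach, with the two Unicode properties wrapped in a character class so that either one matches on its own (written back to back without brackets, the pattern would require a punctuation character directly followed by a symbol). The class and method names and the sample text are illustrative assumptions.

```java
final class PunctuationStripper {
    // Removes any single character that is punctuation (\p{P}) or a symbol (\p{S}).
    static String stripPunctuationAndSymbols(String s) {
        return s.replaceAll("[\\p{P}\\p{S}]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripPunctuationAndSymbols("Hello, World! 1+1=2 (ok)"));
        // Hello World 112 ok
    }
}
```

Note that digits and spaces survive, since they are neither punctuation nor symbols.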

A third way could be something like this, if you can exactly define what should be left in your string:

String  result = yourString.replaceAll("[^\\w\\s]","");

This means: remove everything that is neither a word character (a-z in either case, 0-9 or _) nor whitespace.
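
One thing worth checking before relying on \w: in java.util.regex it is ASCII-only by default, so accented letters are removed as well. The sketch below contrasts the default behaviour with the embedded (?U) flag (Pattern.UNICODE_CHARACTER_CLASS). The class name, method names and sample string are assumptions for illustration.

```java
final class WordCharDemo {
    // Default \w and \s: ASCII letters, digits, underscore and ASCII whitespace only.
    static String keepAsciiWordChars(String s) {
        return s.replaceAll("[^\\w\\s]", "");
    }

    // (?U) switches \w and \s to their Unicode-aware definitions.
    static String keepUnicodeWordChars(String s) {
        return s.replaceAll("(?U)[^\\w\\s]", "");
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the final e, written as an escape
        String input = "caf\u00e9 au_lait #1!";
        System.out.println(keepAsciiWordChars(input));   // accented e is dropped
        System.out.println(keepUnicodeWordChars(input)); // accented e is kept
    }
}
```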

Edit: please note that there are a couple of other patterns that might prove helpful. However, I can't explain them all, so have a look at the reference section of regular-expressions.info.

Here's a less restrictive alternative to the "define allowed characters" approach, as suggested by Ray:

String  result = yourString.replaceAll("[^\\p{L}\\p{Z}]","");

The regex matches everything that is neither a letter in any language nor a separator (Unicode category Z: spaces plus the line and paragraph separators U+2028 and U+2029; tab and the ordinary line break characters are control characters and are not part of \p{Z}). Note that you can't use [\P{L}\P{Z}] (upper case P means not having that property), since that would mean "everything that is not a letter or not a separator", which matches everything, since letters are not separators and vice versa.
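
To make the difference between the negated class and the two upper-case properties concrete, here is a short sketch. The class name, method names and inputs are invented for the example.

```java
final class LetterSeparatorDemo {
    // Keeps letters (any script) and Unicode separators; removes everything else.
    static String keepLettersAndSeparators(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    // Union of "not a letter" and "not a separator": every character
    // belongs to at least one of the two sets, so everything is removed.
    static String unionOfNegations(String s) {
        return s.replaceAll("[\\P{L}\\P{Z}]", "");
    }

    public static void main(String[] args) {
        String input = "abc, 123 def!";
        System.out.println("[" + keepLettersAndSeparators(input) + "]"); // [abc  def]
        System.out.println("[" + unionOfNegations(input) + "]");         // []
        // A tab is a control character, not a separator, so it is removed too.
        System.out.println("[" + keepLettersAndSeparators("a\tb") + "]"); // [ab]
    }
}
```

If tabs and line breaks should survive, adding \s to the allowed set ("[^\\p{L}\\p{Z}\\s]") is one option.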

Additional information on Unicode

Some Unicode characters seem to cause problems because they can be encoded in more than one way (as a single code point or as a combination of code points). Please refer to regular-expressions.info for more information.
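
One way this can show up, sketched under the assumption that the input may contain decomposed characters: an "e" followed by a combining acute accent (U+0301) is a letter plus a mark, so the letter-only pattern strips the accent. Normalizing to NFC with java.text.Normalizer first is one possible workaround, not the only one. The class and method names are hypothetical.

```java
import java.text.Normalizer;

final class NormalizeBeforeStrip {
    // Same pattern as above: keep only letters and separators.
    static String stripNonLetters(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    // Compose letter + combining mark into a single code point where possible, then strip.
    static String normalizeThenStrip(String s) {
        return stripNonLetters(Normalizer.normalize(s, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String decomposed = "cafe\u0301!"; // 'e' followed by a combining acute accent
        System.out.println(stripNonLetters(decomposed).length());    // 4: the accent is lost
        System.out.println(normalizeThenStrip(decomposed).length()); // 4: the accented e is one code point
    }
}
```

Allowing marks explicitly ("[^\\p{L}\\p{M}\\p{Z}]") would be another way to keep the accent.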

#2


14  

As described at http://developer.android.com/reference/java/util/regex/Pattern.html:

Patterns are compiled regular expressions. In many cases, convenience methods such as String.matches, String.replaceAll and String.split will be preferable, but if you need to do a lot of work with the same regular expression, it may be more efficient to compile it once and reuse it. The Pattern class and its companion, Matcher, also offer more functionality than the small amount exposed by String.

import java.util.regex.Pattern;

public class RegularExpressionTest {

    // Compiled once and reused, as the documentation quoted above suggests.
    private static final Pattern NON_DIGITS = Pattern.compile("[^0-9]");
    private static final Pattern NON_LETTERS_OR_SPACES = Pattern.compile("[^a-z A-Z]");

    public static void main(String[] args) {
        System.out.println("String is = " + getOnlyStrings("!&(*^*(^(+one(&(^()(*)(*&^%$#@!#$%^&*()("));
        System.out.println("Number is = " + getOnlyDigits("&(*^*(^(+91-&*9hi-639-0097(&(^("));
    }

    // Removes every character that is not an ASCII digit.
    public static String getOnlyDigits(String s) {
        return NON_DIGITS.matcher(s).replaceAll("");
    }

    // Removes every character that is not an ASCII letter or a space.
    public static String getOnlyStrings(String s) {
        return NON_LETTERS_OR_SPACES.matcher(s).replaceAll("");
    }
}

Result

String is = one
Number is = 9196390097

#3


13  

Try the replaceAll() method of the String class.

BTW, here is the method with its return type and parameters:

public String replaceAll(String regex,
                         String replacement)

Example:

String str = "Hello +-^ my + - friends ^ ^^-- ^^^ +!";
// "+" rather than "*" so the pattern never matches the empty string
str = str.replaceAll("[-+^]+", "");

It should remove all the {'^', '+', '-'} chars that you wanted to remove!

#4


2  

Use the String.replaceAll() method in Java. replaceAll should be good enough for your problem.

#5


2  

To remove special characters:

String t2 = "!@#$%^&*()-';,./?><+abdd";

t2 = t2.replaceAll("\\W+", "");

The output will be: abdd

This works perfectly.
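
One caveat that may matter depending on the input: \W is the complement of \w, so digits and underscores are kept. A minimal sketch, with made-up class and method names, showing the difference when the underscore should go as well:

```java
final class NonWordDemo {
    // Removes runs of non-word characters; letters, digits and '_' survive.
    static String stripNonWord(String s) {
        return s.replaceAll("\\W+", "");
    }

    // Also treats '_' as a character to remove.
    static String stripNonAlphanumeric(String s) {
        return s.replaceAll("[\\W_]+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("snake_case-42!"));         // snake_case42
        System.out.println(stripNonAlphanumeric("snake_case-42!")); // snakecase42
    }
}
```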

#6


1  

You can remove a single character as follows:

String str = "+919595354336";

// "\\+" is the regex \+, i.e. a literal plus sign
String result = str.replaceAll("\\+", "");

System.out.println(result);

OUTPUT:

919595354336

#7


1  

This will remove all characters except ASCII letters and digits:

replaceAll("[^A-Za-z0-9]","");

#8


0  

If you just want to do a literal replace in Java, use Pattern.quote(string) to escape any string so that it is matched literally.

myString.replaceAll(Pattern.quote(matchingStr), replacementStr)
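
A minimal sketch of the literal-replace idea. Pattern.quote only protects the search string; the replacement has its own special characters ($ and \), which Matcher.quoteReplacement handles. The helper name replaceLiteral and the class name are assumptions for this example, and String.replace is shown as the simpler alternative when no regex is needed at all.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralReplaceDemo {
    // Treats both the target and the replacement as plain text.
    static String replaceLiteral(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceLiteral("1+1=2", "+", "-"));    // 1-1=2
        System.out.println(replaceLiteral("cost: X", "X", "$5")); // cost: $5
        // String.replace(CharSequence, CharSequence) is literal by design.
        System.out.println("1+1=2".replace("+", "-"));            // 1-1=2
    }
}
```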
