Remove all special characters with RegExp

Date: 2021-10-19 00:23:34

I would like a RegExp that will remove all special characters from a string. I am trying something like this but it doesn’t work in IE7, though it works in Firefox.

var specialChars = "!@#$^&%*()+=-[]\/{}|:<>?,.";

for (var i = 0; i < specialChars.length; i++) {
  stringToReplace = stringToReplace.replace(new RegExp("\\" + specialChars[i], "gi"), "");
}

A detailed description of the RegExp would be helpful as well.

8 answers

#1


440  

var desired = stringToReplace.replace(/[^\w\s]/gi, '')

As mentioned in the comments, it is easier to do this with a whitelist: replace the characters that aren't in your safelist.

The caret (^) at the start of the set [...] negates it, so the pattern matches every character that is not listed. The flags gi mean global and case-insensitive (the latter is redundant here, but I wanted to mention it). The safelist in this example is word characters (\w: ASCII letters, digits, and the underscore) and whitespace (\s).
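
As a minimal sketch of the whitelist approach, the snippet below wraps the pattern in a helper. The name stripSpecialChars and the sample input are assumptions made for illustration only.

```javascript
// Hypothetical helper; only the pattern comes from the answer above.
function stripSpecialChars(input) {
  // [^\w\s] matches any character that is neither a word character
  // (ASCII letter, digit, underscore) nor whitespace.
  return input.replace(/[^\w\s]/g, '');
}

console.log(stripSpecialChars('Hello, World! (v2.0) #tag_1'));
// prints "Hello World v20 tag_1"
```

The i flag is left out here because the class contains no cased literals, so it would not change the result.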

#2


63  

If you would rather remove a specific set of characters (a blacklist), including slashes and other special characters, you can do the following:

var outString = sourceString.replace(/[`~!@#$%^&*()_|+\-=?;:'",.<>\{\}\[\]\\\/]/gi, '');

Note that to include the minus (hyphen) character in the set, you need to escape it with a backslash, like the characters at the end of the set. If you don't, +-= is read as a range, which also matches the digits 0-9, and that is probably not what you want.
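
The effect of the unescaped hyphen can be checked with a small sketch. The variable names are hypothetical, and the two patterns are cut down to the three characters that matter.

```javascript
// "+-=" is a range from "+" (U+002B) to "=" (U+003D), which contains 0-9.
const unescaped = /[+-=]/;
// With the backslash, the class holds three literal characters: "+", "-", "=".
const escaped = /[+\-=]/;

console.log(unescaped.test('5')); // true: digits fall inside the range
console.log(escaped.test('5'));   // false
console.log(escaped.test('-'));   // true
```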

#3


9  

In plain JavaScript regex, \w matches only ASCII letters, digits, and the underscore; it does not match Unicode letters. Do not use [^\w\s]: it removes accented letters (like àèéìòù), not to mention Cyrillic or Chinese, whose letters are removed completely.

You really don't want to remove these letters along with the special characters. You have two options:

  • Add to your regex all the non-ASCII letters you don't want to remove,
    for example: [^èéòàùì\w\s].
  • Have a look at xregexp.com. XRegExp adds base support for Unicode matching via the \p{...} syntax.

var str = "Їжак::: résd,$%& adùf";
// \pL matches any Unicode letter, so this removes everything except letters and spaces.
var search = XRegExp('[^\\pL ]+');
var res = XRegExp.replace(str, search, '', 'all');

console.log(res); // "Їжак résd adùf"
console.log(str.replace(/[^\w\s]/gi, '')); // " rsd adf"
console.log(str.replace(/[^\wèéòàùì\s]/gi, '')); // " résd adùf"
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/3.1.1/xregexp-all.js"></script>
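
Where a library is not an option, engines that implement ES2018 Unicode property escapes can do something similar natively. The following is a sketch under that assumption and is not part of the original answer: it throws a SyntaxError in older engines, and the decision to also keep combining marks (\p{M}) and numbers (\p{N}) is an assumption about what counts as non-special.

```javascript
const str = 'Їжак::: résd,$%& adùf';

// With the u flag, \p{L} matches any Unicode letter, \p{M} any combining mark,
// and \p{N} any number. Everything else except whitespace is removed.
const cleaned = str.replace(/[^\p{L}\p{M}\p{N}\s]/gu, '');

console.log(cleaned); // prints "Їжак résd adùf"
```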

#4


3  

The first solution does not work for non-ASCII alphabets (it strips text such as Їжак). I wrote a function that does not use RegExp and relies on the JavaScript engine's Unicode support instead. The idea is simple: if a symbol is the same in uppercase and lowercase, it is a special character. The only exception is whitespace.

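function removeSpecials(str) {
    var res = "";
    for (var i = 0; i < str.length; ++i) {
        var ch = str.charAt(i);
        // Compare case per character: converting the whole string first can
        // change its length (e.g. "ß" uppercases to "SS") and misalign indexes.
        var hasCase = ch.toLowerCase() !== ch.toUpperCase();
        // Keep cased letters and whitespace; drop everything else.
        if (hasCase || /\s/.test(ch))
            res += ch;
    }
    return res;
}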
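
To see what the case-comparison idea classifies as special, here is a small sketch. The helper name isSpecialByCase and the sample characters are assumptions. Note that characters without case, such as digits and CJK ideographs, are also treated as special.

```javascript
// Hypothetical helper: true when the case heuristic would treat ch as special.
function isSpecialByCase(ch) {
  const hasCase = ch.toLowerCase() !== ch.toUpperCase();
  // Whitespace is the only caseless character that is kept.
  return !hasCase && !/\s/.test(ch);
}

for (const ch of ['ж', 'é', '!', ' ', '7', '中']) {
  console.log(JSON.stringify(ch), isSpecialByCase(ch));
}
// ж and é: false, "!": true, " ": false, "7" and "中": true
```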

#5


1  

Why don't you do something like this:

var re = /^[a-z0-9 ]*$/i;
var isValid = re.test(yourInput);

This checks whether your input contains any special characters: isValid is true only when it contains none.
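
A short sketch of this validation style follows. The name allowedOnly and the sample strings are assumptions. The ^ and $ anchors together with the * quantifier make the test cover the whole string, so the empty string also passes.

```javascript
// Matches only strings made up entirely of ASCII letters, digits, and spaces.
const allowedOnly = /^[a-z0-9 ]*$/i;

console.log(allowedOnly.test('Hello World 42')); // true
console.log(allowedOnly.test('Hello, World!'));  // false
```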

#6


1  

I use RegexBuddy for debugging my regexes. It supports almost all languages, which is very useful: you build the regex, then copy and paste it into the target language. It is a terrific tool and not very expensive.

So I copied and pasted your regex, and your issue is that [ and ] are special characters in regex, so you need to escape them. The regex should be: /!@#$^&%*()+=-[\x5B\x5D]\/{}|:<>?,./im

#7


0  

I did something like this: str.replace(/\s|[0-9_]|\W|[#$%^&*()]/g, ""). But some people do it much more simply, like str.replace(/[\W_]/g, "");
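
Assuming the shorter pattern is meant as the character class [\W_], a quick sketch shows the difference from the unbracketed form. The sample strings and variable names are made up for illustration.

```javascript
// [\W_] matches any single non-word character or an underscore,
// so only ASCII letters and digits survive.
const lettersAndDigits = 'snake_case, 42 items!'.replace(/[\W_]/g, '');
console.log(lettersAndDigits); // prints "snakecase42items"

// Without brackets, \W_ matches a non-word character followed by an underscore,
// which does not occur in this input, so nothing is removed.
const unchanged = 'a_b!'.replace(/\W_/g, '');
console.log(unchanged); // prints "a_b!"
```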

#8


-9  

use regex ^[^/\\()~!@#$%^&*{«»„““”‘’|\n\t….,;`^"<>'}+:?®©]*$
