
Time: 2022-09-15 15:35:54

Here's a short piece of code:


var utility = {
    escapeQuotes: function(string) {
        return string.replace(new RegExp('"', 'g'),'\\"');
    },
    unescapeQuotes: function(string) {
        return string.replace(new RegExp('\\"', 'g'),'"');
    }
};

var a = 'hi "';

var b = utility.escapeQuotes(a);
var c = utility.unescapeQuotes(b);

console.log(b + ' | ' + c);

I would expect this code to work, however as a result I receive:


hi \" | hi \"

If I change the first parameter of the new RegExp constructor in the unescapeQuotes method to 4 backslashes everything starts working as it should.


string.replace(new RegExp('\\\\"', 'g'),'"');

The result:

hi \" | hi " 

Why are four backslashes needed as the first parameter of the new RegExp constructor? Why doesn't it work with only 2 of them?


1 Solution



The problem is that you're using the RegExp constructor, which accepts a string, rather than using a regular expression literal. So in this line in your unescape:


return string.replace(new RegExp('\\"', 'g'),'"');

...the \\ is interpreted by the JavaScript parser as part of handling the string literal, so only a single backslash is handed to the regular expression parser. The expression the regular expression parser sees is therefore \". The backslash is an escape character in regex, too, but \" doesn't mean anything special and just ends up matching ". To have an actual backslash in a regex, you have to have two of them; to do that in a string literal, you have to have four (so they survive both layers of interpretation).
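The two layers can be made visible by inspecting the string that reaches the RegExp constructor and the pattern the regex engine ends up with. This is only a small sketch; the variable names (twoSlashes, fourSlashes, reTwo, reFour, escaped) are made up for illustration and are not part of the question's code.

```javascript
// Layer 1: the string literal. '\\"' holds 2 characters (\ and "),
// '\\\\"' holds 3 characters (\, \ and ").
const twoSlashes = '\\"';
const fourSlashes = '\\\\"';
console.log(twoSlashes.length);   // 2
console.log(fourSlashes.length);  // 3

// Layer 2: the regex parser. \" is just an escaped quote,
// while \\" is a literal backslash followed by a quote.
const reTwo = new RegExp(twoSlashes, 'g');
const reFour = new RegExp(fourSlashes, 'g');
console.log(reTwo.source);   // \"
console.log(reFour.source);  // \\"

// The escaped text from the question: hi \"
const escaped = 'hi \\"';
console.log(escaped.replace(reTwo, '"'));   // hi \"  (only the quote matched)
console.log(escaped.replace(reFour, '"'));  // hi "   (backslash + quote matched)
```

The two-backslash version replaces a quote with a quote, which is why the output in the question looks unchanged.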


Unless you have a very good reason to use the RegExp constructor (e.g., you have to use some varying input), always use the literal form:


var utility = {
    escapeQuotes: function(string) {
        return string.replace(/"/g, '\\"');
    },
    unescapeQuotes: function(string) {
        return string.replace(/\\"/g, '"');
    }
};

It's a lot less confusing.
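For the "varying input" case, where the constructor really is needed, one possible approach is sketched below. Both helpers (escapeRegExp and unescapeChar) are hypothetical names introduced here, not something from the question, and the metacharacter list is the commonly used one rather than anything the original code relied on.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in `text`
// so it can be embedded safely in a pattern string.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: strip the backslash in front of each occurrence of `ch`.
function unescapeChar(string, ch) {
    // '\\\\' is the two-character string \\ , which the regex parser
    // reads as one literal backslash.
    var pattern = new RegExp('\\\\' + escapeRegExp(ch), 'g');
    // A function replacer avoids special handling of $ in replacement strings.
    return string.replace(pattern, function() { return ch; });
}

console.log(unescapeChar('hi \\"', '"'));  // hi "
console.log(unescapeChar('a\\.b', '.'));   // a.b
```

Even here the four backslashes are confined to one place, and everything that does not vary can still be written as a literal.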



