Can someone help me understand the exec method of regular expressions?

Time: 2021-11-13 22:44:35

The best explanation I have found of the exec method is in Eloquent JavaScript, Chapter 9:

"Regular expressions also have an exec (execute) method that will return null if no match was found and return an object with information about the match otherwise. An object returned from exec has an index property that tells us where in the string the successful match begins. Other than that, the object looks like (and in fact is) an array of strings, whose first element is the string that was matched...."

So far this makes sense but then it gets a bit confusing:

"When the regular expression contains subexpressions grouped with parentheses, the text that matched those groups will also show up in the array. The whole match is always the first element."

Okay, but...

"The next element is the part matched by the first group (the one whose opening parenthesis comes first in the expression), then the second group, and so on."

var quotedText = /'([^']*)'/;
console.log(quotedText.exec("she said 'hello'"));
// → ["'hello'", "hello"]

My confusion is with the repeated hello in this example. I don't understand why it would give me two hellos back?

And then the topic is wrapped up with the following:

"When a group does not end up being matched at all (for example, when followed by a question mark), its position in the output array will hold undefined. Similarly, when a group is matched multiple times, only the last match ends up in the array."

console.log(/bad(ly)?/.exec("bad"));
// → ["bad", undefined]
console.log(/(\d)+/.exec("123"));
// → ["123", "3"]

This last sentence and example keep me confused....

Any light shed on this would be much appreciated!

1 solution

#1


7  

I don't understand why it would give me two hellos back?

Because the first entry in the array is the overall match for the expression, which is then followed by the content of any capture groups the expression defines. Since the expression defines one capture group, you get back two entries. The overall match is 'hello' (with the single quotes), and the capture group is hello (without them), because in the regular expression, only the hello is in the capture group (the parentheses), while the ' are outside it:

 vvvvvvvvv----- Overall expression
/'([^']*)'/
  ^^^^^^^------ Capture group
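
As a minimal sketch of the same point, the result can be pulled apart by position. The variable names here (`match`, `whole`, `inner`, `noMatch`) are mine and only for illustration:

```javascript
// Same pattern as in the question: the quotes sit outside the group,
// the quoted text sits inside it.
const quotedText = /'([^']*)'/;
const match = quotedText.exec("she said 'hello'");

const whole = match[0]; // overall match, quotes included
const inner = match[1]; // first capture group, quotes excluded

console.log(whole);        // → 'hello'
console.log(inner);        // → hello
console.log(match.index);  // → 9 (where the overall match starts)
console.log(match.length); // → 2 (overall match plus one capture group)

// exec returns null when nothing matches, so check before indexing.
const noMatch = quotedText.exec("no quotes here");
console.log(noMatch);      // → null
```

So the two "hellos" are really two different strings: one with the quotes, one without.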

Let's look at that /bad(ly)?/ example: What it says is "match bad optionally followed by ly, capturing the ly if it's there." So you get:

console.log(/bad(ly)?/.exec("bad"));
// -> ["bad", undefined]
//     ^      ^
//     |      +--- first capture group has nothing in it
//     +---------- overall match is "bad"
console.log(/bad(ly)?/.exec("badly"));
// -> ["badly", "ly"]
//     ^        ^
//     |        +- first capture group has "ly"
//     +---------- overall match is "badly"
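
Since the slot for an optional group is always present and simply holds undefined when the group did not take part in the match, code that reads it usually checks for that. A small sketch, assuming a hypothetical helper named `suffixOf` (not something from the book):

```javascript
// Hypothetical helper: reports what the optional (ly) group captured.
function suffixOf(word) {
  const m = /bad(ly)?/.exec(word);
  if (m === null) {
    return null; // the expression did not match at all
  }
  // m[1] is undefined when the optional group matched nothing.
  return m[1] === undefined ? "(none)" : m[1];
}

console.log(suffixOf("bad"));   // → (none)
console.log(suffixOf("badly")); // → ly
console.log(suffixOf("good"));  // → null
```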

Suppose we put the l and y in individual capture groups, and make both of them optional:

console.log(/bad(l)?(y)?/.exec("bad"));
// -> ["bad", undefined, undefined]
//     ^      ^          ^
//     |      |          +--- Nothing in the second capture group
//     |      +-------------- Nothing in the first capture group
//     +--------------------- Overall match is "bad"
console.log(/bad(l)?(y)?/.exec("badly"));
// -> ["badly", "l", "y"]
//     ^        ^    ^
//     |        |    +------- Second capture group has "y"
//     |        +------------ First capture group has "l"
//     +--------------------- Overall match is "badly"
console.log(/bad(l)?(y)?/.exec("badl"));
// -> ["badl", "l", undefined]
//     ^       ^    ^
//     |       |    +-------- Second capture group has nothing in it
//     |       +------------- First capture group has "l"
//     +--------------------- Overall match is "badl"
console.log(/bad(l)?(y)?/.exec("bady"));
// -> ["bady", undefined, "y"]
//     ^       ^          ^
//     |       |          +-- Second capture group has "y"
//     |       +------------- First capture group has nothing in it
//     +--------------------- Overall match is "bady"
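
The question's last example, /(\d)+/, follows the same layout, with the twist the book quote describes: a group that repeats keeps only its final repetition. A rough sketch (variable names are only for illustration) that contrasts a quantifier outside the group with one inside it:

```javascript
// Quantifier outside the group: the group matches "1", then "2", then "3",
// and only the last repetition stays in slot 1.
const repeated = /(\d)+/.exec("123");
console.log(repeated[0]); // → 123
console.log(repeated[1]); // → 3

// Quantifier inside the group: the group matches once and spans all digits.
const spanning = /(\d+)/.exec("123");
console.log(spanning[0]); // → 123
console.log(spanning[1]); // → 123
```

In both cases the array has exactly two entries, because the number of slots depends on how many groups the expression defines, not on how many times a group matched.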
