What is the difference between () and [] in a regular expression pattern?

Time: 2021-12-14 22:35:33

What is the difference between encasing part of a regular expression in () (parentheses) and doing it in [] (square brackets)?

How does this:

[a-z0-9]

differ from this:

(a-z0-9)

?

6 answers

#1


32  

[] denotes a character class. () denotes a capturing group.

[a-z0-9] -- One character that is in the range a-z OR 0-9

(a-z0-9) -- Explicit capture of the literal text a-z0-9. No ranges.

a -- Can be matched by [a-z0-9].

a-z0-9 -- Can be captured by (a-z0-9) and then referenced in a replacement and/or later in the expression.
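A quick way to check the distinction is to run both patterns against the same inputs. The sketch below assumes Python's `re` module, since the question does not name a regex flavour; the variable names and the sample string `"x a-z0-9 y"` are made up for illustration.

```python
import re

# [a-z0-9] is a character class: exactly one character from either range.
char_class = re.compile(r"[a-z0-9]")
# (a-z0-9) is a capturing group around the literal text "a-z0-9".
group = re.compile(r"(a-z0-9)")

one_char = char_class.fullmatch("a") is not None
class_on_literal = char_class.fullmatch("a-z0-9") is not None
group_on_literal = group.fullmatch("a-z0-9") is not None
group_on_char = group.fullmatch("a") is not None

# The captured text can be reused in a replacement via \1.
wrapped = group.sub(r"<\1>", "x a-z0-9 y")

print(one_char, class_on_literal, group_on_literal, group_on_char)
print(wrapped)
```

The class accepts the single character and rejects the longer string, while the group does the opposite.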

#2


13  

(…) is a group: it groups its contents, like parentheses in math; (a-z0-9) is the grouped sequence a-z0-9. Groups are particularly useful with quantifiers, because they allow the preceding expression to be repeated as a whole: a*b* matches any number of a's followed by any number of b's, e.g. a, aaab, bbbbb, etc.; in contrast, (ab)* matches any number of ab's, e.g. ab, abababab, etc.

[…] is a character class that describes the options for one single character; [a-z0-9] describes one single character that can be in the range a-z or 0-9.
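The effect of grouping on a quantifier can be seen by testing the two patterns from this answer against the same strings. This is a minimal sketch using Python's `re` module (an assumption, as the question is flavour-agnostic); the list `samples` simply collects the example strings mentioned above.

```python
import re

separate = re.compile(r"a*b*")  # each * applies to a single letter
grouped = re.compile(r"(ab)*")  # the * applies to the whole group "ab"

samples = ["a", "aaab", "bbbbb", "ab", "abababab"]

# fullmatch requires the entire string to fit the pattern.
separate_hits = [s for s in samples if separate.fullmatch(s)]
grouped_hits = [s for s in samples if grouped.fullmatch(s)]

print(separate_hits)
print(grouped_hits)
```

Note that "ab" fits both patterns, while "abababab" fits only the grouped one.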

#3


9  

The [] construct in a regex is essentially shorthand for an alternation (|) over each of its contents. For example, [abc] matches a, b or c. Additionally, the - character has special meaning inside a []: it provides a range construct. The regex [a-z] will match any letter a through z.

The () construct is a grouping construct that establishes precedence (it also affects how you access matched substrings, but that's a more advanced topic). The regex (abc) will match the string "abc".
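Both points can be checked directly: the class behaves like a single-character alternation, and parentheses change how far an | reaches. The following is a sketch in Python's `re` module (chosen as an assumption); the patterns `ab|cd` and `a(b|c)d` are illustrative and do not come from the answer.

```python
import re

# [abc] and a|b|c accept the same single characters.
candidates = ["a", "b", "c", "d", "ab"]
class_hits = [s for s in candidates if re.fullmatch(r"[abc]", s)]
alt_hits = [s for s in candidates if re.fullmatch(r"a|b|c", s)]

# Without parentheses, | splits the whole pattern: "ab" or "cd".
# With parentheses, | is confined to the group: "a", then "b" or "c", then "d".
words = ["ab", "cd", "abd", "acd"]
ungrouped_hits = [s for s in words if re.fullmatch(r"ab|cd", s)]
grouped_hits = [s for s in words if re.fullmatch(r"a(b|c)d", s)]

print(class_hits, alt_hits)
print(ungrouped_hits, grouped_hits)
```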

#4


4  

[a-z0-9] will match any single lowercase letter or digit. (a-z0-9) will match the exact string "a-z0-9" and allows two additional things: you can apply quantifiers like *, ? and + to the whole group, and you can reference the captured text after the match with $1 or \1. Not useful with your example, though.
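Here is a small sketch of those two extras, assuming Python's `re` module. Python writes back-references as `\1` (or `\g<1>` in replacements); the `$1` spelling belongs to other flavours such as Perl and JavaScript. The pattern `([a-z0-9])\1` and the sample strings are illustrative choices, not part of the answer.

```python
import re

# A quantifier after the group repeats the whole literal "a-z0-9".
repeated = re.fullmatch(r"(a-z0-9)+", "a-z0-9a-z0-9") is not None

# \1 inside the pattern must match the same text that group 1 captured,
# so this finds a character that appears twice in a row.
doubled = re.compile(r"([a-z0-9])\1")
first_double = doubled.search("abccd9")
no_double = doubled.search("abcd")

print(repeated)
print(first_double.group(0), first_double.group(1))
print(no_double)
```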

#5


0  

Try ([a-z0-9]+) to capture a mixed string of lowercase letters and digits for back references (or extraction). Without the +, ([a-z0-9]) captures just a single character.
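Where the quantifier sits relative to the parentheses decides what ends up in the group. The sketch below uses Python's `re` module (an assumption) and a made-up input string `text` to show the difference between capturing a whole run and capturing one character at a time.

```python
import re

text = "id=abc123; tag=x9"

# Quantifier inside the group: the group captures the whole run.
runs = re.findall(r"=([a-z0-9]+)", text)

# Quantifier outside the group: the group is overwritten on each
# repetition, so only the last character remains captured.
m = re.search(r"=([a-z0-9])+", text)
whole_match = m.group(0)
last_char = m.group(1)

print(runs)
print(whole_match, last_char)
```

For extraction, the form with the + inside the parentheses is usually the one you want.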

#6


-1  

[a-z0-9] will match one of abcdefghijklmnopqrstuvwxyz0123456789. In other words, square brackets match exactly one character.

(a-z0-9) will match the literal six-character string a-z0-9, just as if the parentheses weren't there; outside square brackets the hyphen does not form a range. The () will allow you to read exactly which characters were matched. Parentheses are also useful for OR'ing two expressions with the bar | character. For example, ([a-z]|[0-9]) will match one character -- any lowercase letter or digit.
