What is the difference between enclosing part of a regular expression in () (parentheses) and enclosing it in [] (square brackets)?
How does this:
[a-z0-9]
differ from this:
(a-z0-9)
?
6 Answers
#1
32
[] denotes a character class. () denotes a capturing group.
[a-z0-9]: matches one character that is in the range a-z or 0-9.
(a-z0-9): captures the literal text a-z0-9. No ranges.
a: can be matched by [a-z0-9].
a-z0-9: can be matched and captured by (a-z0-9), and then referenced in a replacement and/or later in the expression.
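The distinction above can be checked with a minimal sketch using Python's standard re module (the choice of Python is an assumption; the question does not name a regex flavor). It uses re.fullmatch so that the whole input has to match the pattern.

```python
import re

# [a-z0-9] is a character class: exactly one character from a-z or 0-9.
assert re.fullmatch(r"[a-z0-9]", "a") is not None
assert re.fullmatch(r"[a-z0-9]", "a-z0-9") is None  # seven characters, not one

# (a-z0-9) is a group around the literal text a-z0-9; the hyphens are not ranges.
m = re.fullmatch(r"(a-z0-9)", "a-z0-9")
assert m is not None
captured = m.group(1)  # text captured by the first (and only) group
assert re.fullmatch(r"(a-z0-9)", "a") is None

print(captured)
```

Here group(1) returns the text captured by the parentheses, which is what makes later references to the group possible.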
#2
13
(…) is a group that groups its contents, like parentheses in math; (a-z0-9) is the grouped sequence a-z0-9. Groups are particularly useful with quantifiers, which then repeat the whole group rather than just the preceding character: a*b* matches any number of a's followed by any number of b's, e.g. a, aaab, bbbbb, etc.; in contrast, (ab)* matches any number of ab's, e.g. ab, abababab, etc.
[…] is a character class that describes the options for one single character; [a-z0-9] describes one single character that can be in the range a-z or 0-9.
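The quantifier behavior described in this answer can be sketched in Python's re module (an assumed flavor). The helper name matches is hypothetical and only wraps re.fullmatch for readability.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


# a*b*: each * applies to the single character before it.
flat = [matches(r"a*b*", s) for s in ("a", "aaab", "bbbbb", "abab")]

# (ab)*: the * applies to the whole group, so "ab" repeats as a unit.
grouped = [matches(r"(ab)*", s) for s in ("ab", "abababab", "aaab")]

print(flat)     # [True, True, True, False]
print(grouped)  # [True, True, False]
```

The input abab fails against a*b* because an a may not follow a b there, while aaab fails against (ab)* because it is not a whole number of ab units.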
#3
9
The [] construct in a regex is essentially shorthand for an | between all of its contents. For example, [abc] matches a, b or c. Additionally, the - character has a special meaning inside []: it forms a range. The regex [a-z] will match any one lowercase letter a through z.
The () construct is a grouping construct that establishes precedence (it also affects how matched substrings are accessed, but that's a more advanced topic). The regex (abc) will match the string "abc".
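Both points in this answer, the class as shorthand for alternation and the group as a way to set precedence, can be illustrated with a short sketch in Python's re module (an assumed flavor; the sample strings are made up for illustration).

```python
import re

samples = ["a", "b", "c", "d", "ab"]

# [abc] and a|b|c accept the same single-character inputs.
class_hits = [s for s in samples if re.fullmatch(r"[abc]", s)]
alt_hits = [s for s in samples if re.fullmatch(r"a|b|c", s)]

# Precedence: without a group, | splits the whole pattern in two;
# with a group, only the part inside the parentheses alternates.
candidates = ["ab", "cd", "abd", "acd"]
no_group = [s for s in candidates if re.fullmatch(r"ab|cd", s)]
with_group = [s for s in candidates if re.fullmatch(r"a(b|c)d", s)]

print(class_hits, alt_hits)  # ['a', 'b', 'c'] ['a', 'b', 'c']
print(no_group)              # ['ab', 'cd']
print(with_group)            # ['abd', 'acd']
```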
#4
4
[a-z0-9] will match any single lowercase letter or digit. (a-z0-9) will match the exact string "a-z0-9" and allows two additional things: you can apply quantifiers like *, ? and + to the whole group, and you can reference what the group matched afterwards with $1 or \1 (the syntax depends on the regex flavor). Not useful with your example, though.
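Referencing a group after it has matched might look like the following sketch in Python, which is an assumed flavor. Python's re module writes the reference as \1 both inside the pattern and in the replacement string; it does not accept the $1 form used by some other flavors. The sample strings are invented for illustration.

```python
import re

# \1 inside the pattern: the same character the group just matched, again.
doubled = re.search(r"([a-z0-9])\1", "letter 77")
first_double = doubled.group(0)  # the first doubled character in the input

# \1 and \2 in the replacement: reuse the captured text in a new order.
swapped = re.sub(r"([a-z]+)-([0-9]+)", r"\2-\1", "abc-123")

print(first_double)  # tt
print(swapped)       # 123-abc
```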
#5
0
Try ([a-z0-9]+) to capture a mixed string of lowercase letters and digits for back references (or extraction). Without the +, ([a-z0-9]) captures only a single character.
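One detail worth checking here is that a character class on its own consumes a single character, so a quantifier is needed to capture a whole run. A minimal sketch in Python's re module (an assumed flavor; the id=abc123; input is a made-up example):

```python
import re

text = "id=abc123;"

# A class inside a group with no quantifier captures exactly one character.
one = re.search(r"([a-z0-9])", text).group(1)

# Adding + captures the whole run of lowercase letters and digits.
run = re.search(r"id=([a-z0-9]+)", text).group(1)

print(one)  # i
print(run)  # abc123
```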
#6
-1
[a-z0-9] will match one of abcdefghijklmnopqrstuvwxyz0123456789. In other words, square brackets match exactly one character.
(a-z0-9) does not contain ranges, because - is only special inside square brackets. It matches the literal seven-character string a-z0-9, just as if the parentheses weren't there. The () will allow you to read back exactly which characters were matched. Parentheses are also useful for OR'ing two expressions with the bar | character. For example, ([a-z]|[0-9]) will match one character: any lowercase letter or digit.
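How alternation inside parentheses behaves is easy to verify against a real engine. The sketch below uses Python's re module (an assumed flavor) and compares a group whose alternatives are written without square brackets against one whose alternatives are character classes.

```python
import re

inputs = ["a-z", "0-9", "b", "7"]

# Without square brackets the hyphens are literal, so the alternatives
# are the three-character strings a-z and 0-9.
literal_alt = [s for s in inputs if re.fullmatch(r"(a-z|0-9)", s)]

# With character classes, each alternative matches one character from a range.
class_alt = [s for s in inputs if re.fullmatch(r"([a-z]|[0-9])", s)]

print(literal_alt)  # ['a-z', '0-9']
print(class_alt)    # ['b', '7']
```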