I am new to regex and am studying it at regularexperssion.com. I need to know what the colon (:) is used for in regular expressions.
For example:
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';
which matches
$url1 = "http://www.somewebsite.com";
$url2 = "https://www.somewebsite.com";
$url3 = "https://somewebsite.com";
$url4 = "www.somewebsite.com";
$url5 = "somewebsite.com";
Any help would be greatly appreciated. :)
4 solutions
#1
29
A colon : is simply a colon. It means nothing special, except in a few constructs such as clustering without capturing (also known as a non-capturing group):
(?:pattern)
It can also appear in POSIX character classes, for example:
[[:upper:]]
However, in your case the colon is just a colon.
Special characters used in your regex:
In the character class [-+_~.\d\w]:
- - means a literal hyphen
- + means a literal +
- _ means a literal _
- ~ means a literal ~
- . means a literal .
- \d means any digit
- \w means any word character
These symbols have these meanings because they are used inside a character class []. Outside a character class, + and . have special meanings.
Other elements:
- =? means an = that can occur 0 or 1 times; in other words, an optional =.
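To see the difference between the inside and the outside of a character class, here is a minimal sketch. It uses Python's re module rather than PHP's preg functions (an assumption made only so the snippet is easy to run); the pattern syntax shown behaves the same way in both. The variable names are made up for illustration.

```python
import re

# Inside a character class, + and . are ordinary characters.
inside = re.findall(r"[+.]", "a+b.c")

# Outside a class, + means "one or more" and . means "any character".
plus_outside = re.fullmatch(r"a+", "aaa") is not None
dot_outside = re.fullmatch(r"a.c", "abc") is not None

# =? is an optional literal "=": "k=" and "k" match, "k==" does not.
optional_eq = [bool(re.fullmatch(r"k=?", s)) for s in ("k=", "k", "k==")]

print(inside, plus_outside, dot_outside, optional_eq)
```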
#2
16
I've decided to go you one better and explain the entire regex:
^                       # anchor to start of line
(                       # start grouping
  (                     # start grouping
    [\w]+               # at least one of 0-9a-zA-Z_
    :                   # a literal colon
  )                     # end grouping
  ?                     # this grouping is optional
  \/\/                  # two literal slashes
)                       # end grouping
?                       # this grouping is optional
(
  (
    [\d\w]              # exactly one of 0-9a-zA-Z_
                        # having \d is redundant
    |                   # alternation
    %                   # literal % sign
    [a-fA-f\d]{2,2}     # exactly 2 hexadecimal digits
                        # should probably be A-F
                        # using {2} would have sufficed
  )+                    # at least one of these groups
  (                     # start grouping
    :                   # literal colon
    (
      [\d\w]
      |
      %
      [a-fA-f\d]{2,2}
    )+
  )?                    # same grouping, but it is optional
                        # and there can be only one
  @                     # literal @ sign
)?                      # this group is optional
(
  [\d\w]                # same as [\w], explained above
  [-\d\w]{0,253}        # includes a dash as a valid character
                        # between 0 and 253 of these characters
  [\d\w]                # end with \w. They want at most 255
                        # total and - cannot be at the start
                        # or end
  \.                    # literal period
)+                      # at least one of these groups
[\w]{2,4}               # two to four \w characters
(
  :                     # literal colon
  [\d]+                 # at least one digit
)?
(
  \/                    # literal slash
  (
    [-+_~.\d\w]         # one of these characters
    |                   # *or*
    %                   # % with two hex digit combo
    [a-fA-f\d]{2,2}
  )*                    # zero or more of these groups
)*                      # zero or more of these groups
(
  \?                    # literal question mark
  (
    &?                  # optional literal &
    (
      [-+_~.\d\w]
      |
      %
      [a-fA-f\d]{2,2}
    )
    =?                  # optional literal =
  )*                    # zero or more of this group
)?                      # this group is optional
(
  #                     # literal #
  (
    [-+_~.\d\w]
    |
    %
    [a-fA-f\d]{2,2}
  )*
)?
$                       # anchor to end of line
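As a quick check of the walkthrough, the pattern from the question can be run against the five sample URLs. This is only a sketch in Python's re module (an assumption; the question uses PHP): the PHP / delimiters are dropped, the pattern is split across several string literals purely for readability, and re.ASCII is passed so that \w and \d stay limited to ASCII as in PHP's default mode. PATTERN, urls, results, and scheme are illustrative names.

```python
import re

# The pattern from the question, without the PHP '/' delimiters.
PATTERN = re.compile(
    r"^(([\w]+:)?\/\/)?"
    r"(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?"
    r"([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?"
    r"(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*"
    r"(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?"
    r"(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$",
    re.ASCII,
)

urls = [
    "http://www.somewebsite.com",
    "https://www.somewebsite.com",
    "https://somewebsite.com",
    "www.somewebsite.com",
    "somewebsite.com",
]

# True for every sample URL from the question.
results = [PATTERN.match(u) is not None for u in urls]

# Group 2 is ([\w]+:), so the colon here is part of the matched text.
scheme = PATTERN.match(urls[0]).group(2)

print(results, scheme)
```

The colon in "http:" is captured like any other character, which is the point of the thread: in this pattern it is a plain literal.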
It's important to understand what the metacharacters/sequences are. Some sequences are not meta when used in certain contexts (especially a character class). I've cataloged them for you:
meta with no context
- ^ -- zero-width start of line
- () -- grouping/capture
- ? -- zero or one of the preceding sequence
- + -- one or more of the preceding sequence
- * -- zero or more of the preceding sequence
- [] -- character class
- \w -- alphanumeric characters and _. Opposite of \W
- | -- alternation
- {} -- repetition count for the preceding sequence
- $ -- zero-width end of line
This excludes :, @, and % from having any special/meta meaning in the raw context.
meta inside character class
] ends the character class. - creates a range of characters unless it is at the start or the end of the character class.
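The position rule for - can be checked directly. A minimal sketch, again in Python's re module as an assumption (character classes follow the same rule in PHP); the sample text and variable names are made up:

```python
import re

text = "a-c b"

# In the middle of a class, - builds a range: [a-c] is a, b, or c.
as_range = re.findall(r"[a-c]", text)   # ['a', 'c', 'b']

# At the start or end of a class, - is a literal hyphen.
at_start = re.findall(r"[-ac]", text)   # ['a', '-', 'c']
at_end = re.findall(r"[ac-]", text)     # ['a', '-', 'c']

print(as_range, at_start, at_end)
```

This is why the question's class [-+_~.\d\w] can start with an unescaped hyphen.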
grouping assertions
A (? combination starts a grouping assertion. For example, (?: means group but do not capture. This means that the regex /(?:a)/ will match the string "a", but a is not captured for use in replacement or match groups as it would be with /(a)/.
(? can also introduce lookahead/lookbehind assertions with ?=, ?!, ?<=, ?<!. A (? followed by anything else is not a literal ?: depending on the engine, it either selects another extension (for example inline flags or named groups) or is a syntax error.
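A short sketch of the lookaround forms, using Python's re module as an assumption (the four assertion forms are written the same way in PHP). The sample string and variable names are made up. The last part shows what this particular engine does with an unrecognized sequence after (?; other engines may word the error differently.

```python
import re

text = "http://host:8080"

# (?=:) is a lookahead: match word characters only if a colon follows.
before_colon = re.findall(r"\w+(?=:)", text)   # ['http', 'host']

# (?<=:) is a lookbehind: match digits only if a colon precedes them.
after_colon = re.findall(r"(?<=:)\d+", text)   # ['8080']

# An unrecognized sequence after (? is rejected at compile time here.
try:
    re.compile(r"(?@a)")
    unknown_is_error = False
except re.error:
    unknown_is_error = True

print(before_colon, after_colon, unknown_is_error)
```

Neither assertion consumes the colon itself, which is why it never appears in the results.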
#3
5
There is no special use for the colon in your case:
(([\w]+:)?\/\/)? will match http://, https://, ftp://, ...
There is one special use for the colon: a group starting with (?: is non-capturing and won't appear in the results.
Example, with "foobarbaz" as input:
- /foo((bar)(baz))/ => { [1] => 'barbaz', [2] => 'bar', [3] => 'baz' }
- /foo(?:(bar)(baz))/ => { [1] => 'bar', [2] => 'baz' }
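The two results above can be reproduced with a few lines. This sketch uses Python's re module (an assumption; the answer's output looks like PHP arrays), where groups() returns the numbered groups starting from group 1:

```python
import re

# Every pair of plain parentheses creates a numbered capture group.
capturing = re.search(r"foo((bar)(baz))", "foobarbaz").groups()

# (?: ... ) still groups, but does not get a number of its own.
non_capturing = re.search(r"foo(?:(bar)(baz))", "foobarbaz").groups()

print(capturing)      # ('barbaz', 'bar', 'baz')
print(non_capturing)  # ('bar', 'baz')
```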
#4
0
A colon has no special meaning in regular expressions; it just matches a literal colon.
[\w]+:
This just means any word character, one or more times, followed by a literal colon. The square brackets are not actually needed here. Square brackets define a set of characters to match, so
[abcd]
means a single character that is one of a, b, c, or d.
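Both points are easy to confirm. A minimal sketch in Python's re module (an assumption; the same holds for PHP), with made-up sample strings and variable names:

```python
import re

text = "https://somewebsite.com"

# [\w]+: and \w+: are the same pattern; the brackets add nothing here.
with_brackets = re.match(r"[\w]+:", text).group()
without_brackets = re.match(r"\w+:", text).group()

# [abcd] matches exactly one character out of a, b, c, d.
singles = re.findall(r"[abcd]", "abacus")

print(with_brackets, without_brackets, singles)
```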