I'm trying to write a regular expression for Java that matches if there is a semicolon that does not have two (or more) leading '-' characters.
I'm only able to get the opposite working: A semicolon that has at least two leading '-' characters.
([\-]{2,}.*?;.*)
But I need something like
([^([\-]{2,})])*?;.*
I'm somehow not able to express 'not at least two - characters'.
Here are some examples I need to evaluate with the expression:
; -- a : should match
-- a ; : should not match
-- ; : should not match
--; : should not match
-;- : should match
---; : should not match
-- semicolon ; : should not match
bla ; bla : should match
bla : should not match (; is mandatory)
-;--; : should match (the first occurring semicolon must not have two or more consecutive leading '-')
5 solutions
#1
2
It seems that this regex matches what you want
String regex = "[^-]*(-[^-]+)*-?;.*";
Explanation: `matches` will accept a string that:

- `[^-]*` can start with non-dash characters
- `(-[^-]+)*-?;` is a bit tricky: before we match `;` we need to make sure that no `-` has another `-` right after it, so:
  - `(-[^-]+)*` each `-` has at least one non-`-` character after it
  - `-?` or the `-` is placed right before `;`
- `;.*` if the earlier conditions were fulfilled we can accept `;` and any characters (`.*`) after it.
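To check the regex against the examples from the question, here is a minimal sketch. The class and method names (`DashSemicolonExamples`, `accepts`) are made up for illustration; only the regex itself comes from this answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DashSemicolonExamples {
    // Regex from this answer. String.matches() requires the whole input to match.
    static final String REGEX = "[^-]*(-[^-]+)*-?;.*";

    // Hypothetical helper name, not part of the original answer.
    static boolean accepts(String input) {
        return input.matches(REGEX);
    }

    public static void main(String[] args) {
        // Inputs and expected results copied from the question's example list.
        Map<String, Boolean> expected = new LinkedHashMap<>();
        expected.put("; -- a", true);
        expected.put("-- a ;", false);
        expected.put("-- ;", false);
        expected.put("--;", false);
        expected.put("-;-", true);
        expected.put("---;", false);
        expected.put("-- semicolon ;", false);
        expected.put("bla ; bla", true);
        expected.put("bla", false);
        expected.put("-;--;", true);

        for (Map.Entry<String, Boolean> e : expected.entrySet()) {
            boolean actual = accepts(e.getKey());
            System.out.printf("%-16s expected=%-5b actual=%b%n",
                    e.getKey(), e.getValue(), actual);
        }
    }
}
```

Each row should show the same value for `expected` and `actual`.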
A more readable version, but probably a little slower:
((?!--)[^;])*;.*
Explanation:
To make sure that there is a `;` in the string we can use `.*;.*` in `matches`. But we need to add some conditions on the characters before the first `;`.
So to make sure that the matched `;` will be the first one, we can write the regex as
[^;]*;.*
which means:

- `[^;]*` zero or more non-semicolon characters
- `;` the first semicolon
- `.*` zero or more of any characters (actually `.` can't match line separators like `\n` or `\r`)
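The remark about line separators matters if the input can span several lines. A small sketch, assuming you would want `.` to cross line breaks in that case (the class and method names are hypothetical; `Pattern.DOTALL` is the standard flag for this):

```java
import java.util.regex.Pattern;

class DotLineSeparatorDemo {
    // Same regex twice; the second one lets '.' match line terminators too.
    static final Pattern DEFAULT = Pattern.compile("[^;]*;.*");
    static final Pattern DOTALL = Pattern.compile("[^;]*;.*", Pattern.DOTALL);

    static boolean matchesDefault(String s) {
        return DEFAULT.matcher(s).matches();
    }

    static boolean matchesDotAll(String s) {
        return DOTALL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Line break after the semicolon: '.*' stops at '\n' by default.
        System.out.println(matchesDefault("a;\nb")); // false
        System.out.println(matchesDotAll("a;\nb"));  // true
        // Line break before the semicolon: the negated class [^;] matches '\n' anyway.
        System.out.println(matchesDefault("a\n;b")); // true
    }
}
```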
So now all we need to do is make sure that the character matched by `[^;]` is not part of `--`. To do so we can use look-around mechanisms, for instance:
- `(?!--)[^;]` before matching `[^;]`, `(?!--)` checks that the next two characters are not `--`; in other words, the character matched by `[^;]` can't be the first `-` in a series of two `--`
- `[^;](?<!--)` checks that after matching `[^;]` the regex engine will not find `--` when it looks back two positions; in other words, `[^;]` can't be the last character in a series of `--`.
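Both look-around fragments can be tried side by side. The look-ahead regex below is the one given above; the full look-behind regex `([^;](?<!--))*;.*` is my assumption, built by dropping the fragment `[^;](?<!--)` into the same template. Class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class LookaroundVariants {
    // Look-ahead form, as given in the answer.
    static final Pattern AHEAD = Pattern.compile("((?!--)[^;])*;.*");
    // Look-behind form: an assumed completion of the fragment [^;](?<!--).
    static final Pattern BEHIND = Pattern.compile("([^;](?<!--))*;.*");

    static boolean ahead(String s) {
        return AHEAD.matcher(s).matches();
    }

    static boolean behind(String s) {
        return BEHIND.matcher(s).matches();
    }

    public static void main(String[] args) {
        // A few inputs taken from the question's example list.
        String[] inputs = {"; -- a", "-- a ;", "--;", "-;-", "---;", "bla ; bla", "bla", "-;--;"};
        for (String s : inputs) {
            System.out.printf("%-12s ahead=%-5b behind=%b%n", s, ahead(s), behind(s));
        }
    }
}
```

On these inputs the two variants should agree with each other and with the expected results from the question.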
#2
0
How about just splitting the string on `--` and, if there are two or more substrings, checking whether the last one contains a semicolon?
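A rough sketch of the splitting idea follows. Note that it is a variation, not exactly what is described above: it looks at the piece before the first `--` rather than the last piece, because the question's example `-;--;` has to match. The class and method names are made up for illustration.

```java
class SplitOnDashes {
    // Returns true if a semicolon appears before the first "--" (or if there is no "--" at all).
    static boolean semicolonBeforeDoubleDash(String input) {
        // Limit 2: parts[0] is everything before the first "--", or the whole string.
        String[] parts = input.split("--", 2);
        return parts[0].contains(";");
    }

    public static void main(String[] args) {
        System.out.println(semicolonBeforeDoubleDash("; -- a")); // true
        System.out.println(semicolonBeforeDoubleDash("-- a ;")); // false
        System.out.println(semicolonBeforeDoubleDash("-;--;"));  // true
        System.out.println(semicolonBeforeDoubleDash("bla"));    // false
    }
}
```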
#3
0
How about using this regex in Java:
[^;]*;(?<!--[^;]{0,999};).*
The only caveat is that it works with at most 999 characters between `--` and `;`.
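The caveat can be seen with a filler of exactly 999 versus 1000 characters. This is only a sketch; the class name and the `filler` helper are hypothetical, the regex is the one from this answer.

```java
class BoundedLookbehindDemo {
    // Regex from this answer; the look-behind is bounded to 999 filler characters.
    static final String REGEX = "[^;]*;(?<!--[^;]{0,999};).*";

    static boolean accepts(String input) {
        return input.matches(REGEX);
    }

    // Hypothetical helper: builds a string of n 'a' characters.
    static String filler(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(accepts("-;--;"));  // true
        System.out.println(accepts("-- a ;")); // false
        // 999 characters between "--" and ";" are still within the look-behind's reach.
        System.out.println(accepts("--" + filler(999) + ";"));  // false
        // With 1000 characters the "--" is out of reach, so the string is accepted.
        System.out.println(accepts("--" + filler(1000) + ";")); // true
    }
}
```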
#4
0
I think this is what you're looking for:
^(?:(?!--).)*;.*$
In other words, match from the start of the string (`^`), zero or more characters (`.*`) followed by a semicolon. But replacing the dot with `(?:(?!--).)` causes it to match any character unless it's the beginning of a two-hyphen sequence (`--`).
If performance is an issue, you can exclude the semicolon as well, so it never has to backtrack:
^(?:(?!--|;).)*;.*$
EDIT: I just noticed your comment that the regex should work with the `matches()` method, so I padded it out with `.*`. The anchors aren't really necessary, but they do no harm.
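To illustrate the point about anchors, here is a small sketch: with `matches()` the anchors make no difference, but with `find()` they do. The class and method names are hypothetical, and the unanchored pattern is simply the regex above with `^` and `$` removed.

```java
import java.util.regex.Pattern;

class TemperedDotDemo {
    static final Pattern ANCHORED = Pattern.compile("^(?:(?!--|;).)*;.*$");
    // Same regex with the anchors removed (for comparison only).
    static final Pattern UNANCHORED = Pattern.compile("(?:(?!--|;).)*;.*");

    // matches() already requires the whole input to match, so no anchors are needed.
    static boolean viaMatches(String s) {
        return UNANCHORED.matcher(s).matches();
    }

    // find() searches anywhere in the input, so here the anchors do the work.
    static boolean viaAnchoredFind(String s) {
        return ANCHORED.matcher(s).find();
    }

    // Without anchors, find() may start after the first '-' and report a match.
    static boolean viaUnanchoredFind(String s) {
        return UNANCHORED.matcher(s).find();
    }

    public static void main(String[] args) {
        String input = "-- a ;";
        System.out.println(viaMatches(input));        // false
        System.out.println(viaAnchoredFind(input));   // false
        System.out.println(viaUnanchoredFind(input)); // true
    }
}
```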
#5
0
You need a negative lookahead!
This regex will match any string which does not contain your original match pattern:
(?!-{2,}.*?;.*).*?;.*
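This regex matches a string that contains a semicolon, unless the string begins with two or more dashes followed later by a semicolon. Because the lookahead is evaluated only at the start of the input, a `--` that first appears later in the string (for example `a -- ;`) is not rejected.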
Example: