As part of a larger regex, I would like to enforce the following restrictions:
- The string has 11 characters
- All characters are digits
- Within the first 10 digits, one digit [0-9] (and one only!) must appear twice
This means the following should match:
12345678914
12235879600
Whereas these should not:
12345678903 -> none of the digits in positions 1 to 10 appears twice
14427823482 -> one digit appears more than twice
72349121762 -> more than one digit appears twice
I have tried to use a lookahead, but all I have managed is a regex that counts one specific digit, e.g.:
(?!.*0\1{2})
That does not do what I need. Is what I want even possible with a regex?
1 answer
#1
You can use this kind of pattern:
\A(?=\d{11}\z)(?:(\d)(?!\d*\1\d))*(\d)(?=\d*\2\d)(?:(\d)(?!\d*\3\d))+\d\z
Pattern details:
The idea is to describe the string as one duplicate digit surrounded by non-duplicate digits.
Finding a duplicate digit is easy with a capture group, a lookahead assertion and a backreference: (\d)(?=\d*\1)
You can use the same pattern to ensure that a digit has no duplicate, this time with a negative lookahead: (\d)(?!\d*\1)
To leave the last digit (digit no. 11) out of the search for duplicates, you only need to add a digit after the backreference: (\d)(?=\d*\1\d)
(This way you ensure there is at least one digit between the backreference and the end of the string.)
Note that in the present context, a "duplicate digit" is a digit that is followed, immediately or later, by the same digit. (E.g. in 1234567891 the first 1 is a duplicate digit, but the last 1 is not, because it is not followed by another 1.)
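To see which positions count as "duplicate digits" under this definition, the building blocks can be tried on their own. This is only a small sketch: the helper `duplicate_positions` and the constant names are made up for the illustration and are not part of the answer.

```ruby
# Hypothetical helper (not from the answer): returns the 1-based positions
# at which the given building-block pattern matches.
def duplicate_positions(str, pattern)
  positions = []
  str.scan(pattern) { positions << Regexp.last_match.begin(0) + 1 }
  positions
end

# A digit followed, immediately or later, by the same digit.
ANYWHERE = /(\d)(?=\d*\1)/
# Same, but the second copy must itself be followed by at least one digit,
# so a copy sitting in the very last position is ignored.
IGNORING_LAST = /(\d)(?=\d*\1\d)/

p duplicate_positions("1234567891", ANYWHERE)        # => [1]
p duplicate_positions("12345678914", IGNORING_LAST)  # => [1]
p duplicate_positions("12345678901", IGNORING_LAST)  # => []
```

In the last call the second 1 is digit no. 11, so the first 1 is no longer reported as a duplicate.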
\A # beginning of the string
(?=\d{11}\z) # check the string length (if not needed, remove it)
(?:(\d)(?!\d*\1\d))* # zero or more non-duplicate digits
(\d)(?=\d*\2\d) # one duplicate digit
(?:(\d)(?!\d*\3\d))+ # one or more non-duplicate digits
\d # the ignored last digit
\z # end of the string
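The commented pattern above can be used almost verbatim in Ruby with the /x (free-spacing) flag. The following is a sketch rather than a drop-in solution: the constant name `ONE_DUPLICATE` is an assumption, and the sample strings are the ones from the question.

```ruby
# The first pattern in free-spacing mode, so the comments can stay inline.
ONE_DUPLICATE = /
  \A
  (?=\d{11}\z)            # check the string length
  (?:(\d)(?!\d*\1\d))*    # zero or more non-duplicate digits
  (\d)(?=\d*\2\d)         # one duplicate digit
  (?:(\d)(?!\d*\3\d))+    # one or more non-duplicate digits
  \d                      # the ignored last digit
  \z
/x

# Strings from the question, mapped to the result the asker expects.
samples = {
  "12345678914" => true,
  "12235879600" => true,
  "12345678903" => false,
  "14427823482" => false,
  "72349121762" => false
}

samples.each do |str, expected|
  puts "#{str}: #{ONE_DUPLICATE.match?(str)} (expected #{expected})"
end
```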
Another way
This time you check for duplicates at the beginning of the pattern with lookaheads: one lookahead to ensure there is a duplicate digit, and one negative lookahead to ensure there are not two duplicate digits:
\A(?=\d*(\d)(?=\d*\1\d))(?!\d*(\d)(?=\d*\2\d)\d*(\d)(?=\d*\3\d))\d{11}\z
Pattern details:
\A
(?= # check that there is at least one duplicate digit
\d*(\d)(?=\d*\1\d)
)
(?! # check that there are not two duplicate digits
\d*(\d)(?=\d*\2\d) # the first
\d*(\d)(?=\d*\3\d) # the second
)
\d{11}
\z
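The lookahead-only variant can be written the same way in Ruby. Again this is only a sketch: the constant name `LOOKAHEAD_ONLY` is an assumption, and the strings are the ones from the question.

```ruby
# The second pattern in free-spacing mode.
LOOKAHEAD_ONLY = /
  \A
  (?=                      # there is at least one duplicate digit
    \d*(\d)(?=\d*\1\d)
  )
  (?!                      # but there are not two of them
    \d*(\d)(?=\d*\2\d)     # the first
    \d*(\d)(?=\d*\3\d)     # the second
  )
  \d{11}
  \z
/x

%w[12345678914 12235879600 12345678903 14427823482 72349121762].each do |str|
  puts "#{str}: #{LOOKAHEAD_ONLY.match?(str)}"
end
```

A digit that appears three times within the first 10 digits is rejected as well, because its first two occurrences both count as duplicate digits.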
Note: the first way seems to be more efficient, however.
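If you want to check this on your own machine, a rough timing sketch could look like the following. The helper `time_pattern`, the constant names and the number of rounds are arbitrary choices made for this illustration. Timings depend on the Ruby version and on the input, so no figures are quoted here.

```ruby
FIRST  = /\A(?=\d{11}\z)(?:(\d)(?!\d*\1\d))*(\d)(?=\d*\2\d)(?:(\d)(?!\d*\3\d))+\d\z/
SECOND = /\A(?=\d*(\d)(?=\d*\1\d))(?!\d*(\d)(?=\d*\2\d)\d*(\d)(?=\d*\3\d))\d{11}\z/

SAMPLES = %w[12345678914 12235879600 12345678903 14427823482 72349121762]

# Hypothetical helper: seconds needed to run `pattern` over SAMPLES `rounds` times.
def time_pattern(pattern, rounds)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  rounds.times { SAMPLES.each { |str| pattern.match?(str) } }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

rounds = 20_000
puts format("first way:  %.4f s", time_pattern(FIRST, rounds))
puts format("second way: %.4f s", time_pattern(SECOND, rounds))
```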
The code way
You can easily check whether your string fits the requirements with array methods:
> mydigs = "12345678913"
=> "12345678913"
> mydigs.chars.take(10).uniq.size == 9
=> true
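The one-liner does not check that the string really consists of 11 digits. A sketch of a small method that adds this check is shown below. The method name `one_duplicate_in_first_ten?` is made up for the example, and `match?` assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper, not part of the original answer.
def one_duplicate_in_first_ten?(str)
  # Reject anything that is not exactly 11 digits.
  return false unless str.match?(/\A\d{11}\z/)

  # Ten digits with exactly nine distinct values: one digit appears twice
  # and every other digit appears once.
  str.chars.first(10).uniq.size == 9
end

%w[12345678914 12235879600 12345678903 14427823482 72349121762].each do |str|
  puts "#{str}: #{one_duplicate_in_first_ten?(str)}"
end
# Prints true for the first two strings and false for the last three.
```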