I am trying to take a block of numbers that may or may not have dividers and return it in a standard format. Using an SSN as an example:
ex1="An example 123-45-6789"
ex2="123.45.6789 some more things"
ex3="123456789 thank you Ruby may I have another"
All of these should go into a method that returns "123-45-6789". Basically, anything (INCLUDING nothing) except a number or letter between the digit groups should still yield an SSN in XXX-XX-XXXX format. The part that is stumping me is how to use a regular expression to identify that there can be nothing.
What I have so far for IDENTIFYING my SSN:
def format_ssns(string)
  string.scan(/\d{3}[^0-9a-zA-Z]{1}\d{2}[^0-9a-zA-Z]{1}\d{4}/).to_a
end
It seems to work for everything I expect EXCEPT when there is no separator: "123456789" does not match. Can I use a regular expression in this case to identify the absence of a character?
4 solutions
#1
5
Have you tried to match 0 or 1 characters between your numbers?
\d{3}[^0-9a-zA-Z]{0,1}\d{2}[^0-9a-zA-Z]{0,1}\d{4}
#2
31
This has already been shared in a comment, but just to provide a complete-ish answer...
You have these tools at your disposal:
- x matches x exactly once
- x{a,b} matches x between a and b times
- x{a,} matches x at least a times
- x{,b} matches x up to (a maximum of) b times
- x* matches x zero or more times (same as x{0,})
- x+ matches x one or more times (same as x{1,})
- x? matches x zero or one time (same as x{0,1})
So you want to use that last one, since it's exactly what you're looking for (zero or one time).
/\d{3}[^0-9a-zA-Z]?\d{2}[^0-9a-zA-Z]?\d{4}/
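As a quick sanity check, here is a minimal sketch that runs that pattern against the three sample strings from the question. The variable names are only for illustration.

```ruby
# The pattern from the answer above: each separator is optional.
ssn_pattern = /\d{3}[^0-9a-zA-Z]?\d{2}[^0-9a-zA-Z]?\d{4}/

examples = [
  "An example 123-45-6789",
  "123.45.6789 some more things",
  "123456789 thank you Ruby may I have another"
]

# String#scan returns every non-overlapping match. The pattern has no
# capture groups, so each element is the full matched text.
matches = examples.map { |text| text.scan(ssn_pattern) }
matches.each { |found| p found }
```

Note that this only finds the SSN as it was written; it does not yet rewrite it into XXX-XX-XXXX.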
#3
2
Your current regex will allow 123-45[6789, not to mention all kinds of Unicode characters and control characters. In the extreme case:
123
45師6789
is considered a match by your regex.
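To see this concretely, here is a small sketch using the pattern from the question. It assumes the two-line example above uses a line break as its first separator.

```ruby
# The question's original pattern: exactly one non-alphanumeric
# character is required between the digit groups.
loose = /\d{3}[^0-9a-zA-Z]{1}\d{2}[^0-9a-zA-Z]{1}\d{4}/

# "[" is not a digit or letter, so it is accepted as a separator.
bracket = "123-45[6789".match?(loose)

# A negated character class also matches a newline and a CJK character.
multiline = "123\n45師6789".match?(loose)

p bracket
p multiline
```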
You can use a backreference to make sure the separator is the same.
/\d{3}([.-]?)\d{2}\1\d{4}/
[.-]? will match either ., -, or nothing (due to the optional ? quantifier). Whatever matched here will be used to make sure that the second separator is the same via the backreference.
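One thing to watch for if you plug this into String#scan: when a pattern contains capture groups, scan returns the captured groups rather than the whole match. The sketch below shows one possible workaround, wrapping the whole pattern in an outer group so the separator becomes group 2 and the backreference becomes \2. That renumbering is my own adjustment, not part of the pattern above.

```ruby
# With the pattern as written, scan returns only the captured separator.
captures_only = "123-45-6789".scan(/\d{3}([.-]?)\d{2}\1\d{4}/)

# Outer group 1 captures the whole SSN; group 2 captures the separator,
# so the backreference is \2 here instead of \1.
pattern = /(\d{3}([.-]?)\d{2}\2\d{4})/

# Each scan result is [full_match, separator]; keep the full match.
consistent   = "123-45-6789 and 123456789".scan(pattern).map(&:first)
inconsistent = "123-45.6789".scan(pattern).map(&:first)

p captures_only
p consistent
p inconsistent
```

The mixed-separator input produces no match because the second separator has to repeat whatever the first one matched, including the empty string.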
#4
0
Whelp... looks like I just found my own answer, but any clues for improvement would be helpful.
def format_ssns(string)
  # Both separators are optional ({0,1}) so that "123456789" also matches.
  string.scan(/\d{3}[^0-9a-zA-Z]{0,1}\d{2}[^0-9a-zA-Z]{0,1}\d{4}/).to_a
end
Seems to do the trick.
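As for improvement: the method above identifies the SSNs but returns them as written, not in XXX-XX-XXXX form. One possible way to get the standard format is to capture the three digit blocks and join them yourself. This is only a sketch; the method name normalize_ssns is made up for illustration.

```ruby
# Hypothetical helper: finds SSN-like runs and returns them as XXX-XX-XXXX.
def normalize_ssns(string)
  # The three capture groups hold the digit blocks; the optional
  # separators between them are matched but not captured.
  string.scan(/(\d{3})[^0-9a-zA-Z]?(\d{2})[^0-9a-zA-Z]?(\d{4})/)
        .map { |parts| parts.join("-") }
end

p normalize_ssns("An example 123-45-6789")
p normalize_ssns("123.45.6789 some more things")
p normalize_ssns("123456789 thank you Ruby may I have another")
```

Because scan returns the capture groups when a pattern has any, each result is an array of the three digit blocks, and join("-") puts the dashes back in.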