I'm having trouble finding the right regex for decimal numbers that include a comma as the thousands separator.
I found a few other questions about this issue in general, but none of the answers really worked when I tested them.
The best I got so far is:
[0-9]{1,3}(,([0-9]{3}))*(.[0-9]+)?
Two main problems so far:
1) It matches numbers with a space between them, such as "3001 1", as a single match instead of splitting them into two matches, "3001" and "1". I don't see where I allowed a space in the regex.
2) I have a general problem with the beginning and end of the regex.
The regex should match:
3,001
1
32,012,111.2131
But not:
32,012,11.2131
1132,012,111.2131
32,0112,111.2131
32131
In addition, I'd like it to match:
1. (without any number after it)
1, (without any number after it)
as 1 (a comma or point at the end of the number should be ignored).
Many thanks!
3 Answers
#1
3
This is a very long and convoluted regular expression that fits all your requirements. It will work if your regex engine is based on PCRE (hopefully you're using PHP, Delphi or R).
(?<=[^\d,.]|^)\d{1,3}(,(\d{3}))*((?=[,.](\s|$))|(\.\d+)?(?=[^\d,.]|$))
Demo on RegExr
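For readers without a PCRE engine at hand, here is a minimal sketch of how the expression above could be checked in Python. It is an adaptation, not the answer's exact pattern: Python's re module rejects the variable-width lookbehind (?<=[^\d,.]|^), so the sketch assumes the equivalent negative lookbehind (?<![\d,.]) and uses non-capturing groups. The names NUMBER and find_numbers are made up for illustration.

```python
import re

# Adapted from the answer's PCRE pattern (an assumption, see above):
# (?<![\d,.]) stands in for (?<=[^\d,.]|^).
NUMBER = re.compile(
    r"(?<![\d,.])"                 # not preceded by a digit, comma or dot
    r"\d{1,3}(?:,\d{3})*"          # 1-3 digits, then groups of ",ddd"
    r"(?:(?=[,.](?:\s|$))"         # either: a trailing , or . that stays unmatched
    r"|(?:\.\d+)?(?=[^\d,.]|$))",  # or: optional decimals, then a boundary
    re.ASCII,
)


def find_numbers(text):
    """Return every substring of text that the pattern accepts."""
    return [m.group(0) for m in NUMBER.finditer(text)]


# Inputs the question wants matched.
for sample in ["3,001", "1", "32,012,111.2131", "1.", "1,", "3,001 1"]:
    print(repr(sample), "->", find_numbers(sample))

# Inputs the question wants rejected; each should give an empty list.
for sample in ["32,012,11.2131", "1132,012,111.2131", "32,0112,111.2131", "32131"]:
    print(repr(sample), "->", find_numbers(sample))
```

With this adaptation, "1." and "1," both yield the single match 1, and "3,001 1" yields two separate matches.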
The things that make it so long:
- Matching multiple numbers on the same line separated by only 1 character (a space) whilst not allowing partial matches requires a lookahead and a lookbehind.
- Matching numbers ending with . and , without including the . or , in the match requires another lookahead.
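The effect of the two points above can be sketched in Python. This is a simplified illustration under stated assumptions: CORE, bare, guarded and matches are hypothetical names, and the guards are plain negative lookarounds rather than the answer's exact ones.

```python
import re

CORE = r"\d{1,3}(?:,\d{3})*(?:\.\d+)?"

# No lookarounds: the engine is free to match pieces of a longer number.
bare = re.compile(CORE, re.ASCII)

# Simplified guards (an assumption, not the answer's exact lookarounds):
# no digit, comma or dot directly before or after the match.
guarded = re.compile(r"(?<![\d,.])" + CORE + r"(?![\d,.])", re.ASCII)


def matches(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


print(matches(bare, "32131"))       # ['321', '31']
print(matches(guarded, "32131"))    # []
print(matches(guarded, "3,001 1"))  # ['3,001', '1']
print(matches(guarded, "1,"))       # []
```

The last line shows why simple guards are not enough: they reject "1," outright, whereas the question wants it matched as 1. That case is what the extra lookahead is for.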
Explanation of (?=[,.](\s|$))
When writing this explanation I realised the \s needs to be a (\s|$) to match "1," at the very end of a string.
This part of the regex is for matching the 1 in "1," or the 1,000 in "1,000." So let's say our number is "1,000." (with the . on the end).
Up to this point the regex has matched 1,000. It then can't find another , to repeat the thousands group, so it moves on to our (?=[,.](\s|$)).
(?=....) means it's a lookahead: from the point we have matched up to, look at what's coming, but don't add it to the match.
So it checks whether there is a , or a . and, if there is, it checks that it's immediately followed by whitespace or the end of input. In this case it is, so it leaves the match as 1,000.
Had the lookahead not matched, it would have moved on to trying to match decimal places.
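To make the "look but don't add it to the match" behaviour concrete, here is a small Python sketch that compares a pattern consuming the trailing character with one that only peeks at it. The names INTEGER_PART, TRAILING, consuming and peeking are illustrative and not part of the answer.

```python
import re

INTEGER_PART = r"\d{1,3}(?:,\d{3})*"
TRAILING = r"[,.](?:\s|$)"  # a , or . followed by whitespace or end of input

# Same check twice: once consumed by the match, once inside a lookahead.
consuming = re.compile(INTEGER_PART + TRAILING, re.ASCII)
peeking = re.compile(INTEGER_PART + "(?=" + TRAILING + ")", re.ASCII)

print(consuming.search("1,000.").group(0))  # 1,000.
print(peeking.search("1,000.").group(0))    # 1,000
print(peeking.search("1,").group(0))        # 1
print(peeking.search("1.5"))                # None
```

In the last case the lookahead fails because the . is followed by a digit. In the full expression that is the point where the engine falls through to the decimal-places branch.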
#2
1
This works for all the ones that you have listed
^[0-9]{1,3}(,[0-9]{3})*(([\\.,]{1}[0-9]*)|())$
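A quick way to check this pattern against the question's lists is sketched below in Python. Two caveats are assumptions on my part: the pattern is pasted into a raw string, where [\\.,] also accepts a backslash (the doubled backslash may have been a string escape in the answerer's language), and ANCHORED and accepts are made-up names.

```python
import re

# Answer #2's pattern, copied as written. In a Python raw string the
# class [\\.,] accepts a backslash as well as a dot or a comma.
ANCHORED = re.compile(r"^[0-9]{1,3}(,[0-9]{3})*(([\\.,]{1}[0-9]*)|())$")


def accepts(text):
    """True if the whole string passes the anchored pattern."""
    return ANCHORED.search(text) is not None


should_match = ["3,001", "1", "32,012,111.2131", "1.", "1,"]
should_not_match = ["32,012,11.2131", "1132,012,111.2131",
                    "32,0112,111.2131", "32131"]

print([accepts(s) for s in should_match])      # all True
print([accepts(s) for s in should_not_match])  # all False

print(ANCHORED.search("1,").group(0))  # 1,  (the comma is part of the match)
print(accepts("3,001 1"))              # False: anchored to the whole string
print(accepts("32,012,11"))            # True, although the last group has 2 digits
```

So the pattern handles the listed strings when each one is tested on its own. It does not drop the trailing comma or point, cannot find several numbers in one line, and accepts some inputs outside the list that look malformed, such as 32,012,11.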
#3
0
. means "any character". To use a literal . escape it like this: \.
As far as I know, that's the only thing missing.
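The effect of the unescaped dot can be seen with a short Python sketch that runs the question's pattern as posted, then again with the dot escaped. The variable and function names are illustrative only.

```python
import re

# The question's pattern as posted, then the same pattern with the dot escaped.
unescaped = re.compile(r"[0-9]{1,3}(,([0-9]{3}))*(.[0-9]+)?")
escaped = re.compile(r"[0-9]{1,3}(,([0-9]{3}))*(\.[0-9]+)?")


def matches(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


print(matches(unescaped, "3001 1"))  # ['300', '1 1']
print(matches(escaped, "3001 1"))    # ['300', '1', '1']
```

With the unescaped dot, the space is swallowed as "any character", which explains the first problem in the question. Escaping it removes the space from the match, but 3001 is still split into 300 and 1, so the boundary issue from the second problem needs separate handling.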