Regular expression for floating point numbers

Time: 2021-04-24 11:34:47

I have a task to match floating point numbers. I have written the following regular expression for it:

[-+]?[0-9]*\.?[0-9]*

But it shows an error saying:

Invalid escape sequence (valid ones are  \b  \t  \n  \f  \r  \"  \'  \\ )

But as far as I know, the . needs to be escaped as well. Please correct me where I am wrong.

9 solutions

#1


161  

TL;DR

Use [.] instead of \. and [0-9] instead of \d to avoid escaping issues in some languages (like Java).

One relatively simple pattern for matching a floating point number is

[+-]?([0-9]*[.])?[0-9]+

This will match:

  • 123
  • 123.456
  • .456
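
A minimal Java sketch of how this pattern might be used (the class and method names here are invented for illustration, not taken from the answer). Because the pattern contains no backslashes, it can be pasted into a Java string literal unchanged:

```java
import java.util.regex.Pattern;

final class SimpleFloatPattern {
    // The pattern from the answer; no backslashes, so no Java escaping is needed.
    static final Pattern FLOAT = Pattern.compile("[+-]?([0-9]*[.])?[0-9]+");

    // Returns true when the whole input is matched by the pattern.
    static boolean isFloat(String input) {
        return FLOAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123", "123.456", ".456", "123."}) {
            System.out.println(s + " -> " + isFloat(s));
        }
    }
}
```

Here matches() requires the whole input to fit the pattern, which is why 123. (a period with nothing after it) is rejected.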

If you also want to match 123. (a period with no decimal part), then you'll need a slightly longer expression:

[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)
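
A rough Java check of this longer pattern, again with placeholder class and method names; it should accept a trailing period but still reject a lone period or a lone sign:

```java
import java.util.regex.Pattern;

final class TrailingDotPattern {
    // Either digits with an optional ".digits" tail (tail digits may be absent),
    // or a period followed by at least one digit.
    static final Pattern FLOAT =
            Pattern.compile("[+-]?([0-9]+([.][0-9]*)?|[.][0-9]+)");

    // Returns true when the whole input is matched by the pattern.
    static boolean isFloat(String input) {
        return FLOAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123.", "123.456", ".456", ".", "+"}) {
            System.out.println("\"" + s + "\" -> " + isFloat(s));
        }
    }
}
```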

See pkeller's answer for a fuller explanation of this pattern

If you want to include non-decimal numbers, such as hex and octal, see my answer to How do I identify if a string is a number?.

If you want to validate that an input is a number (rather than finding a number within the input), then you should surround the pattern with ^ and $, like so:

^[+-]?([0-9]*[.])?[0-9]+$

Irregular Regular Expressions

"Regular expressions", as implemented in most modern languages, APIs, frameworks, libraries, etc., are based on a concept developed in formal language theory. However, software engineers have added many extensions that take these implementations far beyond the formal definition. So, while most regular expression engines resemble one another, there is actually no standard. For this reason, a lot depends on what language, API, framework or library you are using.

(Incidentally, to help reduce confusion, many have taken to using "regex" or "regexp" to describe these enhanced matching languages. See Is a Regex the Same as a Regular Expression? at RexEgg.com for more information.)

That said, most regex engines (actually, all of them, as far as I know) would accept \.. Most likely, there's an issue with escaping.

The Trouble with Escaping

(Thanks to the nameless one for originally recognizing this.)

Some languages have built-in support for regexes, such as JavaScript. For those languages that don't, escaping can be a problem.

This is because you are basically coding in a language within a language. Java, for example, uses \ as an escape character within its strings, so if you want to place a literal backslash character within a string, you must escape it:

// creates a single character string: "\"
String x = "\\";

However, regexes also use the \ character for escaping, so if you want to match a literal \ character, you must escape it for the regex engine, and then escape it again for Java:

// Creates a two-character string: "\\"
// When used as a regex pattern, will match a single character: "\"
String regexPattern = "\\\\";

In your case, you have probably not escaped the backslash character in the language you are programming in:

// will most likely result in an "Illegal escape character" error
String wrongPattern = "\.";
// will result in the string "\."
String correctPattern = "\\.";
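
To make the two layers of escaping concrete, here is a small sketch (the class and method names are hypothetical) that inspects the strings Java actually hands to the regex engine:

```java
import java.util.regex.Pattern;

final class EscapeLayers {
    // Java source "\\." is the two-character string: backslash, period.
    static final String DOT = "\\.";
    // Java source "\\\\" is the two-character string: backslash, backslash.
    static final String BACKSLASH = "\\\\";

    // True when the regex matches the entire input.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(DOT.length());               // 2
        System.out.println(fullMatch(DOT, "."));        // true
        System.out.println(fullMatch(DOT, "x"));        // false
        System.out.println(fullMatch(BACKSLASH, "\\")); // true: one literal backslash
    }
}
```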

All this escaping can get very confusing. If the language you are working with supports raw strings, then you should use those to cut down on the number of backslashes, but not all languages do (most notably: Java). Fortunately, there's an alternative that will work some of the time:

String correctPattern = "[.]";

For a regex engine, \. and [.] mean exactly the same thing. Note that this doesn't work in every case, like newline (\\n), open square bracket (\\[) and backslash (\\\\ or [\\]).
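
A quick way to convince yourself of the equivalence in Java (a sketch; the names are illustrative only) is to run both spellings over the same inputs, alongside an unescaped period for contrast:

```java
import java.util.regex.Pattern;

final class DotSpellings {
    static final Pattern ESCAPED = Pattern.compile("1\\.5");   // regex: 1\.5
    static final Pattern BRACKETED = Pattern.compile("1[.]5"); // regex: 1[.]5
    static final Pattern UNESCAPED = Pattern.compile("1.5");   // . matches any character

    // True when the pattern matches the entire input.
    static boolean full(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1.5", "1x5"}) {
            System.out.println(s + ": escaped=" + full(ESCAPED, s)
                    + " bracketed=" + full(BRACKETED, s)
                    + " unescaped=" + full(UNESCAPED, s));
        }
    }
}
```

Only the unescaped version accepts 1x5; the other two behave identically.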

A Note about Matching Numbers

(Hint: It's harder than you think)

Matching a number is one of those things you'd think is quite easy with regex, but it's actually pretty tricky. Let's take a look at your approach, piece by piece:

[-+]?

Match an optional - or +

[0-9]*

Match 0 or more sequential digits

\.?

Match an optional .

[0-9]*

Match 0 or more sequential digits

First, we can clean up this expression a bit by using a character class shorthand for the digits (note that this is also susceptible to the escaping issue mentioned above):

[0-9] = \d

I'm going to use \d below, but keep in mind that it means the same thing as [0-9]. (Well, actually, in some engines \d will match digits from all scripts, so it'll match more than [0-9] will, but that's probably not significant in your case.)
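
In Java specifically, \d is ASCII-only unless the Pattern.UNICODE_CHARACTER_CLASS flag is set. A small sketch of the difference (the class and method names are placeholders):

```java
import java.util.regex.Pattern;

final class DigitClasses {
    // Default behaviour: \d is equivalent to [0-9].
    static final Pattern ASCII_DIGITS = Pattern.compile("\\d+");
    // With the flag, \d also accepts decimal digits from other scripts.
    static final Pattern UNICODE_DIGITS =
            Pattern.compile("\\d+", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean asciiOnly(String s) {
        return ASCII_DIGITS.matcher(s).matches();
    }

    static boolean unicodeAware(String s) {
        return UNICODE_DIGITS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String arabicIndic = "\u0663\u0664"; // ARABIC-INDIC DIGIT THREE and FOUR
        System.out.println(asciiOnly(arabicIndic));    // false
        System.out.println(unicodeAware(arabicIndic)); // true
        System.out.println(asciiOnly("34"));           // true
    }
}
```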

Now, if you look at this carefully, you'll realize that every single part of your pattern is optional. This pattern can match a 0-length string; a string composed only of + or -; or, a string composed only of a .. This is probably not what you've intended.
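
This is easy to confirm with a short Java sketch (the class name is made up, and the backslash is doubled as Java requires):

```java
import java.util.regex.Pattern;

final class AllOptional {
    // The pattern from the question, with the backslash doubled for Java.
    static final Pattern ORIGINAL = Pattern.compile("[-+]?[0-9]*\\.?[0-9]*");

    // True when the whole input is matched by the pattern.
    static boolean accepts(String s) {
        return ORIGINAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        // None of these contains a digit, yet every one is accepted.
        for (String s : new String[] {"", "+", "-", ".", "-."}) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```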

To fix this, it's helpful to start by "anchoring" your regex with the bare-minimum required string, probably a single digit:

\d+

Now we want to add the decimal part, but it doesn't go where you think it might:

\d+\.?\d* /* This isn't quite correct. */

This will still match values like 123.. Worse, it's got a tinge of evil about it. The period is optional, meaning that you've got two repeated classes side-by-side (\d+ and \d*). This can actually be dangerous if used in just the wrong way, opening your system up to DoS attacks.

To fix this, rather than treating the period as optional, we need to treat it as required (to separate the repeated character classes) and instead make the entire decimal portion optional:

\d+(\.\d+)? /* Better. But... */

This is looking better now. We require a period between the first sequence of digits and the second, but there's a fatal flaw: we can't match .123 because a leading digit is now required.
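
A sketch of that intermediate pattern in Java (illustrative names only) shows both the improvement and the flaw:

```java
import java.util.regex.Pattern;

final class RequiredLeadingDigit {
    // Regex: \d+(\.\d+)?  (backslashes doubled for the Java string literal)
    static final Pattern INTERMEDIATE = Pattern.compile("\\d+(\\.\\d+)?");

    // True when the whole input is matched by the pattern.
    static boolean accepts(String s) {
        return INTERMEDIATE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("123"));     // true
        System.out.println(accepts("123.456")); // true
        System.out.println(accepts("123."));    // false: the period now needs digits after it
        System.out.println(accepts(".123"));    // false: a leading digit is required
    }
}
```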

This is actually pretty easy to fix. Instead of making the "decimal" portion of the number optional, we need to look at it as a sequence of characters: 1 or more digits that may be prefixed by a . that may in turn be prefixed by 0 or more digits:

(\d*\.)?\d+

Now we just add the sign:

[+-]?(\d*\.)?\d+

Of course, those backslashes are pretty annoying in Java, so we can substitute in our long-form character classes:

[+-]?([0-9]*[.])?[0-9]+

Matching versus Validating

This has come up in the comments a couple times, so I'm adding an addendum on matching versus validating.

The goal of matching is to find some content within the input (the "needle in a haystack"). The goal of validating is to ensure that the input is in an expected format.

Regexes, by their nature, only match text. Given some input, they will either find some matching text or they will not. However, by "snapping" an expression to the beginning and ending of the input with anchors (^ and $), we can ensure that no match is found unless the entire input matches the expression, effectively using regexes to validate.

The regex described above ([+-]?([0-9]*[.])?[0-9]+) will match one or more numbers within a target string. So given the input:

apple 1.34 pear 7.98 version 1.2.3.4

The regex will match 1.34, 7.98, 1.2, .3 and .4.
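
In Java, this kind of "find every match" search could be sketched as follows (the class and method names are assumptions for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindNumbers {
    static final Pattern FLOAT = Pattern.compile("[+-]?([0-9]*[.])?[0-9]+");

    // Collects every non-overlapping match, scanning left to right.
    static List<String> findAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = FLOAT.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [1.34, 7.98, 1.2, .3, .4]
        System.out.println(findAll("apple 1.34 pear 7.98 version 1.2.3.4"));
    }
}
```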

To validate that a given input is a number and nothing but a number, "snap" the expression to the start and end of the input by wrapping it in anchors:

^[+-]?([0-9]*[.])?[0-9]+$

This will only find a match if the entire input is a floating point number, and will not find a match if the input contains additional characters. So, given the input 1.2, a match will be found, but given apple 1.2 pear no matches will be found.
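
The same two inputs, checked with the anchored pattern in a Java sketch (names are illustrative). Note that it still calls find(); the anchors are what turn the search into a validation:

```java
import java.util.regex.Pattern;

final class ValidateFloat {
    static final Pattern ANCHORED =
            Pattern.compile("^[+-]?([0-9]*[.])?[0-9]+$");

    // find() searches anywhere, but ^ and $ pin the match to the start and end of the input.
    static boolean isValid(String input) {
        return ANCHORED.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid("1.2"));            // true
        System.out.println(isValid("apple 1.2 pear")); // false
    }
}
```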

Note that some regex engines have a validate, isMatch or similar function, which essentially does what I've described automatically, returning true if a match is found and false if no match is found. Also keep in mind that some engines allow you to set flags which change the definition of ^ and $, matching the beginning/end of a line rather than the beginning/end of the entire input. This is typically not the default, but be on the lookout for these flags.
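
In Java, for instance, Matcher.matches() plays the role of such a whole-input check, and Pattern.MULTILINE is the flag that redefines ^ and $. A sketch of the flag's effect (class and method names are illustrative):

```java
import java.util.regex.Pattern;

final class AnchorFlags {
    static final String ANCHORED = "^[+-]?([0-9]*[.])?[0-9]+$";
    // Default: ^ matches only at the very start of the input.
    static final Pattern DEFAULT = Pattern.compile(ANCHORED);
    // MULTILINE: ^ and $ also match at the start and end of each line.
    static final Pattern PER_LINE = Pattern.compile(ANCHORED, Pattern.MULTILINE);

    static boolean foundByDefault(String s) {
        return DEFAULT.matcher(s).find();
    }

    static boolean foundPerLine(String s) {
        return PER_LINE.matcher(s).find();
    }

    public static void main(String[] args) {
        String input = "apple\n1.2\npear";
        System.out.println(foundByDefault(input)); // false
        System.out.println(foundPerLine(input));   // true: the middle line is a number
    }
}
```

With the flag set, the pattern no longer validates the whole input, only some line of it.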

#2


13  

I don't think that any of the answers on this page at the time of writing are correct (many other suggestions elsewhere on SO are wrong too). The complication is that you have to match all of the following possibilities:

  • No decimal point (i.e. an integer value)

  • Digits both before and after the decimal point (e.g. 0.35 , 22.165)

  • Digits before the decimal point only (e.g. 0. , 1234.)

  • Digits after the decimal point only (e.g. .0 , .5678)

At the same time, you must ensure that there is at least one digit somewhere, i.e. the following are not allowed:

  • a decimal point on its own

  • a signed decimal point with no digits (i.e. +. or -.)

  • + or - on their own

  • an empty string

This seems tricky at first, but one way of finding inspiration is to look at the OpenJDK source for the java.lang.Double.valueOf(String) method (start at http://hg.openjdk.java.net/jdk8/jdk8/jdk, click "browse", navigate down /src/share/classes/java/lang/ and find the Double class). The long regex that this class contains caters for various possibilities that the OP probably didn't have in mind, but ignoring for simplicity the parts of it that deal with NaN, infinity, Hexadecimal notation and exponents, and using \d rather than the POSIX notation for a single digit, I can reduce the important parts of the regex for a signed floating point number with no exponent to:

[+-]?((\d+\.?\d*)|(\.\d+))

I don't think that there is a way of avoiding the (...)|(...) construction without allowing something that contains no digits, or forbidding one of the possibilities that has no digits before the decimal point or no digits after it.
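
As a sanity check, the reduced pattern can be run against the allowed and disallowed cases listed earlier. This is only a sketch; the class and method names are invented and the backslashes are doubled for Java:

```java
import java.util.regex.Pattern;

final class SignedDecimal {
    // Regex: [+-]?((\d+\.?\d*)|(\.\d+))
    static final Pattern NUMBER =
            Pattern.compile("[+-]?((\\d+\\.?\\d*)|(\\.\\d+))");

    // True when the whole input is matched by the pattern.
    static boolean accepts(String s) {
        return NUMBER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] allowed = {"7", "0.35", "22.165", "0.", "1234.", ".0", ".5678"};
        String[] rejected = {".", "+.", "-.", "+", "-", ""};
        for (String s : allowed) {
            System.out.println("\"" + s + "\" -> " + accepts(s)); // each prints true
        }
        for (String s : rejected) {
            System.out.println("\"" + s + "\" -> " + accepts(s)); // each prints false
        }
    }
}
```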

Obviously in practice you will need to cater for trailing or preceding whitespace, either in the regex itself or in the code that uses it.
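
Both options might look roughly like this in Java (a sketch with made-up names; which one fits depends on where the input comes from):

```java
import java.util.regex.Pattern;

final class PaddedFloat {
    static final String CORE = "[+-]?((\\d+\\.?\\d*)|(\\.\\d+))";
    // Option 1: let the regex absorb surrounding whitespace.
    static final Pattern PADDED = Pattern.compile("\\s*" + CORE + "\\s*");
    // Option 2: strip whitespace in code, then match the bare pattern.
    static final Pattern BARE = Pattern.compile(CORE);

    static boolean viaRegex(String s) {
        return PADDED.matcher(s).matches();
    }

    static boolean viaTrim(String s) {
        return BARE.matcher(s.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(viaRegex(" 1.5 ")); // true
        System.out.println(viaTrim(" 1.5 "));  // true
        System.out.println(viaRegex("1 .5"));  // false: whitespace inside the number
        System.out.println(viaTrim("1 .5"));   // false
    }
}
```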

#3


7  

What you need is:

[\-\+]?[0-9]*(\.[0-9]+)?

I escaped the "+" and "-" sign and also grouped the decimal with its following digits since something like "1." is not a valid number.

The changes will allow you to match integers and floats. For example:

0
+1
-2.0
2.23442

#4


2  

This is simple: you have used Java and you ought to use \\. instead of \. (search for character escaping in Java).

#5


2  

This one worked for me:

(?P<value>[-+]*\d+\.\d+|[-+]*\d+)

You can also use this one (without named parameter):

([-+]*\d+\.\d+|[-+]*\d+)
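
The (?P<value>...) spelling is the Python-style named group. If you are on Java, as the question appears to be, named groups are written (?<value>...) instead. A sketch under that assumption (the class and method names are made up):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedGroupFloat {
    // Java spelling of the named group; backslashes doubled for the string literal.
    static final Pattern NUMBER =
            Pattern.compile("(?<value>[-+]*\\d+\\.\\d+|[-+]*\\d+)");

    // Returns the first number found in the text, or null if there is none.
    static String firstNumber(String text) {
        Matcher m = NUMBER.matcher(text);
        return m.find() ? m.group("value") : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("temp is -12.5 today")); // -12.5
        System.out.println(firstNumber("count: 42"));           // 42
        System.out.println(firstNumber("x --3"));               // --3
    }
}
```

Note that [-+]* permits any number of signs, so --3 is matched whole, and that this pattern requires digits on both sides of the period.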

Use some online regex tester to test it (e.g. regex101 )

#6


0  

[+-]?(([1-9][0-9]*)|(0))([.,][0-9]+)?

[+-]? - optional leading sign

(([1-9][0-9]*)|(0)) - integer without leading zero, including single zero

([.,][0-9]+)? - optional fractional part
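
Put together and tried in Java (a sketch with placeholder names), the three parts behave like this; note the pattern accepts a comma as well as a period, and rejects leading zeros:

```java
import java.util.regex.Pattern;

final class NoLeadingZero {
    static final Pattern NUMBER =
            Pattern.compile("[+-]?(([1-9][0-9]*)|(0))([.,][0-9]+)?");

    // True when the whole input is matched by the pattern.
    static boolean accepts(String s) {
        return NUMBER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("0.5"));  // true
        System.out.println(accepts("1,25")); // true: comma as decimal separator
        System.out.println(accepts("007"));  // false: leading zeros
        System.out.println(accepts(".5"));   // false: an integer part is required
    }
}
```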

#7


0  

^[+]?([0-9]{1,2})*[.,]([0-9]{1,1})?$

This will match:

  1. 1.2
  2. 12.3
  3. 1,2
  4. 12,3

#8


-1  

[+/-] [0-9]*.[0-9]+

Try this solution.

#9


-1  

For JavaScript:

const test = new RegExp('^[+]?([0-9]{0,})*[.]?([0-9]{0,2})?$','g');

Which would work for 1.231234.2200.1212

You can change the values in the {} quantifiers to allow a different number of digits before and after the decimal point. This is used on input fields to check the value on every keystroke and allow only what passes.
