What is the most elegant way to check whether a String is a number?

Time: 2022-06-14 00:19:29

Is there a better, more elegant (and/or possibly faster) way than

boolean isNumber = false;
try{
   Double.valueOf(myNumber);
   isNumber = true;
} catch (NumberFormatException e) {
}

...?


Edit: Since I can't pick two answers I'm going with the regex one because a) it's elegant and b) saying "Jon Skeet solved the problem" is a tautology because Jon Skeet himself is the solution to all problems.

11 answers

#1


9  

I don't believe there's anything built into Java to do it faster and still reliably, assuming that later on you'll want to actually parse it with Double.valueOf (or similar).

I'd use Double.parseDouble instead of Double.valueOf to avoid creating a Double unnecessarily, and you can also get rid of blatantly silly numbers quicker than the exception will by checking for digits, e/E, - and . beforehand. So, something like:

public boolean isDouble(String value)
{        
    boolean seenDot = false;
    boolean seenExp = false;
    boolean justSeenExp = false;
    boolean seenDigit = false;
    for (int i=0; i < value.length(); i++)
    {
        char c = value.charAt(i);
        if (c >= '0' && c <= '9')
        {
            seenDigit = true;
            // A sign is only valid directly after the exponent marker.
            justSeenExp = false;
            continue;
        }
        if ((c == '-' || c=='+') && (i == 0 || justSeenExp))
        {
            continue;
        }
        if (c == '.' && !seenDot)
        {
            seenDot = true;
            continue;
        }
        justSeenExp = false;
        if ((c == 'e' || c == 'E') && !seenExp)
        {
            seenExp = true;
            justSeenExp = true;
            continue;
        }
        return false;
    }
    if (!seenDigit)
    {
        return false;
    }
    try
    {
        Double.parseDouble(value);
        return true;
    }
    catch (NumberFormatException e)
    {
        return false;
    }
}

Note that despite taking a couple of tries, this still doesn't cover "NaN" or hex values. Whether you want those to pass or not depends on context.
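
To make the "NaN or hex values" point concrete, here is a small sketch of what Double.parseDouble itself accepts. The class name ParseDoubleForms and the helper parses are made up for illustration; the only library call is Double.parseDouble. Whether the extra forms should count as "a number" is still your call.

```java
final class ParseDoubleForms {
    /** Hypothetical helper: true if Double.parseDouble accepts the text. */
    static boolean parses(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Forms the hand-written pre-check above rejects, plus a few clearly invalid ones.
        String[] samples = {"1e5", "NaN", "-Infinity", "0x1.8p1", "1.0d", " 42 ", "1e", "abc", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + parses(s));
        }
    }
}
```

"NaN", "-Infinity", the hex literal "0x1.8p1", the suffixed "1.0d" and the padded " 42 " all parse, so a stricter pre-check is a deliberate narrowing of what parseDouble would allow.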

In my experience regular expressions are slower than the hard-coded check above.

#2


9  

You could use a regex, e.g. something like String.matches("^[\\d\\-\\.]+$"); (if you're not testing for negative numbers or floating point numbers you could simplify it a bit).

Not sure whether that would be faster than the method you outlined though.
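
For reference, a quick sketch of how that character-class pattern behaves. It only checks which characters appear, not their order, so it works as a loose screen rather than a validator. The class and method names (LooseRegexScreen, looksNumeric) are invented for this example.

```java
final class LooseRegexScreen {
    /** Hypothetical helper wrapping the character-class pattern from this answer. */
    static boolean looksNumeric(String s) {
        // Accepts any non-empty mix of ASCII digits, '-' and '.'.
        return s.matches("^[\\d\\-\\.]+$");
    }

    public static void main(String[] args) {
        String[] samples = {"-12.5", "42", "1-2..3", "--", "1e5", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + looksNumeric(s));
        }
    }
}
```

Strings such as "1-2..3" and "--" pass the screen, so a follow-up Double.parseDouble call (with its try/catch) would still be needed if the result has to be trusted.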

Edit: in the light of all this controversy, I decided to run a test and get some data about how fast each of these methods was. Not so much the correctness, but just how quickly they ran.

You can read about my results on my blog. (Hint: Jon Skeet FTW).

#3


8  

See java.text.NumberFormat (javadoc).

NumberFormat nf = NumberFormat.getInstance(Locale.FRENCH);
Number myNumber = nf.parse(myString);
int myInt = myNumber.intValue();
double myDouble = myNumber.doubleValue();
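
One caveat with the snippet above: nf.parse(String) stops at the first character it cannot use, so "3,14abc" would still yield a number. A possible way to turn it into a yes/no check is the parse(String, ParsePosition) overload, which reports how much of the input was consumed. This is only a sketch; LocalizedNumberCheck and isLocalizedNumber are hypothetical names.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

final class LocalizedNumberCheck {
    /** Hypothetical helper: true only if the whole text parses as a number in the locale. */
    static boolean isLocalizedNumber(String text, Locale locale) {
        if (text == null || text.isEmpty()) {
            return false;
        }
        NumberFormat format = NumberFormat.getInstance(locale);
        ParsePosition position = new ParsePosition(0);
        // Returns null on failure instead of throwing ParseException.
        Number parsed = format.parse(text, position);
        // Require that parsing consumed every character of the input.
        return parsed != null && position.getIndex() == text.length();
    }

    public static void main(String[] args) {
        System.out.println(isLocalizedNumber("3,14", Locale.FRENCH));
        System.out.println(isLocalizedNumber("3,14abc", Locale.FRENCH));
    }
}
```

Note that NumberFormat instances are not thread-safe, which is why this sketch creates one per call rather than caching it in a static field.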

#4


5  

The correct regex is actually given in the Double javadocs:

To avoid calling this method on an invalid string and having a NumberFormatException be thrown, the regular expression below can be used to screen the input string:

    final String Digits     = "(\\p{Digit}+)";
    final String HexDigits  = "(\\p{XDigit}+)";
    // an exponent is 'e' or 'E' followed by an optionally 
    // signed decimal integer.
    final String Exp        = "[eE][+-]?"+Digits;
    final String fpRegex    =
        ("[\\x00-\\x20]*"+  // Optional leading "whitespace"
         "[+-]?(" + // Optional sign character
         "NaN|" +           // "NaN" string
         "Infinity|" +      // "Infinity" string

         // A decimal floating-point string representing a finite positive
         // number without a leading sign has at most five basic pieces:
         // Digits . Digits ExponentPart FloatTypeSuffix
         // 
         // Since this method allows integer-only strings as input
         // in addition to strings of floating-point literals, the
         // two sub-patterns below are simplifications of the grammar
         // productions from the Java Language Specification, 2nd 
         // edition, section 3.10.2.

         // Digits ._opt Digits_opt ExponentPart_opt FloatTypeSuffix_opt
         "((("+Digits+"(\\.)?("+Digits+"?)("+Exp+")?)|"+

         // . Digits ExponentPart_opt FloatTypeSuffix_opt
         "(\\.("+Digits+")("+Exp+")?)|"+

         // Hexadecimal strings
         "((" +
         // 0[xX] HexDigits ._opt BinaryExponent FloatTypeSuffix_opt
         "(0[xX]" + HexDigits + "(\\.)?)|" +

         // 0[xX] HexDigits_opt . HexDigits BinaryExponent FloatTypeSuffix_opt
         "(0[xX]" + HexDigits + "?(\\.)" + HexDigits + ")" +

         ")[pP][+-]?" + Digits + "))" +
         "[fFdD]?))" +
         "[\\x00-\\x20]*");// Optional trailing "whitespace"

    if (Pattern.matches(fpRegex, myString))
        Double.valueOf(myString); // Will not throw NumberFormatException
    else {
        // Perform suitable alternative action
    }

This does not allow for localized representations, however:

To interpret localized string representations of a floating-point value, use subclasses of NumberFormat.

#5


3  

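Use Apache Commons Lang. Note that StringUtils has no isDouble(String) method; the closest matches are NumberUtils.isCreatable(String) (NumberUtils.isNumber(String) in older releases) and StringUtils.isNumeric(String), which only accepts digits.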

#6


3  

Building on Mr. Skeet's answer:

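// Returns true if c may appear in a simple decimal floating-point string.
private boolean isValidDoubleChar(char c)
{
    return "0123456789.+-eE".indexOf(c) >= 0;
}

public boolean isDouble(String value)
{
    // Cheap pre-screen: reject anything containing a character a double cannot contain.
    for (int i=0; i < value.length(); i++)
    {
        if (!isValidDoubleChar(value.charAt(i)))
            return false;
    }
    // The pre-screen ignores ordering, so parseDouble makes the final decision.
    try
    {
        Double.parseDouble(value);
        return true;
    }
    catch (NumberFormatException e)
    {
        return false;
    }
}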

#7


2  

I would use Jakarta commons-lang, as always! But I have no idea whether their implementation is fast or not. It doesn't rely on exceptions, which might be a good thing performance-wise...

#8


2  

Most of these answers are somewhat acceptable solutions. All of the regex solutions have the issue of not being correct for all cases you may care about.

If you really want to ensure that the String is a valid number, then I would use your own solution. Don't forget that, I imagine, most of the time the String will be a valid number and won't raise an exception. So most of the time the performance will be identical to that of Double.valueOf().

I guess this really isn't an answer, except that it validates your initial instinct.

Randy

#9


1  

Following Phill's answer, can I suggest another regex?

String.matches("^-?\\d+(\\.\\d+)?$");
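
If this check runs often, one option is to compile the pattern once instead of calling String.matches, which recompiles it on every call. A minimal sketch, assuming the same pattern as above; PlainDecimalMatcher and isPlainDecimal are names invented here.

```java
import java.util.regex.Pattern;

final class PlainDecimalMatcher {
    // Compiled once; Matcher.matches() already anchors to the whole input,
    // so the ^ and $ from the original pattern are not needed.
    private static final Pattern PLAIN_DECIMAL = Pattern.compile("-?\\d+(\\.\\d+)?");

    /** Hypothetical helper: optional minus sign, digits, optional fractional part. */
    static boolean isPlainDecimal(String s) {
        return s != null && PLAIN_DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"-12.5", "42", "1.", ".5", "1e5", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isPlainDecimal(s));
        }
    }
}
```

This pattern is deliberately strict: it rejects "1.", ".5" and exponent forms such as "1e5", all of which Double.parseDouble would accept.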

#10


1  

I prefer looping over the String's char[] representation and using the Character.isDigit() method. If elegance is desired, I think this is the most readable:

package tias;

public class Main {
  private static final String NUMERIC = "123456789";
  private static final String NOT_NUMERIC = "1L5C";

  public static void main(String[] args) {
    System.out.println(isStringNumeric(NUMERIC));
    System.out.println(isStringNumeric(NOT_NUMERIC));
  }

  private static boolean isStringNumeric(String aString) {
    if (aString == null || aString.length() == 0) {
      return false;
    }
    for (char c : aString.toCharArray() ) {
      if (!Character.isDigit(c)) {
        return false;
      }
    }
    return true;
  }

}

#11


-1  

If you want something that's blisteringly fast, and you have a very clear idea of what formats you want to accept, you can build a DFA (deterministic finite-state machine) by hand. This is roughly what a regex engine does under the hood anyway (although java.util.regex is a backtracking, NFA-based matcher rather than a pure DFA), but you avoid the regex compilation step this way, and it may well be faster than a generic regex engine.
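
As a rough illustration of the hand-built state machine idea, here is a sketch for one assumed format: an optional sign, digits with an optional fractional part (or a leading dot followed by digits), and an optional exponent. The grammar, the class name DecimalDfa and the state names are all assumptions made for this example; adapt the transitions to whatever you actually want to accept.

```java
final class DecimalDfa {
    private enum State {
        START, SIGNED, INT, LEADING_DOT, FRACTION, EXP_MARK, EXP_SIGN, EXP_DIGITS, REJECT
    }

    /** Runs the state machine over the text; true if it ends in an accepting state. */
    static boolean accepts(String text) {
        if (text == null) {
            return false;
        }
        State state = State.START;
        for (int i = 0; i < text.length() && state != State.REJECT; i++) {
            state = next(state, text.charAt(i));
        }
        // Accepting states: digits seen in the mantissa and, if present, in the exponent.
        return state == State.INT || state == State.FRACTION || state == State.EXP_DIGITS;
    }

    /** Transition function: one character moves the machine to exactly one next state. */
    private static State next(State state, char c) {
        boolean digit = c >= '0' && c <= '9';
        boolean sign = c == '+' || c == '-';
        boolean exponent = c == 'e' || c == 'E';
        switch (state) {
            case START:
                if (sign) return State.SIGNED;
                if (digit) return State.INT;
                if (c == '.') return State.LEADING_DOT;
                return State.REJECT;
            case SIGNED:
                if (digit) return State.INT;
                if (c == '.') return State.LEADING_DOT;
                return State.REJECT;
            case INT:
                if (digit) return State.INT;
                if (c == '.') return State.FRACTION;
                if (exponent) return State.EXP_MARK;
                return State.REJECT;
            case LEADING_DOT:
                return digit ? State.FRACTION : State.REJECT;
            case FRACTION:
                if (digit) return State.FRACTION;
                if (exponent) return State.EXP_MARK;
                return State.REJECT;
            case EXP_MARK:
                if (sign) return State.EXP_SIGN;
                return digit ? State.EXP_DIGITS : State.REJECT;
            case EXP_SIGN:
            case EXP_DIGITS:
                return digit ? State.EXP_DIGITS : State.REJECT;
            default:
                return State.REJECT;
        }
    }
}
```

With these transitions, DecimalDfa.accepts("-12.5e+3") and accepts("1.") succeed, while ".", "1e" and "1.2.3" are rejected. No speed claim is implied; measure against the regex and try/catch versions before adopting it.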
