Extract digits from a string to create a numeric string

Time: 2022-09-13 00:23:28

I have been given some poorly formatted data and need to pull numbers out of strings. I'm not sure what the best way to do this is. The numbers can be any length.

string a = "557222]]>";
string b = "5100870<br>";

Any idea how I can get this:

a = "557222"
b = "5100870"

Thanks

Sorry, the solution needs to be in C#. I've edited the question to add that tag.

8 solutions

#1


8  

I'm not familiar enough with .NET to give exact code. Nonetheless, two approaches would be:

  • Convert it to an integer. If the non-digit characters are only at the end (e.g. 21389abc), this is the easiest option in languages whose conversion tolerates trailing characters. C#'s int.Parse does not: it throws a FormatException on such input.
  • If you have intermixed non-digit characters (e.g. 1231a23v) and want to keep every digit, use the regex [^\d] to replace non-digit characters with an empty string.
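
The second approach can be sketched in C# as follows. This is a minimal illustration, not the answerer's own code; the class and method names (DigitStripper, KeepDigits) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitStripper
{
    // Hypothetical helper: replaces every non-digit character with nothing,
    // so only the digits remain, in their original order.
    public static string KeepDigits(string input)
    {
        return Regex.Replace(input, @"[^\d]", "");
    }

    public static void Main()
    {
        Console.WriteLine(KeepDigits("1231a23v"));    // 123123
        Console.WriteLine(KeepDigits("5100870<br>")); // 5100870
    }
}
```

Because every digit is kept, two separate numbers in the same string would be fused into one.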

#2


31  

You could write a simple method that strips out all non-digit characters, though this won't handle floating-point data:

public string ExtractNumber(string original)
{
     return new string(original.Where(c => Char.IsDigit(c)).ToArray());
}

This pulls out only the digits. You could also use Char.IsNumber instead of Char.IsDigit, depending on the result you want.

#3


11  

Try this one-liner: Regex.Replace(str, "[^0-9 _]", "");

Note that this character class also keeps spaces and underscores; use "[^0-9]" to keep digits only.

#4


7  

You can use a simple regular expression:

var numericPart = Regex.Match( a, "\\d+" ).Value;

If you need it to be an actual numeric value, you can then use int.Parse or int.TryParse.
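
As a rough sketch of that last step, the match can be fed to TryParse. Since the question says the numbers can be any length, this example assumes long rather than int to leave more headroom; the names NumericParser and TryExtractLong are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NumericParser
{
    // Hypothetical helper: parses the first run of digits in the input.
    // Returns false when there are no digits or the value overflows a long.
    public static bool TryExtractLong(string input, out long value)
    {
        // Value is the empty string when nothing matched, which TryParse rejects.
        string digits = Regex.Match(input, @"\d+").Value;
        return long.TryParse(digits, out value);
    }

    public static void Main()
    {
        if (TryExtractLong("5100870<br>", out long n))
            Console.WriteLine(n);                          // 5100870
        Console.WriteLine(TryExtractLong("<br>", out _));  // False
    }
}
```

TryParse is used instead of Parse so that input with no digits, or with more digits than the type can hold, is reported through the return value rather than an exception.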

#5


5  

You could use LINQ. The code below filters the string into an IEnumerable with only digits and then converts it to a char[]. The string constructor can then convert the char[] into a string:

string a = "557222]]>";
string b = "5100870<br>";

a = new string(a.Where(x => char.IsDigit(x)).ToArray());
b = new string(b.Where(x => char.IsDigit(x)).ToArray());

#6


3  

The question doesn't explicitly state that you just want the characters 0 to 9 but it wouldn't be a stretch to believe that is true from your example set and comments. So here is the code that does that.

        string digitsOnly = String.Empty;
        foreach (char c in s)
        {
            // Do not use IsDigit as it will include more than the characters 0 through to 9
            if (c >= '0' && c <= '9') digitsOnly += c;
        }

Why you may not want to use Char.IsDigit(): it returns true for any Unicode decimal digit, including script-specific digits such as the Arabic-Indic ones, not just 0 through 9. Char.IsNumber() is broader still and also matches characters such as fractions, subscripts, superscripts, Roman numerals, currency numerators, and encircled numbers.
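
To make the difference between these checks concrete, here is a small sketch comparing an explicit ASCII range test with Char.IsDigit and Char.IsNumber on three sample characters. The class and helper names (DigitChecks, IsAsciiDigit) are hypothetical.

```csharp
using System;

public static class DigitChecks
{
    // Hypothetical helper: true only for the ASCII characters '0' through '9'.
    public static bool IsAsciiDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    public static void Main()
    {
        char ascii = '7';
        char arabicIndic = '\u0663'; // ARABIC-INDIC DIGIT THREE
        char half = '\u00BD';        // VULGAR FRACTION ONE HALF

        // Columns: IsAsciiDigit, Char.IsDigit, Char.IsNumber
        Console.WriteLine($"{IsAsciiDigit(ascii)} {char.IsDigit(ascii)} {char.IsNumber(ascii)}");                   // True True True
        Console.WriteLine($"{IsAsciiDigit(arabicIndic)} {char.IsDigit(arabicIndic)} {char.IsNumber(arabicIndic)}"); // False True True
        Console.WriteLine($"{IsAsciiDigit(half)} {char.IsDigit(half)} {char.IsNumber(half)}");                      // False False True
    }
}
```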

#7


3  

Try this

string number = Regex.Match("12345<br>", @"\d+").Value;

This will return the first group of digits. Example: for the input "a 123 b 456 c" it will return "123".
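
If every group of digits is wanted rather than just the first, Regex.Matches can be used instead. The following is a sketch under that assumption; the names DigitGroups and AllGroups are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class DigitGroups
{
    // Hypothetical helper: returns each run of consecutive digits as its own string.
    public static string[] AllGroups(string input)
    {
        return Regex.Matches(input, @"\d+")
                    .Cast<Match>()        // MatchCollection items viewed as Match
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", AllGroups("a 123 b 456 c"))); // 123,456
    }
}
```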

#8


0  

Here's the version that worked for my case

    public static string ExtractNumbers(this string source)
    {
        if (String.IsNullOrWhiteSpace(source))
            return string.Empty;
        // Regex.Match never returns null; a failed match has Success == false.
        var number = Regex.Match(source, @"\d+");
        return number.Success ? number.Value : string.Empty;
    }
