What RegEx string will find the last (rightmost) group of digits in a string?

Date: 2020-12-15 19:29:32

I'm looking for a regex that finds the rightmost group of digits (if any) embedded in a string. We only care about contiguous digits. We don't care about signs, commas, decimal points and so on; if present, those should simply be treated as non-digits, just like letters.

This is for replacement/incrementing purposes, so we also need to capture everything before and after the detected number in order to reconstruct the string after incrementing the value. In other words, we need a regex with separate capture groups for the prefix, the number and the suffix.

Here are examples of what we are looking for:

  • "abc123def456ghi" should identify the '456'
  • "abc123def456ghi789jkl" should identify the '789'
  • "abc123def" should identify the '123'
  • "123ghi" should identify the '123'
  • "abc123,456ghi" should identify the '456'
  • "abc-654def" should identify the '654'
  • "abcdef" shouldn't return any match

As an example of what we want: starting with the name 'Item 4-1a', we would extract the '1', with everything before it as the prefix and everything after it as the suffix. Using those, we can then generate the values 'Item 4-2a', 'Item 4-3a' and 'Item 4-4a' in a code loop.

Now, if I were looking for the first group, this would be easy. I'd match the first contiguous block of zero or more non-digits as the prefix, then the block of one or more contiguous digits as the number, and everything else up to the end would be the suffix.
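The "first group" case described above can be sketched as follows. This is only an illustration in Python's `re` module (the question does not name a language), and the names `FIRST_NUMBER` and `split_first_number` are made up for the example.

```python
import re

# Prefix: leading non-digits; number: first run of digits; suffix: the rest.
FIRST_NUMBER = re.compile(r"^(\D*)(\d+)(.*)$")

def split_first_number(text):
    """Return (prefix, number, suffix) around the first digit run, or None."""
    match = FIRST_NUMBER.match(text)
    return match.groups() if match else None

print(split_first_number("abc123def456ghi"))  # ('abc', '123', 'def456ghi')
print(split_first_number("abcdef"))           # None
```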

The problem I'm having is how to define the prefix so that it includes all numbers (if any) except the last group. Everything I try for the prefix keeps swallowing that last group, even when I try anchoring to the end of the string by essentially reversing the approach above.

5 answers

#1


13  

How about:

^(.*?)(\d+)(\D*)$

Then increment the second group and concatenate all three.

Explanation:

^         : beginning of string
  (       : start of 1st capture group
    .*?   : any number of any character, non-greedy
  )       : end group
  (       : start of 2nd capture group
    \d+   : one or more digits
  )       : end group
  (       : start of 3rd capture group
    \D*   : any number of non-digit characters
  )       : end group
$         : end of string

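The first capture group matches everything up to, but not including, the first digit of the last group of digits in the string. This works because the third group allows only non-digits between the number and the end of the string, so the non-greedy first group has to keep growing until the digits that follow it are the last ones.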

Or, if you can use named groups:

^(?<prefix>.*?)(?<number>\d+)(?<suffix>\D*)$
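As a rough sketch of the "increment the second group and concatenate" step, here is the same pattern driven from Python (chosen only for illustration; the function names are hypothetical). Note that Python spells named groups `(?P<name>...)`, so the unnamed form is used here.

```python
import re

# Same three-group pattern as above: lazy prefix, digits, trailing non-digits.
LAST_NUMBER = re.compile(r"^(.*?)(\d+)(\D*)$")

def split_last_number(text):
    """Return (prefix, number, suffix) around the last digit run, or None."""
    match = LAST_NUMBER.match(text)
    return match.groups() if match else None

def next_names(name, count):
    """Generate `count` successors of `name` by incrementing its last number."""
    parts = split_last_number(name)
    if parts is None:
        return []
    prefix, number, suffix = parts
    start = int(number)
    return [f"{prefix}{start + i}{suffix}" for i in range(1, count + 1)]

print(split_last_number("abc123,456ghi"))  # ('abc123,', '456', 'ghi')
print(next_names("Item 4-1a", 3))          # ['Item 4-2a', 'Item 4-3a', 'Item 4-4a']
```

Converting with `int` drops leading zeros ('007' becomes 8), so zero-padded names would need extra handling that this sketch leaves out.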

#2


6  

Try the following regex:

(\d+)(?!.*\d)

Explanation:

(\d+)           # One or more digits.
(?!.*\d)        # Negative look-ahead (zero-width): no digit may appear anywhere after this point.

EDIT (off-topic for the question): This answer is incorrect, but the question has already been answered in other posts, so rather than delete this one I will show another way to use the same regex. In Perl, for example, it can be used like this to get the same result as in C# (incrementing the last number):

s/(\d+)(?!.*\d)/$1 + 1/e;
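For comparison, a similar in-place increment can be sketched in Python with `re.sub` and a replacement function. This is an illustration only, and `increment_last_number` is a hypothetical helper name.

```python
import re

# Digits that are not followed by any further digit on the same line.
LAST_DIGITS = re.compile(r"\d+(?!.*\d)")

def increment_last_number(text):
    """Return `text` with its last run of digits incremented by one."""
    return LAST_DIGITS.sub(lambda m: str(int(m.group()) + 1), text, count=1)

print(increment_last_number("abc123def456ghi"))  # abc123def457ghi
print(increment_last_number("Item 4-1a"))        # Item 4-2a
print(increment_last_number("abcdef"))           # abcdef (no digits, unchanged)
```

Because the substitution happens inside the string, no prefix or suffix groups are needed for this variant.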

#3


3  

You can also try a slightly simpler version:

(\d+)[^\d]*$
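This pattern captures only the number, but the prefix and suffix can still be recovered from the position of the captured group. A minimal Python sketch of that idea (the helper name `split_with_search` is an assumption, and `\D` is used as the equivalent of `[^\d]`):

```python
import re

# One capture group: the digits, followed only by non-digits up to the end.
TRAILING_NUMBER = re.compile(r"(\d+)\D*$")

def split_with_search(text):
    """Return (prefix, number, suffix) using the span of group 1, or None."""
    match = TRAILING_NUMBER.search(text)
    if match is None:
        return None
    start, end = match.span(1)
    return text[:start], match.group(1), text[end:]

print(split_with_search("abc123def456ghi789jkl"))  # ('abc123def456ghi', '789', 'jkl')
print(split_with_search("abcdef"))                 # None
```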

#4


1  

This should do it:

Regex regexObj = new Regex(@"
    # Grab last set of digits, prefix and suffix.
    ^               # Anchor to start of string.
    (.*)            # $1: Stuff before last set of digits.
    (?<!\d)         # Anchor start of last set of digits.
    (\d+)           # $2: Last set of one or more digits.
    (\D*)           # $3: Zero or more trailing non digits.
    $               # Anchor to end of string.
    ", RegexOptions.IgnorePatternWhitespace);

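To see what the `(?<!\d)` look-behind contributes, the sketch below compares the pattern with and without it. It is written in Python with `re.VERBOSE` standing in for `RegexOptions.IgnorePatternWhitespace`; the names `GREEDY` and `GUARDED` are made up for the comparison.

```python
import re

# Without the look-behind, the greedy prefix leaves only the final digit.
GREEDY = re.compile(r"^(.*)(\d+)(\D*)$")

# With the look-behind, the number must start at the beginning of a digit run.
GUARDED = re.compile(r"""
    ^           # start of string
    (.*)        # 1: everything before the last run of digits
    (?<!\d)     # the run must not be preceded by a digit
    (\d+)       # 2: the last run of digits
    (\D*)       # 3: trailing non-digits
    $           # end of string
    """, re.VERBOSE)

print(GREEDY.match("abc123def456ghi").groups())   # ('abc123def45', '6', 'ghi')
print(GUARDED.match("abc123def456ghi").groups())  # ('abc123def', '456', 'ghi')
```

The unguarded form is plausibly the "swallowing" behaviour the question describes, although the question does not show the exact pattern that was tried.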
#5


1  

What about not using regex? Here's a code snippet (for a console application):

// Requires: using System;
string[] myStringArray = new string[] { "abc123def456ghi", "abc123def456ghi789jkl", "abc123def", "123ghi", "abcdef", "abc-654def", "abc123,456ghi" };

        char[] numberSet = new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };
        foreach (string myString in myStringArray)
        {
            Console.WriteLine("your string - {0}", myString);
            // Index of the last digit in the string, or -1 if there is none.
            int end = myString.LastIndexOfAny(numberSet);
            if (end == -1)
            {
                Console.WriteLine("no number");
            }
            else
            {
                // Walk left to the first digit of the same contiguous run,
                // so any non-digit (letter, comma, sign) ends the number.
                int start = end;
                while (start > 0 && myString[start - 1] >= '0' && myString[start - 1] <= '9')
                    start--;
                string prefix = myString.Substring(0, start);
                string number = myString.Substring(start, end - start + 1);
                string suffix = myString.Substring(end + 1);
                Console.WriteLine("prefix - {0}", prefix);
                Console.WriteLine("number - {0}", number);
                Console.WriteLine("suffix - {0}", suffix);
            }
            Console.WriteLine("_________________");
        }
        Console.Read();
