Regex to match only square brackets that contain digits and are not inside other square brackets

Time: 2022-12-01 21:43:33

Sample string:

 "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";

The regex should match:

 [1448472995] [1448472995]

and should not match [000112], since it is enclosed in an outer pair of square brackets.

Currently I have this regex, which matches [000112] as well:

const string unixTimeStampPattern = @"\[([0-9]+)]";
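
To reproduce the problem, here is a minimal sketch that runs the pattern above against the sample string. The class name `Repro` and the helper `FindAll` are hypothetical and not part of the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Repro
{
    // Pattern from the question: any run of digits wrapped in [ ].
    const string unixTimeStampPattern = @"\[([0-9]+)]";

    // Hypothetical helper: collects the text of every match.
    public static List<string> FindAll(string input)
    {
        var values = new List<string>();
        foreach (Match m in Regex.Matches(input, unixTimeStampPattern))
            values.Add(m.Value);
        return values;
    }

    public static void Main()
    {
        string sample = "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
        // The nested [000112] is reported too, which is the unwanted behavior.
        Console.WriteLine(string.Join(" ", FindAll(sample)));
        // prints: [000112] [1448472995] [1448472995]
    }
}
```

The pattern has no notion of nesting depth, so it cannot tell a top-level bracket from one enclosed in another pair.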

4 Solutions

#1


4  

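This is a good way to do it, by matching balanced text. The first alternative captures a bracketed number in group 1; the second alternative consumes any other bracketed group, nested brackets included, so a number inside outer brackets is never matched on its own.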

    ( \[ \d+ \] )                 # (1)
 |                             # or,
    \[                            # Opening bracket
    (?>                           # Then either match (possessively):
         [^\[\]]+                      #  non - brackets
      |                              # or
         \[                            #  [ increase the bracket counter
         (?<Depth> )
      |                              # or
         \]                            #  ] decrease the bracket counter
         (?<-Depth> )
    )*                            # Repeat as needed.
    (?(Depth)                     # Assert that the bracket counter is at zero
         (?!)
    )
    \]                            # Closing bracket

C# sample:

string sTestSample = "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
Regex RxBracket = new Regex(@"(\[\d+\])|\[(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))\]");

Match bracketMatch = RxBracket.Match(sTestSample);
while (bracketMatch.Success)
{
    if (bracketMatch.Groups[1].Success)
        Console.WriteLine("{0}", bracketMatch);
    bracketMatch = bracketMatch.NextMatch();
}

Output:

[1448472995]
[1448472995]

#2


4  

You need to use balancing groups to handle this. It looks a bit daunting but isn't all that complicated:

Regex regexObj = new Regex(
    @"\[               # Match opening bracket.
    \d+                # Match a number.
    \]                 # Match closing bracket.
    (?=                # Assert that the following can be matched ahead:
     (?>               # The following group (made atomic to avoid backtracking):
      [^\[\]]+         # One or more characters except brackets
     |                 # or
      \[ (?<Depth>)    # an opening bracket (increase bracket counter)
     |                 # or
      \] (?<-Depth>)   # a closing bracket (decrease bracket counter, can't go below 0).
     )*                # Repeat ad libitum.
     (?(Depth)(?!))    # Assert that the bracket counter is now zero.
     [^\[\]]*          # Match any remaining non-bracket characters
     \z                # until the end of the string.
    )                  # End of lookahead.", 
    RegexOptions.IgnorePatternWhitespace);
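
The answer gives only the pattern, so here is a minimal usage sketch that applies it to the sample string from the question. The pattern is the same one written on a single line; the class name `TopLevelBrackets` and the helper `Find` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TopLevelBrackets
{
    // Same pattern as above, without comments or whitespace.
    private static readonly Regex Rx = new Regex(
        @"\[\d+\](?=(?>[^\[\]]+|\[(?<Depth>)|\](?<-Depth>))*(?(Depth)(?!))[^\[\]]*\z)");

    // Hypothetical helper: returns each [digits] whose following text,
    // up to the end of the string, has balanced brackets.
    public static List<string> Find(string input)
    {
        var result = new List<string>();
        foreach (Match m in Rx.Matches(input))
            result.Add(m.Value);
        return result;
    }

    public static void Main()
    {
        string sample = "[] [ds*[000112]] [1448472995] sample string [1448472995] ***";
        foreach (string s in Find(sample))
            Console.WriteLine(s);
        // prints:
        // [1448472995]
        // [1448472995]
    }
}
```

Note that the lookahead checks the rest of the string rather than the text before the match: [000112] is rejected because the `]` that follows it has no opening bracket left to pair with. As a consequence, this approach appears to assume that the brackets after each candidate are balanced all the way to the end of the input.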

#3


0  

Are you just trying to capture the Unix timestamp? Then you can try a simpler pattern that specifies the exact number of digits matched in the group.

\[([0-9]{10})\]

Here I limit it to 10 digits, since I doubt the timestamp will hit 11 digits anytime soon (that happens in the year 2286). To protect against that anyway:

\[([0-9]{10,11})\]

Of course, this could produce false positives if a 10-digit number appears inside an enclosing bracket.
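
A minimal sketch of that trade-off follows. The class name `FixedLengthDemo`, the helper `CountMatches`, and the second input string are hypothetical, made up only to illustrate the false positive.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixedLengthDemo
{
    // Hypothetical helper: counts matches of the 10-digit pattern.
    public static int CountMatches(string input)
    {
        return Regex.Matches(input, @"\[([0-9]{10})\]").Count;
    }

    public static void Main()
    {
        // Sample from the question: the nested [000112] has only 6 digits, so it is skipped.
        Console.WriteLine(CountMatches("[] [ds*[000112]] [1448472995] sample string [1448472995] ***")); // prints 2

        // Made-up input: a nested 10-digit number still matches, which is the false positive.
        Console.WriteLine(CountMatches("[ds*[1448472995]]")); // prints 1
    }
}
```

The length check works on the sample only because the nested number happens to be shorter than a timestamp; it does not look at nesting at all.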

#4


-2  

This will match your expression as expected: http://regexr.com/3csg3. It uses a lookahead.
