Excluding a specific pattern from a regex match

Time: 2021-06-06 00:10:50

Somewhat a continuation of my previous question:

I came across another pattern that I'd have to take care of, which looks something like this:

Tue 01/24/12 1/24/2012 2:56:25 PM

In which case I'd only want it to match the 1/24/2012 2:56:25 PM portion.

My previous expression seems to match the above input on 01/24/12 1 or something similar.

I was able to make this work, for the most part, by using the following expression:

(?:\w\w\w (0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d)? (0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d((?: |\s*-\s*)(?:(?:([01]?\d|2[0-3]):)?([0-5]?\d):)?([0-5]?\d)( AM| PM)?)?

The issue here is that I don't want to actually include the Tue 01/24/12 bit in my match; I want to make sure that part is not matched. I attempted to use a negative lookahead by adding ?! to the first non-capturing group, but it didn't quite do what I thought it would.

I've tried looking at similar questions here and here, but the answers did not explain anything; they simply provided a working expression for that particular instance.

1 Answer

#1


Whenever you use (...) in your regex, you create a capture group, and whatever that part of the pattern matches is returned in its own numbered group.

In your case, you just need to create a group that contains your desired output. With that in mind, I changed your regex a little, and group $4 now holds your desired output:

(?:\w\w\w (0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d)? ((0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d((?: |\s*-\s*)(?:(?:([01]?\d|2[0-3]):)?([0-5]?\d):)?([0-5]?\d)( AM| PM)?))?
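
The group numbering can also be checked outside a regex tester. The following is only a minimal Python sketch: the pattern is copied verbatim from the expression above, the sample input is the one from the question, and the variable names PATTERN, sample and match are made up for illustration.

```python
import re

# The expression above, copied verbatim and split over three string literals.
# Group 4 is the outer group that wraps the wanted date/time.
PATTERN = re.compile(
    r"(?:\w\w\w (0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d)? "
    r"((0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d"
    r"((?: |\s*-\s*)(?:(?:([01]?\d|2[0-3]):)?([0-5]?\d):)?([0-5]?\d)( AM| PM)?))?"
)

sample = "Tue 01/24/12 1/24/2012 2:56:25 PM"  # input taken from the question
match = PATTERN.search(sample)

# The overall match still covers the "Tue 01/24/12" prefix ...
print(match.group(0))  # Tue 01/24/12 1/24/2012 2:56:25 PM
# ... but group 4 holds only the part the question asks for.
print(match.group(4))  # 1/24/2012 2:56:25 PM
```

The prefix is still consumed by the match as a whole; it is simply left out of the group that gets read afterwards.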

Tested on regexr.com.

To address your spacing issue, you need to move the space that follows the first (...)? group inside the second (...)? group (I included it as \s?), leaving you with:

(?:\w\w\w (0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d)?(\s?(0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d((?: |\s*-\s*)(?:(?:([01]?\d|2[0-3]):)?([0-5]?\d):)?([0-5]?\d)( AM| PM)?))

Also, the last group can't be (...)? anymore; otherwise every part of the pattern would be optional and it would match the empty string at every position.
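
A rough Python sketch of why the trailing ? has to go, assuming the final expression above is used as is. The names CORE, required and all_optional are hypothetical, and all_optional is a deliberately broken variant built only to show the effect.

```python
import re

# The final expression above, copied verbatim.
CORE = (
    r"(?:\w\w\w (0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d)?"
    r"(\s?(0?[1-9]|1[012])[- \/.](0?[1-9]|[12][0-9]|3[01])[- \/.](19|20)?\d\d"
    r"((?: |\s*-\s*)(?:(?:([01]?\d|2[0-3]):)?([0-5]?\d):)?([0-5]?\d)( AM| PM)?))"
)

required = re.compile(CORE)            # last group is mandatory
all_optional = re.compile(CORE + "?")  # hypothetical variant: last group optional again

text = "no dates here"

# With the mandatory group, text without a date/time yields no match.
print(required.search(text))  # None

# With everything optional, the pattern "succeeds" on nothing at all.
empty = all_optional.search(text)
print(repr(empty.group(0)), empty.span())  # '' (0, 0)

# The mandatory version still finds the wanted part, with or without the prefix.
print(repr(required.search("Tue 01/24/12 1/24/2012 2:56:25 PM").group(4)))
print(repr(required.search("1/24/2012 2:56:25 PM").group(4)))
```

Note that because \s? now sits inside the group, group 4 carries a leading space whenever the day-of-week prefix is present.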

You should also consider changing all your other (...) groups to (?:...) if you do not need to capture them, which leaves your desired output in $1.
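
As an illustration of that last suggestion, here is one possible non-capturing rewrite in Python. This version was converted by hand and is not text from the answer; the helper names DATE, TIME and PATTERN are assumptions introduced only to keep the pattern readable.

```python
import re

# Hand-converted sketch: every inner (...) of the final expression is written
# as (?:...), so the only capturing group left is the wanted date/time.
DATE = r"(?:0?[1-9]|1[012])[- \/.](?:0?[1-9]|[12][0-9]|3[01])[- \/.](?:19|20)?\d\d"
TIME = r"(?: |\s*-\s*)(?:(?:(?:[01]?\d|2[0-3]):)?(?:[0-5]?\d):)?(?:[0-5]?\d)(?: AM| PM)?"

PATTERN = re.compile(
    r"(?:\w\w\w " + DATE + r")?"          # optional "Tue 01/24/12" prefix, not captured
    r"(\s?" + DATE + r"(?:" + TIME + r"))"  # group 1: the part to keep
)

match = PATTERN.search("Tue 01/24/12 1/24/2012 2:56:25 PM")

print(PATTERN.groups)           # 1
print(match.group(1).strip())   # 1/24/2012 2:56:25 PM
```

The strip() call removes the leading space that \s? pulls into the group when the prefix is present.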
