Regex matches the text, but C# finds no match

Time: 2022-09-13 08:31:39

I'm trying to match a regex against a date in a text file that I created from a PDF. The regex matches when I build it in Regexhero, but while debugging I found that C# is not finding a match at all.

Any thoughts as to why this would happen?

I can provide some code if that would help, but all of my other regexes are matching, and the code is extensive, involving many different classes, public variables, and functions. It would take some time to make it readable.

(Using VS 2012 Pro in a C# console application; regex confirmed with Regexhero.)

Regex:
*?((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) *\d{1,2}, \d{4})\n.?\n. *?GEORGIA POWER COMPANY

text file snippet:

                                                      Dec 26, 2012

GEORGIA POWER COMPANY
BIN #19999
21141 Ralph McGuiver Blvd.
Atlanta, GA 30308-3374




                         GI LANDING LLC
                         Customer***
                         PO BOX 1234
                         LOGAN UT 84323





                                                                                                                              Please Pay By                                Jan 10, 2013
                                                           Customer Name                                   Account Number     Total Due                                              $ 61.91
                                                           IV LANDING LLC                      19380-29341


             Service Address                                                                                 Service Period   Contact Us 24 hours a day, 7 days a week
             900 GI LANDING DR                                                        Nov 26, 2012 - Dec 25, 2012
             HSE A                                                                                                                      georgiapower.com
                                                                                                                                           Account Number             Web Access Code
             Billing Summary
                                                                                                                                           135130-530141              845089
             Previous Bill Amount                                                                                  $ 63.34                 Customer Service           Power Outage Reporting
             Payment Received On 12/06/12                                 Thank You!                                

3 solutions

#1


1  

The regex you are using is incorrect. I checked it with Expresso.

The following regex will match the date that you require. The date can be extracted from the group DATE.

(?<DATE>(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)?\s+\d{1,2},\s+\d{4})\s+GEORGIA POWER COMPANY
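As a rough sketch of how the DATE group could be read in C#: the class name `DateGroupDemo`, the helper `ExtractDate`, and the sample string are hypothetical and not part of the original answer, and the sample assumes Windows line endings.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateGroupDemo
{
    // Pattern from the answer above; the named group DATE holds the date text.
    private const string Pattern =
        @"(?<DATE>(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)?\s+\d{1,2},\s+\d{4})\s+GEORGIA POWER COMPANY";

    // Returns the captured date text, or null when the pattern does not match.
    public static string ExtractDate(string input)
    {
        Match match = Regex.Match(input, Pattern);
        // The month is optional in this pattern, so the capture can begin
        // with whitespace when no month is present; Trim() removes it.
        return match.Success ? match.Groups["DATE"].Value.Trim() : null;
    }

    public static void Main()
    {
        // Hypothetical sample mimicking the bill header (assumed \r\n line endings).
        string sample = "      Dec 26, 2012\r\n\r\nGEORGIA POWER COMPANY\r\nBIN #19999";
        Console.WriteLine(ExtractDate(sample) ?? "(no match)");
    }
}
```

Because `\s+` matches `\r` as well as `\n`, this pattern does not depend on which line-ending convention the text file uses.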

#2


1  

There are a few errors in your pattern. First, the leading '*' has no preceding element to quantify, so the .NET regex parser throws an exception when the pattern is constructed. Second, the \n.?\n. *? segment assumes that the only line separator is \n, while in this case there are also \r characters.
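Both problems can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the class name `PatternPitfalls`, the helper `ThrowsOnConstruct`, the shortened patterns, and the sample text are all assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternPitfalls
{
    // True when the .NET Regex constructor rejects the pattern.
    public static bool ThrowsOnConstruct(string pattern)
    {
        try
        {
            new Regex(pattern);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // A leading quantifier has nothing to repeat, so construction fails.
        Console.WriteLine(ThrowsOnConstruct(@"*?(Dec) \d{1,2}, \d{4}"));

        // Hypothetical input with Windows (\r\n) line endings.
        string text = "Dec 26, 2012\r\n\r\nGEORGIA POWER COMPANY";

        // Expecting bare \n right after the year misses the \r characters.
        Console.WriteLine(Regex.IsMatch(text, @"\d{4}\n\nGEORGIA"));

        // Allowing any mix of \r and \n tolerates both conventions.
        Console.WriteLine(Regex.IsMatch(text, @"\d{4}[\r\n]*GEORGIA"));
    }
}
```

Under these assumptions the three lines print True, False, True. If the real file has other characters between the date and the company name (trailing spaces, for instance), `[\r\n]*` would still fail and a broader class such as `\s*` would be needed.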

A corrected pattern would be, approximately:

\s*?((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s\d{4})[\r\n]*GEORGIA POWER COMPANY

You may adapt it to make it more restrictive, however.

Example of how to use it:

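using System.Text.RegularExpressions;

// Group 1 captures the full date, e.g. "Dec 26, 2012".
var regex = new Regex(@"\s*?((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s\d{4})[\r\n]*GEORGIA POWER COMPANY");
var input = @"your input here";
var match = regex.Match(input);
if (match.Success) { /* Operate on match.Groups[1].Value */ }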

#3


1  

I found one that worked. All of your responses worked in Expresso and Regexhero; however, my particular console app only liked this one for some reason. Thanks for the responses.

Correct Regex: "\s*((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s\d{4})\s*G"
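The thread does not say why only this pattern matched in the console app. One way to investigate is to print the code points that actually sit between the date and the next "G" in the text the program reads. The sketch below is only an illustration: the class name `GapInspector`, the helper `DescribeGap`, and the sample string are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class GapInspector
{
    // Lists, as U+XXXX code points, the characters between the first
    // four-digit run and the next 'G' in the input.
    public static string DescribeGap(string input)
    {
        Match m = Regex.Match(input, @"\d{4}(?<gap>[^G]*)G");
        if (!m.Success) return "(no date found)";
        return string.Join(" ",
            m.Groups["gap"].Value.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        // Hypothetical sample: a trailing space, CR LF, and a form feed,
        // which PDF-to-text converters sometimes emit.
        string sample = "Dec 26, 2012 \r\n\fGEORGIA POWER COMPANY";
        Console.WriteLine(DescribeGap(sample));
    }
}
```

In .NET, `\s` matches spaces, tabs, form feeds, and Unicode separator characters in addition to `\r` and `\n`, so `\s*` tolerates separators that `[\r\n]*` or a literal `\n` would reject.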
