I'm trying to match a regex against a date in a text file that I created from a PDF. The regex matches when I build it in Regexhero, but when debugging I found that C# does not find a match at all.
Any thoughts as to why this would happen?
I can provide some code if that would help, but all of my other regexes are matching and the code is extensive, involving many different classes, public variables, and functions. It would take some time to make it readable.
(Using VS 2012 Pro, C# console application) (Regex confirmed with Regexhero)
Regex:
*?((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) *\d{1,2}, \d{4})\n.?\n. *?GEORGIA POWER COMPANY
Text file snippet:
Dec 26, 2012
GEORGIA POWER COMPANY
BIN #19999
21141 Ralph McGuiver Blvd.
Atlanta, GA 30308-3374
GI LANDING LLC
Customer***
PO BOX 1234
LOGAN UT 84323
Please Pay By Jan 10, 2013
Customer Name Account Number Total Due $ 61.91
IV LANDING LLC 19380-29341
Service Address Service Period Contact Us 24 hours a day, 7 days a week
900 GI LANDING DR Nov 26, 2012 - Dec 25, 2012
HSE A georgiapower.com
Account Number Web Access Code
Billing Summary
135130-530141 845089
Previous Bill Amount $ 63.34 Customer Service Power Outage Reporting
Payment Received On 12/06/12 Thank You!
3 Solutions
#1
The regex you are using is incorrect. I checked it with Expresso.
The following regex will match the date that you require. The date can be extracted from the group DATE.
(?<DATE>(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)?\s+\d{1,2},\s+\d{4})\s+GEORGIA POWER COMPANY
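As a rough sketch of how the DATE group could be read in C#: the class and method names below are made up for illustration, and the pattern is copied unchanged from the line above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateGroupSketch
{
    // Pattern from the answer above, unchanged.
    private static readonly Regex BillDate = new Regex(
        @"(?<DATE>(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)?\s+\d{1,2},\s+\d{4})\s+GEORGIA POWER COMPANY");

    // Returns the text captured by the DATE group, or null when nothing matches.
    public static string ExtractDate(string text)
    {
        Match match = BillDate.Match(text);
        return match.Success ? match.Groups["DATE"].Value : null;
    }
}
```

Calling `DateGroupSketch.ExtractDate(File.ReadAllText(path))` on the bill text would be one way to use it; `path` here is a placeholder for wherever the converted text file lives.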
#2
There are a few errors in your pattern. First of all, the first character, '*', is a quantifier with nothing before it to apply to, which makes the regex engine throw an exception. Furthermore, the \n.?\n. *? segment assumes that the only line separator is \n, while in our case there are also \r characters.
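Both points can be checked in isolation. The sketch below is only an illustration: the class and method names are hypothetical, and the shortened patterns stand in for the full one so that each problem shows up on its own.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternChecks
{
    // True when the Regex constructor rejects the pattern.
    public static bool IsRejected(string pattern)
    {
        try
        {
            new Regex(pattern);
            return false;
        }
        catch (ArgumentException)
        {
            // .NET reports a quantifier with nothing in front of it this way.
            return true;
        }
    }

    // Expects a bare \n between the year and the company name.
    public static bool MatchesWithNewlineOnly(string text)
    {
        return Regex.IsMatch(text, @"\d{4}\nGEORGIA POWER COMPANY");
    }

    // Accepts any run of \r and \n characters between the two lines.
    public static bool MatchesWithAnyLineBreak(string text)
    {
        return Regex.IsMatch(text, @"\d{4}[\r\n]*GEORGIA POWER COMPANY");
    }
}
```

If the text file was written with Windows line endings, the \n-only version fails because a \r sits between the year and the \n. That would also explain why an online tester that normalizes line endings can report a match that the console app does not.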
The corrected pattern should be approximately:
\s*?((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s\d{4})[\r\n]*GEORGIA POWER COMPANY
You may adapt it to make it more restrictive, however.
Example of how to use it:
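using System.Text.RegularExpressions;

var regex = new Regex(@"\s*?((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s\d{4})[\r\n]*GEORGIA POWER COMPANY");
var input = @"your input here";
var match = regex.Match(input);
// Groups[1] is the outer capture, i.e. the full date such as "Dec 26, 2012".
if (match.Success) { var date = match.Groups[1].Value; /* Operate on date */ }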
#3
I found one that worked. All of your responses worked in Expresso and Regexhero; however, my particular console app only liked this one for some reason. Thanks for the responses.
Correct Regex: "\s*((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s\d{4})\s*G"
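The thread does not say why only this variant matched. One possible explanation, and it is only an assumption, is that the PDF conversion left a space or tab between the date and the line break, which \s* absorbs but [\r\n]* does not. A small sketch comparing the two separators (class and method names are hypothetical):

```csharp
using System;
using System.Text.RegularExpressions;

public static class SeparatorComparison
{
    // Date part shared by both patterns, taken from the answers above.
    private const string Date =
        @"((Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s\d{4})";

    // Separator from answer #2: only \r and \n may follow the date.
    public static bool MatchesWithLineBreakClass(string text)
    {
        return Regex.IsMatch(text, Date + @"[\r\n]*GEORGIA POWER COMPANY");
    }

    // Separator from the accepted pattern: any whitespace may follow the date.
    public static bool MatchesWithWhitespace(string text)
    {
        return Regex.IsMatch(text, Date + @"\s*G");
    }
}
```

Note that ending the pattern at G is looser than spelling out GEORGIA POWER COMPANY: it would also match a date followed by any other word starting with G.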