
Time: 2022-09-28 23:18:53

Short version:

How can I get a regex that matches a@a.aaaa but not a@a.aaaaa using CAtlRegExp?


Long version:

I'm using CAtlRegExp to try to match email addresses. I want to use the regex



extracted from here. But the syntax that CAtlRegExp accepts differs from the one used there. This regex returns the error REPARSE_ERROR_BRACKET_EXPECTED; you can check for yourself using this app:


Using said app, I created this regex:



But the problem is that this accepts a@a.aaaaa as valid; I need it to match at most 4 characters for the top-level domain.


So, how can I get a regex that matches a@a.aaaa but not a@a.aaaaa?


2 Solutions



Try: ^[a-zA-Z0-9\._%\+\-]+@([a-zA-Z0-9-]+\.)+\c\c\c?\c?$


This expression replaces the [A-Z]{2,4} sequence, which CAtlRegExp doesn't support, with \c\c\c?\c?
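The same substitution works for any bounded repetition: write the atom as many times as the minimum requires, then append optional copies up to the maximum. Here is a minimal sketch of that rewrite. The helper name `expand_bounded` is hypothetical (it is not part of ATL), and it assumes the atom is a single unit such as one character, one escape like \c, or one bracket expression, so that the trailing ? applies to the whole atom.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not an ATL API: rewrites atom{min,max} as
// `min_count` required copies of the atom followed by
// (max_count - min_count) optional copies, each suffixed with '?'.
std::string expand_bounded(const std::string& atom,
                           std::size_t min_count,
                           std::size_t max_count) {
    if (min_count > max_count) {
        throw std::invalid_argument("min_count must not exceed max_count");
    }
    std::string out;
    for (std::size_t i = 0; i < max_count; ++i) {
        out += atom;
        if (i >= min_count) {
            out += '?';  // copies beyond the minimum are optional
        }
    }
    return out;
}
```

Calling `expand_bounded("\\c", 2, 4)` produces the string \c\c\c?\c? used in the expression above.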


\c serves as an abbreviation of [a-zA-Z]. The question marks after the 3rd and 4th \c indicate that each of them can match either zero or one character. As a result, this portion of the expression matches 2, 3 or 4 letters, but neither more nor fewer.
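To check the 2-to-4-letter behaviour without an ATL build, the pattern can be translated to std::regex. This is only a sketch of an equivalent, not CAtlRegExp itself: \c is written out as [a-zA-Z] because \c means something else in the ECMAScript syntax that std::regex uses by default, the unneeded escapes inside the first bracket expression are dropped, and the function name `matches_email_pattern` is made up for this example.

```cpp
#include <regex>
#include <string>

// std::regex (ECMAScript) translation of the CAtlRegExp pattern.
// Assumption: each \c from the ATL pattern is spelled out as [a-zA-Z].
// The final four atoms accept 2, 3 or 4 letters for the top-level domain.
bool matches_email_pattern(const std::string& s) {
    static const std::regex re(
        R"(^[a-zA-Z0-9._%+-]+@([a-zA-Z0-9-]+\.)+[a-zA-Z][a-zA-Z][a-zA-Z]?[a-zA-Z]?$)");
    return std::regex_match(s, re);
}
```

With this translation, `matches_email_pattern("a@a.aaaa")` is true and `matches_email_pattern("a@a.aaaaa")` is false, which is the behaviour asked for in the question.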




You are trying to match email addresses, a widely used and critical element of internet communication.


I would say this job is best done with the most widely used and most correct regex available.


Since the format rules for email addresses are described by RFC 822, it seems useful to search the internet for something like "RFC822 email regex".


For Perl the answer seems to be easy: use Mail::RFC822::Address (regexp-based address validation).


For PHP, there is the RFC 822 Email Address Parser in PHP.

Thus, to handle email addresses as correctly as possible, you should either find the most precise regex that already exists for your particular toolkit (ATL in your case) or, if no suitable regex exists yet, adapt a very precise regex from another toolkit (the Perl one above seems to be a very complete, albeit difficult, candidate).


If you're trying to match a specific subpart of email addresses (as seems to be the case given your question), then it probably still makes sense to start with the most up-to-date, correct and universal regex and then restrict it to the parts that you require.


Perhaps I stated the obvious, but I hope it helped.




