When should I use \A in a regex?

Time: 2022-03-11 07:05:07

The end-of-line anchor $ matches even when the matched string has an extra trailing \n, so we use \Z instead of $.

For example

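^\w+$ will match the string abcd\n, but ^\w+\Z will not. (This holds in flavors where \Z matches only at the very end of the string, such as Python. In Perl, \Z also matches before a final newline, and the strict anchor is \z.)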
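
To make the difference concrete, here is a minimal sketch using Python's re module. It assumes Python semantics, where \Z matches only at the very end of the string. The variable names are just for illustration.

```python
import re

text = "abcd\n"  # a word followed by one trailing newline

# `$` also matches just before a newline that ends the string.
dollar = re.search(r"^\w+$", text)

# In Python, `\Z` matches only at the very end of the string.
strict = re.search(r"^\w+\Z", text)

print(dollar is not None)  # True
print(strict is not None)  # False
```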

What about \A, and when should I use it?

4 Answers

#1


21  

It is most often used when multi-line matching is also enabled. Because \A matches only at the beginning of the ENTIRE text, as opposed to the beginning of any line, ^ and \A behave differently in regexes that can match across lines.
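
As a rough illustration of that difference, the sketch below uses Python's re module with the re.MULTILINE flag. The sample text and variable names are made up for the example.

```python
import re

text = "one\ntwo\nthree"

# With MULTILINE, `^` matches at the start of every line.
caret_hits = re.findall(r"^\w+", text, flags=re.MULTILINE)

# `\A` still matches only at the start of the whole text.
anchor_hits = re.findall(r"\A\w+", text, flags=re.MULTILINE)

print(caret_hits)   # ['one', 'two', 'three']
print(anchor_hits)  # ['one']
```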

#2


4  

As with any regex feature, you use it when it more exactly describes what you need as opposed to any more general feature. If you know that you want to match exactly at the start of a string (instead of logical lines), use the regex feature that describes that. Don't use regex features that could possibly match in situations that you don't want.
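
One place where this matters is input validation. The following is only a sketch in Python, with hypothetical helper names (is_word_loose, is_word_strict). It shows how line anchors can accept input that whole-string anchors reject.

```python
import re

# Line anchors: succeeds if ANY line consists of word characters.
LOOSE = re.compile(r"^\w+$", re.MULTILINE)

# String anchors: succeeds only if the WHOLE string does.
STRICT = re.compile(r"\A\w+\Z")


def is_word_loose(value: str) -> bool:
    """Hypothetical validator built on line anchors."""
    return LOOSE.search(value) is not None


def is_word_strict(value: str) -> bool:
    """Hypothetical validator built on string anchors."""
    return STRICT.search(value) is not None


payload = "ok\nnot ok!"  # first line is valid, the rest is not

print(is_word_loose(payload))   # True
print(is_word_strict(payload))  # False
print(is_word_strict("ok"))     # True
```

The strict version says exactly what is meant, so it does not depend on which flags happen to be set elsewhere.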

For Perl, see the perlre docs for details about the zero-width assertions:

\b  Match a word boundary
\B  Match except at a word boundary
\A  Match only at beginning of string
\Z  Match only at end of string, or before newline at the end
\z  Match only at end of string
\G  Match only at pos() (e.g. at the end-of-match position
    of prior m//g)

#3


2  

Not directly relevant to your question, judging by the tags you used, but there is at least one language (Ruby) where ^ and $ always mean start/end of line, so if you want to match the start/end of the string you have to use \A and \Z or \z.

If you want to keep your regexes portable, it's good practice to state explicitly what you want them to do instead of relying on the availability of mode modifiers like /m or Regex.MULTILINE etc.

On the other hand, JavaScript, POSIX and XML do not support \A and \Z. This is where tools like RegexBuddy, which translate regexes from one flavor to another for you, come in handy.

#4


2  

If the regex flavor you're working with supports \A, then I recommend you always use it instead of ^. In every flavor that supports it, \A matches only at the start of the string. Line breaks make no difference.

^ may match at the start of the string only or at the start of any line depending on the regex flavor and regex options.

By using \A you reduce the potential for confusion when somebody else has to maintain your code.
