How do I match a pattern with optional surrounding quotes?

Time: 2022-09-13 12:07:12

How would one write a regex that matches a pattern that can contain quotes, but if it does, must have matching quotes at the beginning and end?

"?(pattern)"?

Will not work because it will allow patterns that begin with a quote but don't end with one.

"(pattern)"|(pattern)

Will work, but is repetitive. Is there a better way to do that without repeating the pattern?
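
To make the difference between the two attempts concrete, here is a small sketch using Python's `re` module. The choice of engine is an assumption (the question does not name one), and the literal word `pattern` stands in for the real sub-expression. `fullmatch` is used so the whole string has to match.

```python
import re

# The literal word "pattern" stands in for the real sub-expression (assumption).
naive = re.compile(r'"?(pattern)"?')
alternation = re.compile(r'"(pattern)"|(pattern)')

samples = ['pattern', '"pattern"', '"pattern', 'pattern"']


def accepted(regex, strings):
    """Return the strings that the regex matches in full."""
    return [s for s in strings if regex.fullmatch(s)]


naive_ok = accepted(naive, samples)
alternation_ok = accepted(alternation, samples)

print(naive_ok)        # includes the unbalanced inputs
print(alternation_ok)  # only the balanced forms
```

The naive version accepts every sample because each quote is optional independently of the other, which is exactly the problem described above.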

3 Solutions

#1


17  

You can avoid repeating the pattern by making use of backreferences and conditionals:

/^(")?(pattern)(?(1)\1|)$/

Matches:

  • pattern
  • "pattern"

Doesn't match:

  • "pattern
  • pattern"

This pattern is somewhat complex, however. It first looks for an optional quote and, if one is found, captures it as group 1. Then it matches your pattern. Then it uses conditional syntax to say "if group 1 matched, match the same text again; otherwise match nothing". The whole pattern is anchored (which means it must make up the entire string, or an entire line in multiline mode) so that unbalanced quotes can't slip through (otherwise the pattern inside pattern" would still match).

Note that support for conditionals varies by engine and the more verbose but repetitive expressions will be more widely supported (and likely easier to understand).
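
As a quick check of the conditional version, here is a minimal sketch in Python, whose `re` module happens to support the `(?(1)...|...)` syntax. Python is an assumption here, not something the answer specifies, and the helper name `is_valid` is made up for illustration.

```python
import re

# (?(1)\1|) reads: if group 1 took part in the match, match its text again;
# otherwise match the empty string.
conditional = re.compile(r'^(")?(pattern)(?(1)\1|)$')


def is_valid(s):
    """Hypothetical helper: True when s is bare or has balanced quotes."""
    return conditional.match(s) is not None


results = {s: is_valid(s) for s in ['pattern', '"pattern"', '"pattern', 'pattern"']}
print(results)
```

Engines without conditional support will typically reject this expression at compile time, which is one reason the repetitive alternation remains the portable fallback.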


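Update: A much simpler version of this regex would be /^(")?(pattern)\1$/, which does not need a conditional. When I was testing this initially, the tester I was using gave me a false negative, which led me to discount it (oops!). Be aware that engines differ here: JavaScript treats a backreference to a group that did not participate as an empty match, while Perl, PCRE and Python's re make it fail, so in those engines writing the group as ("?) is the safer form.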

I'll leave the solution with the conditional up for posterity and interest, but this is a simpler version that is more likely to work in a wider variety of engines (backreferences are the only feature being used here which might be unsupported).
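
How the backreference-only version behaves depends on what the engine does when group 1 never participated in the match. The sketch below checks this for Python's `re` (an assumed engine, chosen only for illustration) and compares it with a variant that moves the `?` inside the group so that group 1 always participates, capturing an empty string when no quote is present.

```python
import re

samples = ['pattern', '"pattern"', '"pattern', 'pattern"']

# Group 1 is optional as a whole, so it may not participate in the match.
optional_group = re.compile(r'^(")?(pattern)\1$')
# The quote inside group 1 is optional, so group 1 always participates.
optional_quote = re.compile(r'^("?)(pattern)\1$')


def accepted(regex):
    """Return the samples that the regex matches."""
    return [s for s in samples if regex.match(s)]


group_ok = accepted(optional_group)
quote_ok = accepted(optional_quote)

print(group_ok)
print(quote_ok)
```

In Python the first form rejects the bare `pattern`, because a backreference to an unset group fails there; the second form accepts both balanced inputs and should behave the same way in engines that treat unset groups differently.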

#2


0  

Depending on the language you're using, you should be able to use backreferences. Something like this, say:

^(["'])(pattern)\1$|^(pattern)$

That way, you're requiring that either there are no quotes, or that the SAME quote is used on both ends.
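
The "same quote on both ends" requirement can be sketched in Python as follows. Note that this is a variation on the expression above rather than a copy of it: the quote is made optional inside the capturing group so the pattern is written only once, and `fullmatch` supplies the anchoring. The helper name `is_valid` and the choice of Python are assumptions for illustration.

```python
import re

# Group 1 captures a single quote, a double quote, or nothing at all;
# \1 then demands exactly the same text at the other end.
either_quote = re.compile(r'''(["']?)(pattern)\1''')


def is_valid(s):
    """Hypothetical helper: True when s is bare or wrapped in matching quotes."""
    return either_quote.fullmatch(s) is not None


cases = ['pattern', '"pattern"', "'pattern'", '"pattern\'', '\'pattern"', '"pattern']
results = [is_valid(s) for s in cases]
print(list(zip(cases, results)))
```

Mixed quotes are rejected because the backreference repeats the captured character itself, not the character class.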

#3


0  

This should work with recursive regex (which needs longer to get right). In the meantime: in Perl, you can build a self-modifying regex. I'll leave that as an academic example ;-)

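use strict;
use warnings;

my @stuff = ( '"pattern"', 'pattern', 'pattern"', '"pattern' );

# The (??{ ... }) block runs during matching and returns the sub-pattern to
# apply next: a closing quote if an opening quote was captured, else nothing.
foreach (@stuff) {
    print "$_ OK\n" if /^
                        (")?
                        \w+
                        (??{defined $1 ? '"' : ''})
                        $
                       /x;
}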

Result:

"pattern" OK
pattern OK
