Ruby regex: "capture a string, unless it is followed by..."

Time: 2022-02-09 09:57:39

My regex captures quoted phrases:

"([^"]*)"

I want to improve it so that it ignores quoted phrases that are followed by ', -' (a comma, a space, and a dash, in that order).

How do I do this?

The test: http://rubular.com/r/xls6vN1w92

4 answers

#1


4  

This should do it, using a Negative Lookahead:

"(?!, -)([^"]*)"(?!, -)

A little icky, but it works. You want to make sure that neither quote is followed by your string; otherwise a match can start at the closing quote of a phrase you meant to skip.

http://rubular.com/r/yFMyUKJOHL
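
To see why the lookahead is applied after the opening quote as well, here is a minimal Ruby sketch. The sample string and the variable names are made up for illustration; only the two patterns come from this thread.

```ruby
# Hypothetical sample text: the first phrase should be skipped.
text = '"ignore me", - "but not me"'

# Lookahead after the closing quote only.
trailing_only = /"([^"]*)"(?!, -)/
# Lookahead after both quotes, as suggested in this answer.
both_quotes = /"(?!, -)([^"]*)"(?!, -)/

# With the trailing check alone, the engine retries at the closing quote
# of "ignore me" and treats it as an opening quote.
p text.scan(trailing_only).flatten  # => [", - "]

# The extra check rejects that quote as a starting point.
p text.scan(both_quotes).flatten    # => ["but not me"]
```

String#scan returns one array per match when the pattern has a capture group, hence the flatten.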

#2


1  

Regex

"(.*?)"(?!, -)

Working Example

http://rubular.com/r/9kOmZLxLfy
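
One caveat worth checking before relying on this pattern: because `.` can match a quote character, the lazy `.*?` may expand across a rejected closing quote. The Ruby sketch below uses a made-up sample string (an assumption, not the linked test data) to show what happens in that case.

```ruby
# Hypothetical sample text: the first phrase should be skipped.
text = '"ignore me", - "but not me"'

lazy_dot = /"(.*?)"(?!, -)/

# The closing quote of "ignore me" fails the lookahead, so .*? keeps
# growing until it reaches the next quote that passes it.
captures = text.scan(lazy_dot).flatten
p captures.length  # => 1
puts captures.first
```

On this input the single capture starts at `ignore me` and runs up to the opening quote of the second phrase, so the wanted phrase is not returned on its own. A character class such as `[^"]` avoids this kind of run-on.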

#3


1  

This can't be parsed reliably in your context, because it's open-ended. The only way to parse it is to consume the phrases you don't want as well as the ones you do, but it's still an invalid premise.

/"([^"]*?)"(?!, -)|"[^"]*?"(?=, -)/

Then check for capture group 1 on each match, something like this:

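my $rx = qr/"([^"]*?)"(?!, -)|"[^"]*?"(?=, -)/;
my $text = ' "ignore me", - "but not me" ';
# Group 1 is defined only when the first alternative (a wanted phrase) matched.
while ($text =~ /$rx/g) {
  print "'$1'\n" if defined $1;
}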
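
Since the question is about Ruby, here is a rough Ruby equivalent of the Perl loop above. It is a sketch, assuming String#scan is an acceptable way to iterate over matches; the variable names are illustrative.

```ruby
rx = /"([^"]*?)"(?!, -)|"[^"]*?"(?=, -)/
text = ' "ignore me", - "but not me" '

# scan yields one array per match; group 1 is nil whenever the second
# alternative (a phrase followed by ", -") was the one that matched.
wanted = text.scan(rx).flatten.compact

wanted.each { |phrase| puts "'#{phrase}'" }  # prints 'but not me'
```

Consuming the unwanted phrases as whole matches keeps the scan aligned on opening quotes, and compact then drops the nil entries they leave behind.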

#4


0  

Add (?!...) at the end of the regex:

"([^"\n]*)"(?!, -)
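
This pattern also adds `\n` to the negated character class. The answer does not say why; one plausible reading is that it keeps a stray quote from pairing with a quote on a later line. A small Ruby sketch on made-up input shows the difference between the two character classes:

```ruby
# Hypothetical input: a stray inch mark on line 1, a quoted phrase on line 2.
text = "5\" nails\n\"ok\""

across_lines = /"([^"]*)"(?!, -)/
single_line  = /"([^"\n]*)"(?!, -)/

# Without \n in the class, the stray quote pairs with the quote on line 2.
p text.scan(across_lines).flatten  # => [" nails\n"]

# With \n excluded, a quoted phrase cannot span a line break.
p text.scan(single_line).flatten   # => ["ok"]
```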
