Why doesn't this regular expression work?

I have a regular expression meant to extract two kinds of tokens: the apostrophe delimiters ['] and the words between apostrophes, as in 'Stack Overflow'. Why doesn't this regular expression work?

Regex:

(['])|'([^']*)'

Here is a link to explain it: Regular Expression

It only extracts the apostrophes, not the words between them.

NOTE: I need to extract the apostrophes and any words between them separately, as in 'Stack Overflow'.

The result should look like:

'
Stack Overflow
'

Greetings.

2 Solutions

#1

Your regex says to match either a single quote or the content between quotes, but the way you have written it, it is an exclusive or: each match takes only one branch. To get each of them as a capture group, you could use the regex:

(')([^']*)(')

This captures the opening quote, then everything that is not a quote, then the closing quote.
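As a minimal sketch, here is how the three-group pattern behaves in Python. The question does not name a language, so the use of Python's `re` module and the variable names here are assumptions for illustration only.

```python
import re

# The three-group pattern suggested in this answer.
pattern = re.compile(r"(')([^']*)(')")

text = "'Stack Overflow'"  # sample input taken from the question
match = pattern.search(text)

# groups() yields the opening quote, the quoted text, and the closing quote.
opening, content, closing = match.groups()
print(opening)
print(content)
print(closing)
```

Because all three parts are captured in a single match, they can be read back separately in the order they appear in the text.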

#2

TL;DR: Because alternation short-circuits.

In an or condition, once the first regex matches, the second regex does not need to be evaluated, because True | anything is always True, right?

Consider your regex

regex = (['])|'([^']*)'
text = 'Stack Overflow'

Run the regex against the text:

([']) matches the opening ' and then the closing ', as two separate matches, each captured in $1.

Done! (The second alternative never gets a chance, because the first one has already consumed each quote.)

Another proof:

regex = (['])|'([^']*)'
text = 'Stack Overflow'

gives

match 1: $1 = `'`
match 2: $1 = `'`

but

regex = '([^']*)'|(['])
text = 'Stack Overflow'

gives

$1 = `Stack Overflow`

You can see that only the first alternative ever takes effect!
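The two orderings can be compared directly. The following is a small sketch in Python (the choice of language and the variable names are assumptions, since the thread does not specify an engine) that collects the captured groups of every match for each ordering.

```python
import re

text = "'Stack Overflow'"  # sample input taken from the question

# Original order: the single-quote branch comes first.
quote_first = re.compile(r"(['])|'([^']*)'")
# Swapped order: the quoted-text branch comes first.
content_first = re.compile(r"'([^']*)'|(['])")

# Collect the (group 1, group 2) pair of every match in the text.
quote_first_groups = [m.groups() for m in quote_first.finditer(text)]
content_first_groups = [m.groups() for m in content_first.finditer(text)]

print(quote_first_groups)    # [("'", None), ("'", None)]
print(content_first_groups)  # [('Stack Overflow', None)]
```

In both cases the second group stays empty (`None` in Python), which is consistent with the claim that only the first alternative is ever used on this input.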

Thus, I suggest you use this regex instead:

(')(.*?)(')

where the captured texts end up in $1, $2, and $3 respectively.

Note that *? is a non-greedy quantifier. Put simply, it matches as little as possible, so it stops at the first closing ' instead of consuming it.
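To illustrate the difference, here is a hedged sketch in Python comparing the non-greedy pattern with a greedy variant. The input containing two quoted words is a hypothetical example that does not appear in the question, and Python is an assumed language.

```python
import re

# Hypothetical input with two quoted words, chosen to expose the difference.
text = "'Stack' and 'Overflow'"

# Non-greedy: .*? stops at the first closing quote it can reach.
lazy = re.findall(r"(')(.*?)(')", text)
# Greedy: .* runs to the last quote in the text and swallows the ones between.
greedy = re.findall(r"(')(.*)(')", text)

print(lazy)    # two matches: Stack, then Overflow
print(greedy)  # one match spanning both quoted words
```

With a single quoted phrase, as in the question, both variants give the same result; the difference only shows up when the text contains more than one quoted token.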
