JavaScript regular expression that remembers the last bracket type encountered

Date: 2021-06-26 22:25:19

In JavaScript, I want to be able to match text that is:

  • (surrounded by parentheses)

  • [surrounded by square brackets]

  • not surrounded by either type of bracket

In the following expression...

none[square](round)(accept]able)[wrong).text

... there should be 4 matches, for none, [square], (round) and (accept]able). However, [wrong) should not match because there is no closing ] to be found.

In my best attempt so far...

([([])[A-Za-z]+[\])]|[^\[()\]]+

... (accept], able and [wrong) are incorrectly matched, while (accept]able) as a whole is not matched. I'm not too concerned about (accept]able); I would prefer no match at all to a match with imbalanced brackets.

I am guessing that I need to replace the [\])] expression with one that checks the value of the initial matching group, and uses ) if the first match was ( or ] if the first match was [.

I have tried working with conditional expressions. These seem to work well in PCRE and Python, but not in JavaScript.

Is this a problem that can be solved in a JavaScript regular expression on its own, or will I have to handle this piecemeal in a bulky JavaScript function?

4 Solutions

#1


One way to do that is to match both cases (acceptable and non-acceptable) and to separate the results into two different capture groups. Then, whatever you need to do with the results, you only have to test which group succeeded:

/(\[[^\]]*\]|\([^)]*\)|[a-z]+)|([\[(][\s\S]*?(?:[\])]|$))/gi

pattern details:

(  # acceptable capture group
    \[ [^\]]* \]
  |
    \( [^)]* \)
  |
    [a-z]+
)
|
(  # non-acceptable capture group
    [\[(] [\s\S]*? (?: [\])] | $ ) # unclosed parens
)
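
As a minimal sketch of how the two groups might be consumed, one could iterate over the matches and sort them by which group participated. The classify helper and its return shape are hypothetical names used for illustration; they are not part of the answer.

```javascript
// Pattern copied from the answer: group 1 = acceptable, group 2 = non-acceptable.
const pattern = /(\[[^\]]*\]|\([^)]*\)|[a-z]+)|([\[(][\s\S]*?(?:[\])]|$))/gi;

// Hypothetical helper: splits the matches by the capture group that succeeded.
function classify(text) {
  const accepted = [];
  const rejected = [];
  for (const m of text.matchAll(pattern)) {
    // A group that did not participate in the match is undefined.
    if (m[1] !== undefined) {
      accepted.push(m[1]);
    } else {
      rejected.push(m[2]);
    }
  }
  return { accepted, rejected };
}

const result = classify("none[square](round)(accept]able)[wrong).text");
console.log(result.accepted); // none, [square], (round), (accept]able), text
console.log(result.rejected); // [wrong)
```

Note that the trailing text is also reported as acceptable here, since it is not surrounded by either type of bracket.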

This pattern doesn't care whether a square bracket is enclosed between round brackets or vice versa, but you can easily be more restrictive with this pattern, which forbids any other brackets between brackets (square or round):

(  # acceptable capture group
    \[ [^()\[\]]* \]
  |
    \( [^()\[\]]* \)
  |
    [a-z]+
)
|
(  # non-acceptable capture group
    [\[(] [\s\S]*? (?: [\])] | $ ) # unclosed parens
)
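
JavaScript regular expression literals have no free-spacing mode, so the commented layout above has to be collapsed onto one line before it can be used. The following is a sketch of that, assuming the same gi flags as the first pattern; the splitMatches name is hypothetical.

```javascript
// The stricter pattern from the layout above, collapsed to a single line.
// Group 1 = acceptable, group 2 = non-acceptable (assumed flags: g and i).
const strict = /(\[[^()\[\]]*\]|\([^()\[\]]*\)|[a-z]+)|([\[(][\s\S]*?(?:[\])]|$))/gi;

// Hypothetical helper: collects the text captured by each group.
function splitMatches(text) {
  const accepted = [];
  const rejected = [];
  for (const m of text.matchAll(strict)) {
    if (m[1] !== undefined) {
      accepted.push(m[1]);
    } else {
      rejected.push(m[2]);
    }
  }
  return { accepted, rejected };
}

const out = splitMatches("none[square](round)(accept]able)[wrong).text");
console.log(out.accepted); // none, [square], (round), able, text
console.log(out.rejected); // (accept], [wrong)
```

With this version (accept]able) is no longer accepted as a whole: (accept] lands in the non-acceptable group and able is then picked up as plain text.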

Note about these two patterns: you can choose the default behavior when an unclosed bracket is found. Both patterns are designed to stop the non-acceptable part at the first closing bracket or, if none is found, at the end of the string. You can change this behavior so that an unclosed bracket always extends to the end of the string, like this: [\[(][\s\S]*$
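
To illustrate the difference, here is a sketch that plugs the end-of-string alternative into the first pattern as its second group. Combining the two this way is an assumption about how the variant is meant to be used, and the variable names are illustrative.

```javascript
// First pattern's acceptable group, with the non-acceptable group replaced
// by the "run to the end of the string" variant. No m flag, so $ is the end of input.
const toEnd = /(\[[^\]]*\]|\([^)]*\)|[a-z]+)|([\[(][\s\S]*$)/gi;

const sample = "none[square](round)(accept]able)[wrong).text";
const accepted = [];
const rejected = [];
for (const m of sample.matchAll(toEnd)) {
  if (m[1] !== undefined) {
    accepted.push(m[1]);
  } else {
    rejected.push(m[2]);
  }
}

console.log(accepted); // none, [square], (round), (accept]able)
console.log(rejected); // [wrong).text
```

On the sample string from the question this yields exactly the four expected matches, at the cost of discarding everything after the first unclosed bracket.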

#2


I'm not quite sure if I get all of the possible strings, but maybe this does the trick?

/\[([A-Za-z]*)\]|\(([\]A-Za-z]*)\)/gm
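
A quick check against the sample string from the question, as a sketch (the variable names are illustrative), suggests that this pattern finds the bracketed items but has no alternative for text outside brackets:

```javascript
// Pattern copied from the answer above.
const bracketsOnly = /\[([A-Za-z]*)\]|\(([\]A-Za-z]*)\)/gm;

const input = "none[square](round)(accept]able)[wrong).text";

// With the g flag, match() returns the full text of every match (or null).
const found = input.match(bracketsOnly) || [];

console.log(found); // [square], (round), (accept]able)
```

[wrong) is correctly skipped, but none is not matched either, so the third case from the question would need an extra alternative.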

#3


You can use the following:

/^(\[[^\[]+?\]|\([^\(]+?\)|[^\[\(]+)$/gm

See DEMO
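
Because this pattern is anchored with ^ and $ under the m flag, it tests whole lines rather than scanning inside a longer string. The sketch below assumes each candidate sits on its own line; the variable names are illustrative.

```javascript
// Pattern copied from the answer above; ^ and $ match at line boundaries (m flag).
const perLine = /^(\[[^\[]+?\]|\([^\(]+?\)|[^\[\(]+)$/gm;

// Assumption: one candidate per line.
const lines = "none\n[square]\n(round)\n(accept]able)\n[wrong)";
const lineMatches = lines.match(perLine) || [];
console.log(lineMatches); // none, [square], (round), (accept]able)

// The original single-line sample does not match at all.
const single = "none[square](round)(accept]able)[wrong).text";
const singleMatches = single.match(perLine) || [];
console.log(singleMatches.length); // 0
```

So this approach only fits if the input can be split into separate lines first.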

#4


This will do it for you:

\((\w*\s*)\)|\[(\w*)\]|\((\w*\s*|\])*\)|\((\w*\s*|\[)*\)|\[(\w*\s*|\()*\]|\[(\w*\s*|\))*\]|^\b\w*\s*\b

Demo here:

https://regex101.com/r/mV6gD2/2
