Regex to match strings inside braces

Time: 2022-08-14 18:05:12

I am trying to write a regex for a string that has the following format:

12740(34,12) [abc (a1b2c3) (a2b3c4)......] myId123

Currently, I have something like this:

\((?P<expression>\S+)\)

But with this, I can capture only the strings within square brackets.

Is there any way I can capture the integers before the square brackets and the id at the end, along with the strings within the square brackets?

The number of strings enclosed in parentheses varies. I could also have a string that looks like this:

10(3,2) [abc (a1b2c3)] myId1

I know that I can write a simple regex for the above expression using brute force. But could anyone please help me write one that still works when the number of strings within the square brackets keeps changing?

Thanks in advance.

1 Answer

#1


2  

You can capture all three parts with a single alternation, using ^ and $ (start and end of the string, respectively) to anchor the leading number and the trailing id:

((?P<front>^\d+)|\((?P<expression>\S+)\)|(?P<id>[a-zA-Z0-9]+)$)

Regex101:

https://regex101.com/r/PoA5k4/1

To make the result more usable, I'd turn it into a dictionary:

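import re

myStr = "12740(34,12) [abc (a1b2c3) (a2b3c4)......] myId123"
# Raw string so that \d and \( reach the regex engine unchanged.
pattern = re.compile(r"((?P<front>^\d+)|\((?P<expression>\S+)\)|(?P<id>[a-zA-Z0-9]+)$)")

di = {}
for match in pattern.finditer(myStr):
    if match.group("front") is not None:
        # Leading integer, anchored at the start of the string.
        di["starter"] = match.group("front")
    elif match.group("id") is not None:
        # Trailing id, anchored at the end of the string.
        di["id"] = match.group("id")
    else:
        # Every parenthesised chunk lands here, including the leading "(34,12)".
        di.setdefault("expression", []).append(match.group("expression"))
print(di)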
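
One caveat with the pattern above: the leading "(34,12)" is parenthesised too, so it ends up in the expression list next to a1b2c3 and a2b3c4. If that is not what you want, a two-step parse may be easier to reason about. The following is only a sketch, assuming every line has the shape number(int,int) [ ... ] id as in the two examples from the question; the names LINE_RE, PAREN_RE, parse_line and the "pair" key are made up for illustration.

```python
import re

# Assumed line shape: <digits>(<int>,<int>) [<anything>] <alphanumeric id>
LINE_RE = re.compile(
    r"^(?P<front>\d+)\((?P<a>\d+),(?P<b>\d+)\)\s*"
    r"\[(?P<body>.*)\]\s*(?P<id>[A-Za-z0-9]+)$"
)
# Parenthesised chunks inside the square brackets (no spaces, no nesting).
PAREN_RE = re.compile(r"\(([^()\s]+)\)")


def parse_line(line):
    """Return a dict of the parts, or None if the line does not fit the assumed shape."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return {
        "starter": m.group("front"),
        "pair": (int(m.group("a")), int(m.group("b"))),
        # findall runs only on the bracket body, so "(34,12)" is not included.
        "expression": PAREN_RE.findall(m.group("body")),
        "id": m.group("id"),
    }


print(parse_line("12740(34,12) [abc (a1b2c3) (a2b3c4)......] myId123"))
print(parse_line("10(3,2) [abc (a1b2c3)] myId1"))
```

Because the second regex uses findall on the bracket body only, it picks up however many parenthesised strings are there, and a line that does not fit the assumed shape returns None instead of a partial result.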
