How to split a string into an array on commas but ignore commas inside parentheses

Time: 2022-08-01 21:42:34

I have a string with 3 lines:

a VARCHAR(20),
b FLOAT, c FLOAT,
d NUMBER(38,0), e NUMBER(38,0)

I need to split the string into an array on the comma delimiter, ignoring any commas inside parentheses.

The final output should be an array with 5 elements:

s_arr = ['a VARCHAR(20)', 'b FLOAT', 'c FLOAT', 'd NUMBER(38,0)', 'e NUMBER(38,0)']

So far I have s_arr = s.split(",")

How can I achieve that?

4 solutions

#1


2  

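You may split on the pattern ,(?![^\(]*[\)]) with re.split and clean up the pieces with a list comprehension. The negative lookahead rejects any comma that is followed by a ")" before the next "(":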

import re

s = '''
a VARCHAR(20),
b FLOAT, c FLOAT,
d NUMBER(38,0), e NUMBER(38,0)
'''

[i.strip() for i in re.split(r',(?![^\(]*[\)])', s)]
# ['a VARCHAR(20)', 'b FLOAT', 'c FLOAT', 'd NUMBER(38,0)', 'e NUMBER(38,0)']
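The lookahead treats a comma as "inside parentheses" only when a ")" follows it before the next "(", so it can misfire on nested parentheses. A minimal sketch of an alternative, assuming the parentheses are balanced, is to walk the string and track the nesting depth. The function name split_top_level is hypothetical and not part of the original answer:

```python
def split_top_level(text, sep=","):
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray ")"
        if ch == sep and depth == 0:
            # Top-level separator: close the current piece.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:  # skip an empty trailing piece
        parts.append(tail)
    return parts


s = '''
a VARCHAR(20),
b FLOAT, c FLOAT,
d NUMBER(38,0), e NUMBER(38,0)
'''
print(split_top_level(s))
# ['a VARCHAR(20)', 'b FLOAT', 'c FLOAT', 'd NUMBER(38,0)', 'e NUMBER(38,0)']
```

This trades the one-line regex for an explicit loop, which is longer but does not depend on how the parentheses are nested.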

#2


1  

Use a regular expression to split on multiple delimiters:

stringToSplit = '''a VARCHAR(20),
b FLOAT, c FLOAT,
d NUMBER(38,0), e NUMBER(38,0)'''

import re
re.split(', |,\n', stringToSplit)

This works because, in your string, the commas inside parentheses, as in (38,0), are never followed by a space or a newline.
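To illustrate that caveat, here is a small sketch with a hypothetical input in which a comma inside the parentheses is followed by a space. The two-delimiter pattern then splits inside the parentheses, while a negative lookahead of the form ,(?![^(]*\)) leaves them intact. The variable names are made up for this example:

```python
import re

# Hypothetical input: note the space after the comma in "(38, 0)".
spaced = "d NUMBER(38, 0), e FLOAT"

# Splitting on ", " or ",\n" also cuts inside the parentheses.
naive = re.split(r", |,\n", spaced)

# A comma followed by ")" before any "(" is treated as inside parentheses.
guarded = [p.strip() for p in re.split(r",(?![^(]*\))", spaced)]

print(naive)    # ['d NUMBER(38', '0)', 'e FLOAT']
print(guarded)  # ['d NUMBER(38, 0)', 'e FLOAT']
```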

#3


0  

If you know more about the data, you can avoid the regex parsing altogether by doing this (s is the string from the question):

s.replace(", ", "@").replace(",\n", "@").split("@")

This replaces each delimiter with a different character and then splits on that character. It assumes that every delimiting comma is followed by a space or a newline and that "@" never appears in the data. Not the most elegant, but it will handle most cases if you're in a bind.
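One way to make the placeholder trick a little safer is to check that the placeholder does not already occur in the input. This is only a sketch: the function name split_with_placeholder and the default placeholder "\x00" are assumptions for illustration, not part of the answer above.

```python
def split_with_placeholder(text, placeholder="\x00"):
    """Replace ', ' and ',\\n' with a placeholder, then split on it."""
    if placeholder in text:
        # Splitting would be ambiguous, so refuse instead of guessing.
        raise ValueError("placeholder already occurs in the input")
    marked = text.replace(", ", placeholder).replace(",\n", placeholder)
    return [part.strip() for part in marked.split(placeholder)]


s = "a VARCHAR(20),\nb FLOAT, c FLOAT,\nd NUMBER(38,0), e NUMBER(38,0)"
print(split_with_placeholder(s))
# ['a VARCHAR(20)', 'b FLOAT', 'c FLOAT', 'd NUMBER(38,0)', 'e NUMBER(38,0)']
```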

#4


0  

Using list comprehensions and string methods:

Given

s = """\
a VARCHAR(20),
b FLOAT, c FLOAT,
d NUMBER(38,0), e NUMBER(38,0)
"""

Code

[z.strip() for y in [x.split(", ") for x in s.split(",\n")] for z in y]
# ['a VARCHAR(20)', 'b FLOAT', 'c FLOAT', 'd NUMBER(38,0)', 'e NUMBER(38,0)']

Alternatively

[z.strip(",") for y in [x.split(", ") for x in s.splitlines()] for z in y]
# ['a VARCHAR(20)', 'b FLOAT', 'c FLOAT', 'd NUMBER(38,0)', 'e NUMBER(38,0)']
