I have some input that looks like the following:
A,B,C,"D12121",E,F,G,H,"I9,I8",J,K
The comma-separated values can be in any order. I'd like to split the string on commas; however, in the case where something is inside double quotation marks, I need it to both ignore commas and strip out the quotation marks (if possible). So basically, the output would be this list of strings:
['A', 'B', 'C', 'D12121', 'E', 'F', 'G', 'H', 'I9,I8', 'J', 'K']
I've had a look at some other answers, and I'm thinking a regular expression would be best, but I'm terrible at coming up with them.
1 Answer
#1
49
Lasse is right; this is comma-separated value data, so you should use the csv module. A brief example:
from csv import reader

# test input: any iterable of strings works, one string per line
infile = ['A,B,C,"D12121",E,F,G,H,"I9,I8",J,K']
# real input is probably more like
# infile = open('filename', 'r')
# or use 'with open(...) as infile:' and indent the rest

for line in reader(infile):
    # each line comes back as a list of fields, with the quotes stripped
    print(line)

# for the test input, prints
# ['A', 'B', 'C', 'D12121', 'E', 'F', 'G', 'H', 'I9,I8', 'J', 'K']
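If you only have a single string rather than a file, the same idea can be wrapped in a small helper. This is a minimal sketch assuming Python 3; the names split_csv_line and read_csv_text are made up for illustration and are not part of the csv module. io.StringIO stands in here for a real file, which you would open with newline='' as the csv documentation recommends.

```python
import csv
import io


def split_csv_line(line):
    """Split one comma-separated line, honoring double-quoted fields."""
    # csv.reader accepts any iterable of strings, so wrap the line in a list
    # and take the first (and only) parsed row.
    return next(csv.reader([line]))


def read_csv_text(text):
    """Parse several lines of CSV text into a list of rows."""
    # io.StringIO is a stand-in for open('filename', newline='')
    with io.StringIO(text, newline='') as infile:
        return [row for row in csv.reader(infile)]


fields = split_csv_line('A,B,C,"D12121",E,F,G,H,"I9,I8",J,K')
print(fields)
# ['A', 'B', 'C', 'D12121', 'E', 'F', 'G', 'H', 'I9,I8', 'J', 'K']
```

The quoted field "I9,I8" stays in one piece and the quotation marks are dropped, which is what a plain line.split(',') or a hand-written regular expression tends to get wrong.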