Possible Duplicate:
How to read a CSV line with "?
I have seen a number of related questions but none have directly addressed what I am trying to do. I am reading in lines of text from a CSV file.
All the items are in quotes, and some have additional commas within the quotes. I would like to split the line on commas but ignore the commas within quotes. Is there a way to do this in Python that does not require a number of regex statements?
An example is:
"114111","Planes,Trains,and Automobiles","50","BOOK"
which I would like parsed into four separate values:
"114111" "Planes,Trains,and Automobiles" "50" "BOOK"
Is there a simple option in line.split() that I am missing?
2 Solutions
#1
31
Don't try to re-invent the wheel.
If you want to read lines from a CSV file, use Python's csv module from the standard library.
Example:
> cat test.py
import csv

with open('some.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
> cat some.csv
"114111","Planes,Trains,and Automobiles","50","BOOK"
> python test.py
['114111', 'Planes,Trains,and Automobiles', '50', 'BOOK']
[]
Job done!
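Since the question is about a line that has already been read, here is a minimal sketch of the same idea applied to a single string. csv.reader accepts any iterable of strings, so the line can be wrapped in a list. The helper name split_csv_line and the four variable names are hypothetical, chosen only for illustration.

```python
import csv


def split_csv_line(line):
    """Parse one CSV-formatted string into a list of fields (hypothetical helper)."""
    # csv.reader takes any iterable of strings; a one-element list is enough.
    return next(csv.reader([line]))


line = '"114111","Planes,Trains,and Automobiles","50","BOOK"'

# The variable names are assumptions; the question does not name the columns.
item_id, title, amount, category = split_csv_line(line)
print(title)
```

The quotes are removed from each field and the commas inside the quoted title are kept, which is what line.split(',') cannot do on its own.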
#2
-5
You can probably split on "," (that is, [quote][comma][quote]).
The other option is to come up with an escape character: if somebody wants to embed a comma in the string they write \c, and if they want a backslash they write \\. You then have to split the string and unescape each piece before processing.
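For comparison, the [quote][comma][quote] split could be sketched as below. This is only an illustration under strong assumptions: every field is quoted, and no field contains the sequence "," or a doubled quote. The function name split_fully_quoted is hypothetical.

```python
def split_fully_quoted(line):
    """Split a line on "," assuming every field is wrapped in double quotes."""
    line = line.rstrip("\r\n")
    if len(line) < 2 or line[0] != '"' or line[-1] != '"':
        raise ValueError("expected a line whose fields are all quoted")
    # Drop the outermost quotes, then split on quote-comma-quote.
    return line[1:-1].split('","')


line = '"114111","Planes,Trains,and Automobiles","50","BOOK"'
print(split_fully_quoted(line))
```

This handles the sample line, but it breaks as soon as a field is unquoted or contains an escaped quote, which is why the csv module is the safer choice.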