How to split a line on commas but ignore commas inside quotes in Python [duplicate]

Date: 2020-12-27 21:43:14

Possible Duplicate:
How to read a CSV line with "?

I have seen a number of related questions but none have directly addressed what I am trying to do. I am reading in lines of text from a CSV file.

All the items are in quotes, and some have additional commas within the quotes. I would like to split the line on commas but ignore the commas within quotes. Is there a way to do this in Python that does not require a number of regex statements?

An example is:

"114111","Planes,Trains,and Automobiles","50","BOOK"

which I would like parsed into 4 separate values:

"114111"  "Planes,Trains,and Automobiles"  "50"  "BOOK"

Is there a simple option in line.split() that I am missing?

2 Answers

#1


31  

Don't try to re-invent the wheel.

If you want to read lines from a CSV file, use Python's csv module from the standard library.

Example:

> cat test.py
import csv
with open('some.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
> cat some.csv
"114111","Planes,Trains,and Automobiles","50","BOOK"

> python test.py
['114111', 'Planes,Trains,and Automobiles', '50', 'BOOK']
[]

Job done!
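
If the line is already in memory as a string rather than in a file, the same module still applies, since csv.reader accepts any iterable of strings. A minimal sketch using the line from the question (the variable names item_id, title, quantity and kind are made up for illustration):

```python
import csv

line = '"114111","Planes,Trains,and Automobiles","50","BOOK"'

# Wrap the single line in a list so csv.reader can iterate over it,
# then take the first (and only) parsed row.
fields = next(csv.reader([line]))
print(fields)  # ['114111', 'Planes,Trains,and Automobiles', '50', 'BOOK']

# Unpack into four separate variables, as the question asks.
item_id, title, quantity, kind = fields
```

Note that the values come back as strings with the surrounding quotes removed.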

#2


-5  

You can probably split on "," (that is, [quote][comma][quote]).
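
A rough sketch of that idea, assuming every field on the line is quoted and no field itself contains the three-character sequence quote-comma-quote:

```python
line = '"114111","Planes,Trains,and Automobiles","50","BOOK"'

# Drop the opening and closing quote, then split on the
# quote-comma-quote separator between fields.
fields = line.strip()[1:-1].split('","')
print(fields)  # ['114111', 'Planes,Trains,and Automobiles', '50', 'BOOK']
```

This breaks as soon as a field is left unquoted or contains an escaped quote, which is why it is more fragile than a real CSV parser.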

The other option is to come up with an escape character: if somebody wants to embed a comma in a string they write \c, and if they want a backslash they write \\. You then split the string on commas and unescape each piece before processing.
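
One way such an escaping scheme might look, as a sketch only. The helper names escape, unescape, join_fields and split_fields are hypothetical, and this only works if you control how the file is written:

```python
def escape(field):
    # Escape backslashes first, so the backslash added for commas
    # is not doubled afterwards.
    return field.replace("\\", "\\\\").replace(",", "\\c")


def unescape(field):
    # Scan left to right so that an escaped backslash followed by a
    # literal 'c' is not mistaken for an escaped comma.
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append("," if nxt == "c" else nxt)
        else:
            out.append(ch)
    return "".join(out)


def join_fields(fields):
    # After escaping, no field contains a bare comma.
    return ",".join(escape(f) for f in fields)


def split_fields(line):
    # Every remaining comma is a separator, so a plain split is safe.
    return [unescape(part) for part in line.split(",")]


row = ["114111", "Planes,Trains,and Automobiles", "50", "BOOK"]
encoded = join_fields(row)
print(encoded)
print(split_fields(encoded) == row)
```

Unescaping is done with a single pass rather than chained str.replace calls, because chained replacements can misread a literal backslash followed by the letter c.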
