Writing a pandas DataFrame to a CSV file

Translated from: Writing a pandas DataFrame to CSV file

I have a DataFrame in pandas that I would like to write to a CSV file. I am doing this using:

df.to_csv('')

And I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\u03b1' in position 20: ordinal not in range(128)

Is there an easy way to get around this (i.e. I have Unicode characters in my DataFrame)? And is there a way to write to a tab-delimited file instead of a CSV, using e.g. a 'to-tab' method (which I don't think exists)?

Reply #1

Reference: /question/190W9/将pandas-DataFrame写入CSV文件

Reply #2

To delimit by a tab, use the sep argument of to_csv:

df.to_csv(file_name, sep='\t')

To use a specific encoding (e.g. 'utf-8'), use the encoding argument:

df.to_csv(file_name, sep='\t', encoding='utf-8')
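As a quick way to check that sep and encoding behave as described, here is a minimal round-trip sketch. The sample data, the temporary directory, and the out.tsv file name are assumptions made up for illustration; they are not from the original question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data containing a non-ASCII character (Greek alpha).
df = pd.DataFrame({"name": ["\u03b1-helix", "beta"], "count": [1, 2]})

# Hypothetical output path inside a temporary directory.
file_name = os.path.join(tempfile.mkdtemp(), "out.tsv")

# Tab-delimited and UTF-8 encoded, as in the answer above.
df.to_csv(file_name, sep="\t", encoding="utf-8")

# Read the file back with the same separator and encoding.
# index_col=0 turns the written index column back into the index.
round_trip = pd.read_csv(file_name, sep="\t", encoding="utf-8", index_col=0)
print(round_trip)
```

Reading the file back with the same sep and encoding is the simplest way to confirm that the non-ASCII characters survived the write.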

Reply #3

You can sometimes run into these problems even when you specify UTF-8 encoding. I recommend specifying the encoding when you read the file and using the same encoding when you write it. This might solve your problem.
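A minimal sketch of that advice, assuming the input happens to be Latin-1 encoded. The ENCODING constant, the file names, and the sample row are hypothetical stand-ins for real data.

```python
import os
import tempfile

import pandas as pd

ENCODING = "latin-1"  # assumed encoding of the hypothetical source file

tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "source.csv")  # hypothetical input file
dst_path = os.path.join(tmp_dir, "copy.csv")    # hypothetical output file

# Create a small Latin-1 encoded file to stand in for real input data.
with open(src_path, "w", encoding=ENCODING, newline="") as fh:
    fh.write("brand,units\nRegenexx\u00ae,3\n")

# Read and write with the same explicit encoding.
df = pd.read_csv(src_path, encoding=ENCODING)
df.to_csv(dst_path, encoding=ENCODING)

# Inspect the raw bytes to see how the registered sign was stored.
with open(dst_path, "rb") as fh:
    raw = fh.read()
print(raw)
```

Naming the encoding in one place and passing it to both calls keeps the two sides from drifting apart.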

Reply #4

If you are having trouble encoding to 'utf-8' and are willing to go cell by cell, you could try the following.

Python 2

(Where "df" is your DataFrame object.)

for column in df.columns:
    for idx in df[column].index:
        x = df.get_value(idx,column)
        try:
            # Decoding with the default ASCII codec and errors='ignore' drops non-ASCII characters.
            x = unicode(x.encode('utf-8','ignore'),errors ='ignore') if type(x) == unicode else unicode(str(x),errors='ignore')
            df.set_value(idx,column,x)
        except Exception:
            print 'encoding error: {0} {1}'.format(idx,column)
            df.set_value(idx,column,'')
            continue

Then try:

df.to_csv(file_name)

You can check the type of the values in each column with:

for column in df.columns:
    print '{0} {1}'.format(str(type(df[column][0])),str(column))

Warning: errors='ignore' will simply omit the character, e.g.:

IN: unicode('Regenexx\xae',errors='ignore')
OUT: u'Regenexx'
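For comparison, the same effect can be seen in Python 3 by decoding bytes with different error handlers. This is only a sketch; treating the byte 0xAE as Latin-1 is an assumption about where such data might have come from.

```python
# The byte 0xAE is not valid UTF-8 on its own.
# Assumption: the data was originally Latin-1, where 0xAE is the registered sign.
raw = b"Regenexx\xae"

ignored = raw.decode("utf-8", errors="ignore")    # silently drops the bad byte
replaced = raw.decode("utf-8", errors="replace")  # inserts U+FFFD in its place
latin1 = raw.decode("latin-1")                    # decodes the byte as U+00AE

for label, value in [("ignore", ignored), ("replace", replaced), ("latin-1", latin1)]:
    print(label, ascii(value))
```

With errors='ignore' the loss is silent, so errors='replace' or the correct source codec may be a safer choice when the dropped characters matter.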

Python 3

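for column in df.columns:
    # Cast to object first so that string values can be assigned cell by cell.
    df[column] = df[column].astype(object)
    for idx in df[column].index:
        # df.at replaces get_value/set_value, which were removed in pandas 1.0.
        x = df.at[idx, column]
        try:
            x = x if type(x) == str else str(x).encode('utf-8','ignore').decode('utf-8','ignore')
            df.at[idx, column] = x
        except Exception:
            print('encoding error: {0} {1}'.format(idx,column))
            df.at[idx, column] = ''
            continue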

Reply #5

When you store a DataFrame object in a CSV file using the to_csv method, you probably won't need to store the index label that precedes each row of the DataFrame.

You can avoid that by passing False to the index parameter.

For example:

df.to_csv(file_name, encoding='utf-8', index=False)

So if your DataFrame object looks like this:

  Color  Number
0   red     22
1  blue     10

the CSV file will contain:

Color,Number
red,22
blue,10

instead of the following, which is what you get with the default value index=True:

,Color,Number
0,red,22
1,blue,10
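The two outputs above can be reproduced with a short sketch. It relies on to_csv returning the CSV text as a string when no path is given; the variable names are chosen only for illustration.

```python
import pandas as pd

# The same two-row frame used in the example above.
df = pd.DataFrame({"Color": ["red", "blue"], "Number": [22, 10]})

# With no path argument, to_csv returns the CSV text instead of writing a file.
with_index = df.to_csv()                # default: index=True
without_index = df.to_csv(index=False)  # index column omitted

print(with_index)
print(without_index)
```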

Reply #6

This may not be the answer for this case, but since I got the same error message with .to_csv, I tried .toCSV('') and the error message was different ("'SparseDataFrame' object has no attribute 'toCSV'"). So the problem was solved by converting the DataFrame to a dense DataFrame:

df.to_dense().to_csv("", index = False, sep=',', encoding='utf-8')
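SparseDataFrame and DataFrame.to_dense() are no longer available in recent pandas releases (1.0 and later). A rough equivalent, assuming every column of the frame is sparse, is the .sparse accessor. The sample frame below is hypothetical.

```python
import pandas as pd

# Hypothetical frame in which every column is stored as a sparse array.
df = pd.DataFrame({
    "a": pd.arrays.SparseArray([0, 0, 1]),
    "b": pd.arrays.SparseArray([0.0, 2.5, 0.0]),
})

# The .sparse accessor is only available when all columns are sparse.
dense = df.sparse.to_dense()

# With no path argument, to_csv returns the CSV text as a string.
csv_text = dense.to_csv(index=False, sep=",")
print(csv_text)
```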
