Is it possible to read categorical columns with pandas' read_csv?

I tried passing the dtype parameter to read_csv as dtype={n: pandas.Categorical}, but this does not work properly (the resulting column has dtype object). The documentation is unclear on this.

2 solutions

#1

As of pandas 0.19.0, you can pass dtype='category' to read_csv:

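from io import StringIO
import pandas as pd

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'
# dtype='category' applies to every column, including the numeric-looking col3
df = pd.read_csv(StringIO(data), dtype='category')
print(df)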

  col1 col2  col3
0    a    b     1
1    a    b     2
2    c    d     3

print (df.dtypes)
col1    category
col2    category
col3    category
dtype: object
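
The dtype parameter also accepts a per-column mapping, which is closer to what the question attempted. A minimal sketch, assuming pandas 0.19.0 or later and reusing the sample data above; the choice of which columns to mark as categorical is only illustrative:

```python
from io import StringIO

import pandas as pd

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'

# Request the category dtype only for col1 and col2;
# col3 is left to the parser's normal type inference.
df = pd.read_csv(StringIO(data), dtype={'col1': 'category', 'col2': 'category'})
print(df.dtypes)

# Inspect the categories that were found in col1.
print(list(df['col1'].cat.categories))
```

Note that the mapping uses the string 'category' rather than the pandas.Categorical class from the question.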

#2

Categorical is not a valid dtype.

This * post contains details on how to store categorical data in a text file.
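
On pandas versions where read_csv does not accept a category dtype, one possible workaround is to load the file first and convert the columns afterwards. This is a minimal sketch under that assumption, not necessarily the approach described in the linked post; the sample data and column names are illustrative:

```python
from io import StringIO

import pandas as pd

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'

# Read without any dtype hint.
df = pd.read_csv(StringIO(data))

# Convert the selected columns to the category dtype after loading.
for col in ['col1', 'col2']:
    df[col] = df[col].astype('category')

print(df.dtypes)
```

The drawback is that the columns are held in memory as ordinary strings before the conversion happens.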
