Keeping columns in the specified order when using usecols in pandas read_csv

Time: 2021-08-17 20:29:49

I have a csv file with 50 columns of data. I am using Pandas read_csv function to pull in a subset of these columns, using the usecols parameter to choose the ones I want:

cols_to_use = [0,1,5,16,8]
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)

The trouble is df_ret contains the correct columns, but not in the order I specified. They are in ascending order, so [0,1,5,8,16]. (By the way the column numbers can change from run to run, this is just an example.) This is a problem because the rest of the code has arrays which are in the "correct" order and I would rather not have to reorder all of them.

Is there any clever pandas way of pulling in the columns in the order specified? Any help would be much appreciated!

2 Solutions

#1


6  

you can reuse the same cols_to_use list for selecting columns in desired order:

df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[cols_to_use]
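
A minimal sketch of this approach, assuming a headerless file read with header=None (the in-memory csv_text is a hypothetical stand-in for the real file). read_csv ignores the order of usecols, so the trailing [cols_to_use] is what does the reordering. It works here because, without a header, the kept columns are labelled by their original integer positions. If the file has a header row, the labels are the header strings instead, and indexing with integers would raise a KeyError.

```python
import io
import pandas as pd

# Hypothetical headerless CSV standing in for the real file.
csv_text = "10,11,12,13\n20,21,22,23\n"

cols_to_use = [3, 0, 2]

# usecols only selects the columns; the trailing [cols_to_use]
# puts them in the requested order. With header=None the labels
# are the original integer positions, so the same list works.
df_ret = pd.read_csv(
    io.StringIO(csv_text), header=None, usecols=cols_to_use
)[cols_to_use]

print(df_ret.columns.tolist())
```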

#2


1  

Just piggybacking off this question here (hi from 2018).

I ran into the same problem with pandas read_csv and wanted a way to do the reordering using column header strings. It's as simple as defining a list of header strings, in the order you want, and indexing with it.

pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[index_strings]
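
The answer does not show where index_strings comes from. One possible way to build it, assuming the file has a header row and cols_to_use holds integer positions, is to read only the header first and translate positions into names. The csv_text data and the header variable below are illustrative, not part of the original answer.

```python
import io
import pandas as pd

# Hypothetical CSV with a header row, standing in for the real file.
csv_text = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"

cols_to_use = [3, 0, 2]

# nrows=0 parses just the header line, giving the full list of names.
header = pd.read_csv(io.StringIO(csv_text), nrows=0).columns

# Translate the integer positions into header strings, keeping the order.
index_strings = [header[i] for i in cols_to_use]

# usecols picks the columns; index_strings puts them in the wanted order.
df_ret = pd.read_csv(
    io.StringIO(csv_text), index_col=False, usecols=cols_to_use
)[index_strings]

print(df_ret.columns.tolist())
```

This reads the header line twice, which is cheap compared with reordering the arrays used by the rest of the code.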
