I have a CSV file with 50 columns of data. I am using the pandas read_csv function to pull in a subset of these columns, using the usecols parameter to choose the ones I want:
cols_to_use = [0,1,5,16,8]
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)
The trouble is that df_ret contains the correct columns, but not in the order I specified. They come back in ascending order, i.e. [0,1,5,8,16]. (By the way, the column numbers can change from run to run; this is just an example.) This is a problem because the rest of the code has arrays that are in the "correct" order, and I would rather not have to reorder all of them.
Is there any clever pandas way of pulling in the columns in the order specified? Any help would be much appreciated!
2 Solutions
#1
6
You can reuse the same cols_to_use list to select the columns in the desired order:
df_ret = pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[cols_to_use]
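One caveat worth spelling out: the trailing [cols_to_use] selects by column label, not by position. That lines up when cols_to_use holds header names, or when the file is read with header=None so the labels are the integer positions themselves. For a file with a header row and integer positions, as in the question, one possible sketch is to map each requested position to its slot in the returned frame. The helper name read_csv_in_order and the sample data are made up for illustration, and the sketch assumes that read_csv returns the usecols columns in file order (as the question describes) and that the positions are unique.

```python
import io

import pandas as pd


def read_csv_in_order(source, positions, **kwargs):
    """Read the integer column positions in `positions`, keeping that order.

    Hypothetical helper for illustration, not part of pandas.
    """
    df = pd.read_csv(source, usecols=positions, **kwargs)
    # read_csv returns the selected columns in file order, so the column
    # at slot i of df is the i-th smallest requested position.
    file_order = sorted(positions)
    slots = [file_order.index(p) for p in positions]
    return df.iloc[:, slots]


# Made-up sample data: 20 columns named c0..c19 and one row of values.
header = ",".join(f"c{i}" for i in range(20))
values = ",".join(str(i) for i in range(20))
sample = header + "\n" + values + "\n"

cols_to_use = [0, 1, 5, 16, 8]
df_ret = read_csv_in_order(io.StringIO(sample), cols_to_use, index_col=False)
print(list(df_ret.columns))
```

Selecting with iloc keeps the reordering purely positional, so it does not depend on what the header names happen to be.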
#2
1
Just piggybacking off this question here (hi from 2018).
I ran into the same problem with pandas read_csv and wanted to figure out a way to do the reordering using column header strings. It's as simple as defining a list of strings (index_strings below) to use.
pd.read_csv(filepath, index_col=False, usecols=cols_to_use)[index_strings]
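A small sketch of the header-string variant follows. The column names and sample data are invented for illustration, and here index_strings is assumed to serve as both the usecols list and the reordering list.

```python
import io

import pandas as pd

# Made-up sample data with a header row.
sample = "id,name,score,city\n1,Ann,9.5,Oslo\n2,Bob,7.0,Rome\n"

# Column headers in the order the rest of the code expects.
index_strings = ["city", "id", "score"]

# usecols picks the columns; the trailing [index_strings] restores the order.
df_ret = pd.read_csv(
    io.StringIO(sample), index_col=False, usecols=index_strings
)[index_strings]
print(list(df_ret.columns))
```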