Pandas: Extracting specific selected columns from a DataFrame into a new DataFrame [duplicate]

This question already has an answer here:

Selecting columns in a pandas dataframe (9 answers)

I have a pandas DataFrame with 4 columns and I want to create a new DataFrame that has only three of them. This question is similar to "Extracting specific columns from a data frame", but for pandas rather than R. The following code does not work (it raises an error) and is certainly not the pandasnic way to do it.

import pandas as pd
old = pd.DataFrame({'A' : [4,5], 'B' : [10,20], 'C' : [100,50], 'D' : [-30,-50]})
new = pd.DataFrame(zip(old.A, old.C, old.D)) # raises TypeError: data argument can't be an iterator

What is the pandasnic way to do it?

1 Solution

#1

There is a way to do this, and it actually looks similar to R:

new = old[['A', 'C', 'D']].copy()

Here you are selecting the columns you want from the original DataFrame by passing a list of column names, and assigning the result to a new variable. If you plan to modify the new DataFrame at all, you'll probably want to use .copy() to avoid a SettingWithCopyWarning.
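
To make the effect of the copy concrete, here is a minimal sketch using the DataFrame from the question. The replacement value 999 is only an illustration; the point is that a later assignment on the new DataFrame does not touch the original.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# Select three columns with a list of labels, then take an explicit copy.
new = old[['A', 'C', 'D']].copy()

# Modify the copy (999 is an arbitrary illustrative value).
new.loc[0, 'A'] = 999

print(new.columns.tolist())  # ['A', 'C', 'D']
print(old.loc[0, 'A'])       # 4, the original is unchanged
```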

An alternative method is to use filter which will create a copy by default:

new = old.filter(['A', 'C', 'D'], axis=1)
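
One practical difference worth knowing, sketched below under the assumption of a reasonably recent pandas version: filter keeps only the listed labels that actually exist and skips the rest, whereas bracket selection with a missing label raises a KeyError. The label 'Z' is a made-up column name used only to show this.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# 'Z' is a hypothetical label that does not exist in `old`;
# filter skips it instead of raising.
new = old.filter(['A', 'C', 'D', 'Z'], axis=1)
print(new.columns.tolist())

# Bracket selection with the same list is stricter and raises KeyError.
try:
    old[['A', 'C', 'D', 'Z']]
    raised = False
except KeyError:
    raised = True
print(raised)
```

If a misspelled column name should fail loudly, the bracket form is probably the safer choice.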

Finally, depending on the number of columns in your original dataframe, it might be more succinct to express this using a drop (this will also create a copy by default):

new = old.drop('B', axis=1)
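
As a small sketch of the same idea, drop also accepts a columns= keyword, which some find easier to read than axis=1. The check at the end shows that the original DataFrame keeps all four columns, since drop returns a new object unless told otherwise.

```python
import pandas as pd

old = pd.DataFrame({'A': [4, 5], 'B': [10, 20], 'C': [100, 50], 'D': [-30, -50]})

# Equivalent to old.drop('B', axis=1), using the columns= keyword.
new = old.drop(columns=['B'])

print(new.columns.tolist())  # ['A', 'C', 'D']
print(old.columns.tolist())  # ['A', 'B', 'C', 'D']
```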
