Drop all columns whose values are all zero

Time: 2020-12-20 12:33:57

I have a simple question which relates to similar questions here, and here.

I am trying to drop all columns of a pandas DataFrame that contain only zeroes (checked vertically, down each column, i.e. dropping along axis=1). Let me give you an example:

import pandas as pd

df = pd.DataFrame({'a':[0,0,0,0], 'b':[0,-1,0,1]})

    a   b
0   0   0
1   0  -1
2   0   0
3   0   1

I'd like to drop column a since it has only zeroes.

However, I'd like to do it in a clean, vectorized fashion if possible. My data set is huge, so I don't want to loop. Hence I tried:

df = df.loc[(df != 0).any(axis=1), (df != 0).any(axis=0)]

    b
1  -1
3   1

This allows me to drop both columns and rows. But if I try to drop only the columns, loc seems to fail. Any ideas?

3 solutions

#1


4  

If the test is whether a column contains any nonzero value, rather than whether it sums to zero, use df.any:

In [291]: df.T[df.any()].T
Out[291]: 
   b
0  0
1 -1
2  0
3  1

Alternatively:

In [296]: df.T[(df != 0).any()].T # or df.loc[:, (df != 0).any()]
Out[296]: 
   b
0  0
1 -1
2  0
3  1
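The two forms agree on the example above, but as far as I know they can differ once a column contains NaN: df.any() skips missing values by default, whereas NaN != 0 evaluates to True. A small sketch, where column c and the names kept_any and kept_ne are made up for illustration:

```python
import numpy as np
import pandas as pd

# 'c' is a hypothetical column holding a zero and a missing value.
df = pd.DataFrame({'a': [0, 0], 'b': [0, -1], 'c': [0.0, np.nan]})

# df.any() ignores NaN (skipna=True by default), so 'c' counts as all zero.
kept_any = list(df.loc[:, df.any()].columns)

# NaN != 0 is True, so the explicit comparison keeps 'c'.
kept_ne = list(df.loc[:, (df != 0).any()].columns)

print(kept_any)
print(kept_ne)
```

Which behaviour is wanted depends on whether a missing value should count as "not zero" in your data.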

#2


5  

You are really close. Use any; 0 is cast to False:

df = df.loc[:, df.any()]
print(df)

   b
0  0
1 -1
2  0
3  1
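To see why this works, it can help to look at the boolean mask on its own. A minimal sketch using the example frame from the question (the names mask and result are mine, not part of the answer):

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 0, 0, 0], 'b': [0, -1, 0, 1]})

# any() reduces each column to a single boolean:
# True if at least one value in the column is nonzero.
mask = df.any()

# Using the mask as the column indexer keeps only the columns marked True;
# the ':' keeps every row.
result = df.loc[:, mask]

print(mask.to_dict())
print(list(result.columns))
```

Because the row indexer is a plain ':', no rows are dropped, which is the difference from the attempt in the question.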

#3


4  

In [73]: df.loc[:, df.ne(0).any()]
Out[73]:
   b
0  0
1 -1
2  0
3  1

or:

In [71]: df.loc[:, ~df.eq(0).all()]
Out[71]:
   b
0  0
1 -1
2  0
3  1

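If we want to keep the columns that do NOT sum to 0 (a different test: a column such as [0, -1, 0, 1] contains nonzero values but sums to 0 and would be dropped, so the output shown here corresponds to b = [0, 1, 0, 1]):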

In [78]: df.loc[:, df.sum().astype(bool)]
Out[78]:
   b
0  0
1  1
2  0
3  1
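The sum test and the nonzero test are not interchangeable. A minimal sketch using the frame from the question, where column b holds nonzero values that cancel out (the names by_any and by_sum are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 0, 0, 0], 'b': [0, -1, 0, 1]})

# Keep columns that contain at least one nonzero value.
by_any = df.loc[:, df.ne(0).any()]

# Keep columns whose sum is nonzero; -1 and 1 cancel, so 'b' is dropped too.
by_sum = df.loc[:, df.sum().astype(bool)]

print(list(by_any.columns))
print(list(by_sum.columns))
```

If the data can contain negative values, the any-based mask is probably the safer choice for "all values are zero".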
