I have a DataFrame in which some rows and columns sum to 0.
   A  B  C  D
0  1  1  0  1
1  0  0  0  0
2  1  0  0  1
3  0  1  0  0
4  1  1  0  1
The end result should be:
   A  B  D
0  1  1  1
2  1  0  1
3  0  1  0
4  1  1  1
Notice that the rows and columns that contained only zeros have been removed.
3 Answers
#1
10
df.loc[row_indexer, column_indexer] allows you to select rows and columns using boolean masks:
In [88]: df.loc[(df.sum(axis=1) != 0), (df.sum(axis=0) != 0)]
Out[88]:
   A  B  D
0  1  1  1
2  1  0  1
3  0  1  0
4  1  1  1
[4 rows x 3 columns]
df.sum(axis=1) != 0 is True if and only if the row does not sum to 0.
df.sum(axis=0) != 0 is True if and only if the column does not sum to 0.
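As a rough sketch of the above, the two masks can be built separately and inspected before they are passed to df.loc. The frame is the one from the question; the names row_mask, col_mask and result are only illustrative.

```python
import pandas as pd

# The example frame from the question
df = pd.DataFrame({'A': [1, 0, 1, 0, 1],
                   'B': [1, 0, 0, 1, 1],
                   'C': [0, 0, 0, 0, 0],
                   'D': [1, 0, 1, 0, 1]})

# Illustrative names, not part of the original answer
row_mask = df.sum(axis=1) != 0   # one boolean per row, indexed by row label
col_mask = df.sum(axis=0) != 0   # one boolean per column, indexed by column name

# Keep only the rows and columns whose mask value is True
result = df.loc[row_mask, col_mask]
print(result)
```

Here row_mask is False only for row 1 and col_mask is False only for column C, so those are the ones dropped.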
#2
2
Building on "Drop rows with all zeros in pandas data frame", this avoids using sum():
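import pandas as pd

df = pd.DataFrame({'A': [1,0,1,0,1],
                   'B': [1,0,0,1,1],
                   'C': [0,0,0,0,0],
                   'D': [1,0,1,0,1]})
# Keep rows and columns that contain at least one non-zero value
# (axis is passed by keyword rather than positionally)
df.loc[(df != 0).any(axis=1), (df != 0).any(axis=0)]
   A  B  D
0  1  1  1
2  1  0  1
3  0  1  0
4  1  1  1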
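One case where avoiding sum() matters is data with negative values: a row can sum to 0 without being all zeros. The sketch below uses a made-up frame (df_neg, by_sum and by_any are hypothetical names, and the data is not from the question) to compare the two masks.

```python
import pandas as pd

# Hypothetical data: row 0 holds 1 and -1, so it sums to 0 without being all zeros
df_neg = pd.DataFrame({'A': [1, 0, 2],
                       'B': [-1, 0, 0]})

# Sum-based mask: drops row 0 as well as the all-zero row 1
by_sum = df_neg.loc[df_neg.sum(axis=1) != 0]

# any-based mask: drops only the all-zero row 1
by_any = df_neg.loc[(df_neg != 0).any(axis=1)]

print(by_sum)
print(by_any)
```

For the 0/1 data in the question both approaches give the same result.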
#3
0
This is my way to do it:
import pandas as pd

df = pd.read_csv("my.csv")

# Collect the names of the columns whose values do not sum to 0
hl = []
for col in df.columns:
    if df[col].sum() != 0:
        hl.append(col)
df2 = df[hl]
To write the reduced data:
df2.to_csv("my_reduced_data.csv")
This only checks columns; it ignores rows.
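If rows should be filtered too, one possible variation is sketched below. It is an assumption-laden example, not the answer's own code: an in-memory string (csv_text) stands in for "my.csv", the output goes to a buffer instead of "my_reduced_data.csv", and index=False is used on the assumption that the row labels are not wanted in the output.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for "my.csv"
csv_text = "A,B,C,D\n1,1,0,1\n0,0,0,0\n1,0,0,1\n0,1,0,0\n1,1,0,1\n"
df = pd.read_csv(io.StringIO(csv_text))

# Boolean frame marking the non-zero cells
nonzero = df != 0

# Keep rows and columns that have at least one non-zero cell
df2 = df.loc[nonzero.any(axis=1), nonzero.any(axis=0)]

# index=False keeps the row labels out of the written CSV
out = io.StringIO()
df2.to_csv(out, index=False)
print(out.getvalue())
```

Without index=False, to_csv also writes the row labels as an extra first column.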