I want to count the rows by conditions on multiple columns and remove the rows by the conditions.
I want to count the rows in which v1 and v2 have the same pair of values. My df has 2 rows with (v1 = 0, v2 = 30), 1 row each with (v1 = 0, v2 = 15) and (v1 = 0, v2 = 20), 2 rows with (v1 = 15, v2 = 10), and 3 rows with (v1 = 10, v2 = 10). Then I want to remove the rows whose (v1, v2) combination does not occur exactly 2 times; in this case, remove (v1 = 0, v2 = 15), (v1 = 0, v2 = 20), and (v1 = 10, v2 = 10).
df
ID v1 v2
1 0 30
1 15 10
1 0 30
1 0 15
1 0 20
1 15 10
1 10 10
1 10 10
1 10 10
expected output
ID v1 v2
1 0 30
1 0 30
1 15 10
1 15 10
I group by the values first, but I am not sure what condition I should write for the removal.
df[df.groupby(['ID', 'v_1', 'v_2'])]
I hope I have explained this clearly.
Thanks!
1 solution
#1
2
IIUC
df[df.duplicated(keep=False)]
Out[29]:
ID v1 v2
0 1 0 30
1 1 15 10
2 1 0 30
5 1 15 10
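One caveat: duplicated(keep=False) flags every row that belongs to any repeated combination, so on the df shown in the question it would also keep the three (v1 = 10, v2 = 10) rows. If the goal is to keep only combinations that occur exactly twice, one possible approach is to compare each row's group size to 2. This is a minimal sketch and not part of the original answer; the DataFrame is rebuilt from the table in the question, and the names group_size and result are only illustrative.

```python
import pandas as pd

# DataFrame reconstructed from the table in the question (assumed default integer index).
df = pd.DataFrame({
    "ID": [1] * 9,
    "v1": [0, 15, 0, 0, 0, 15, 10, 10, 10],
    "v2": [30, 10, 30, 15, 20, 10, 10, 10, 10],
})

# Number of rows in each (ID, v1, v2) group, broadcast back onto every row.
group_size = df.groupby(["ID", "v1", "v2"])["v1"].transform("size")

# Keep only the rows whose combination occurs exactly twice.
result = df[group_size == 2]
print(result)
```

Using transform("size") keeps the original row order and index, so the filter can be applied as an ordinary boolean mask. Changing the comparison (for example to >= 2) would reproduce the behaviour of duplicated(keep=False).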