How to eliminate pair duplicates in a Pandas DataFrame?

Date: 2022-07-24 07:36:20

After reading most of the questions related to pair duplicates, I found that none of them addresses the following issue:

Given a DataFrame:

   Letter
0   a
1   b
2   c
3   d
4   a
5   b
6   a
7   a
8   a

Eliminate only pairs of duplicates. For example, since the DataFrame has five a's, the solution is to eliminate the first two pairs of a's and keep the last a (order is important). The two b's are both eliminated because they form a complete pair. The resulting DataFrame would look like this:

   Letter
2   c
3   d
8   a
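
To make the example reproducible, the input frame can be built as follows. This is only a sketch: the variable names `df` and `counts` are assumptions, and the per-letter counts are printed just to show which letters occur an odd number of times and therefore leave one row behind.

```python
import pandas as pd

# Rebuild the example frame from the question (index 0..8).
df = pd.DataFrame({"Letter": list("abcdabaaa")})

# Count occurrences per letter; an odd count means one row is left over
# after all complete pairs are removed.
counts = df["Letter"].value_counts()
odd_letters = sorted(counts[counts % 2 == 1].index)

print(counts.to_dict())
print(odd_letters)
```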

I hope the issue is clear. Thanks!

1 answer

#1


0  

You can first drop the letters that occur an even number of times, then use drop_duplicates with keep="last" so that only the last occurrence of each remaining letter survives.

df.groupby('Letter').filter(lambda x: len(x)%2>0).drop_duplicates(keep="last")
Out[174]: 
  Letter
2      c
3      d
8      a
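
For anyone who wants to run this end to end, here is a self-contained sketch of the same approach, followed by a variant that avoids the per-group lambda. The variant is not from the answer above; it is an assumed alternative that combines `groupby(...).transform("size")` with `DataFrame.duplicated(keep="last")`, and the names `size`, `is_last`, `result` and `result_alt` are illustrative only.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({"Letter": list("abcdabaaa")})

# Approach from the answer: keep only groups with an odd number of rows,
# then keep the last occurrence of each remaining letter.
result = (
    df.groupby("Letter")
      .filter(lambda g: len(g) % 2 == 1)
      .drop_duplicates(keep="last")
)

# Assumed alternative: build two boolean masks instead of filtering groups.
size = df.groupby("Letter")["Letter"].transform("size")  # group size per row
is_last = ~df.duplicated("Letter", keep="last")           # last row of each letter
result_alt = df[(size % 2 == 1) & is_last]

print(result)
print(result_alt.equals(result))
```

Both versions keep the original index, so the surviving rows can still be traced back to their positions in the input.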
