Grouping a list of tuples in Python

I have a list of tuples that is already sorted by the second item. I want to group the list by that second item and collect the first items of each group into a list.

This is my input:

[('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]

and what I need is this:

[(g1, 2, ['aaa', 'bbb']), (g2, 2, ['ccc', 'ddd']), (g3, 1, ['eee'])]

In each tuple, the first item is an incrementing id, the second is the number of items in the group, and the third is the list of first items that belong to that group. How can this be implemented in Python? I have already tried itertools without success. Any help would be appreciated.

3 solutions

#1

One way would be to do it in steps:

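>>> from itertools import groupby
>>> seq = [('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]
>>> grouped = enumerate(groupby(seq, key=lambda x: x[1]), 1)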
>>> extracted = ((i, [g[0] for g in gg]) for i, (k,gg) in grouped)
>>> final = [(i, len(x), x) for i,x in extracted]
>>> final
[(1, 2, ['aaa', 'bbb']), (2, 2, ['ccc', 'ddd']), (3, 1, ['eee'])]

But even though each line makes sense on its own, I think it's hard to see what it's actually doing. Using a generator function makes everything much clearer:

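from itertools import groupby

def grouper(elems):
    # groupby only merges adjacent items, so elems must already be sorted by x[1]
    grouped = groupby(elems, key=lambda x: x[1])
    for i, (k, group) in enumerate(grouped, 1):
        vals = [g[0] for g in group]  # first item of every tuple in this group
        yield i, len(vals), vals

>>> list(grouper(seq))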
[(1, 2, ['aaa', 'bbb']), (2, 2, ['ccc', 'ddd']), (3, 1, ['eee'])]

(Here I've arbitrarily used an index starting at one for your g1/g2/g3; it would be easy to yield 'g{}'.format(i) in place of i, or something similar.)
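As a rough sketch of that last suggestion, the generator could build the label itself. The names labelled_groups and prefix are made up for this illustration and are not part of the question or of itertools; the input is still assumed to be sorted by the second item.

```python
from itertools import groupby
from operator import itemgetter

def labelled_groups(elems, prefix='g'):
    """Yield (label, count, first_items) for each run of equal second items.

    `labelled_groups` and `prefix` are hypothetical names for this sketch.
    elems must already be sorted by the second item.
    """
    for i, (_, group) in enumerate(groupby(elems, key=itemgetter(1)), 1):
        vals = [item[0] for item in group]
        # Build 'g1', 'g2', ... from the 1-based counter
        yield '{}{}'.format(prefix, i), len(vals), vals

seq = [('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]
result = list(labelled_groups(seq))
print(result)
# [('g1', 2, ['aaa', 'bbb']), ('g2', 2, ['ccc', 'ddd']), ('g3', 1, ['eee'])]
```

The labels here are plain strings; if g1/g2/g3 are meant to be something else, only the first element of the yielded tuple needs to change.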

#2

In [4]: import itertools, operator

In [5]: L = [('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]

In [6]: for key, group in itertools.groupby(L, operator.itemgetter(1)):
   ...:     print(key, list(group))
   ...:     
1 [('aaa', 1), ('bbb', 1)]
2 [('ccc', 2), ('ddd', 2)]
3 [('eee', 3)]

In [7]: answer = []

In [8]: for k,group in itertools.groupby(L, operator.itemgetter(1)):
   ...:     answer.append((k, [g[0] for g in group]))
   ...:     

In [9]: answer
Out[9]: [(1, ['aaa', 'bbb']), (2, ['ccc', 'ddd']), (3, ['eee'])]
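The result above holds (key, names) pairs, which is not quite the shape asked for. A possible way to get there, sketched as a plain script, is to number the groups with enumerate and add the group size. The variable names key and final are assumptions of this sketch, and the input is deliberately unsorted here to show that sorting with the same key first makes groupby safe to use.

```python
import itertools
import operator

# Deliberately unsorted to show why the sort step matters
L = [('ccc', 2), ('aaa', 1), ('eee', 3), ('bbb', 1), ('ddd', 2)]

key = operator.itemgetter(1)
answer = []
# sorted() is stable, so items with equal keys keep their original relative order
for k, group in itertools.groupby(sorted(L, key=key), key):
    answer.append((k, [g[0] for g in group]))

# Add a 1-based id and the group size to match the requested shape
final = [(i, len(names), names) for i, (k, names) in enumerate(answer, 1)]
print(final)
# [(1, 2, ['aaa', 'bbb']), (2, 2, ['ccc', 'ddd']), (3, 1, ['eee'])]
```

If the list is already sorted by the second item, as in the question, the sorted() call can be dropped.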

#3

The collections module solves this easily with defaultdict:

from collections import defaultdict

a = [('aaa', 1), ('bbb', 1), ('ccc', 2), ('ddd', 2), ('eee', 3)]

d = defaultdict(list)
for k, v in a:
    d[v].append(k)  # collect each first item under its second item

print(list(d.items()))
# [(1, ['aaa', 'bbb']), (2, ['ccc', 'ddd']), (3, ['eee'])]
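To get from these (key, names) pairs to the (id, count, names) tuples in the question, one option is sketched below. It assumes the groups should be numbered in ascending key order, which the question does not state outright; the names name, key and final are chosen for this example only. Unlike groupby, this approach does not need the input to be sorted.

```python
from collections import defaultdict

# Not sorted: defaultdict collects matching keys wherever they appear
a = [('aaa', 1), ('ccc', 2), ('bbb', 1), ('eee', 3), ('ddd', 2)]

d = defaultdict(list)
for name, key in a:
    d[key].append(name)

# sorted(d.items()) orders the groups by key; enumerate supplies the 1-based id
final = [(i, len(names), names)
         for i, (key, names) in enumerate(sorted(d.items()), 1)]
print(final)
# [(1, 2, ['aaa', 'bbb']), (2, 2, ['ccc', 'ddd']), (3, 1, ['eee'])]
```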
