Combine two lists of dicts, adding the values together

I want to combine two lists of dicts into a new list of dicts: a dict that appears in only one list is appended to the result, and the 'views' values are added together when the same dict appears in both.

a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]

b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]

And the desired output would be:

c = [{'title': 'Learning How to Program', 'views': 8,'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 5,'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]

I found "Is there any pythonic way to combine two dicts (adding values for keys that appear in both)?", but I do not understand how to get the desired output in my situation, where I have two lists of multiple dicts.

7 Solutions

#1

You need to convert your input dictionaries to (title: count) pairs, using them as keys and values in a Counter; then after summing, you can convert these back to your old format:

from collections import Counter

summed = sum((Counter({elem['title']: elem['views']}) for elem in a + b), Counter())
c = [{'title': title, 'views': counts} for title, counts in summed.items()]

Demo:

>>> from collections import Counter
>>> a = [{'title': 'Learning How to Program', 'views': 1},
...      {'title': 'Mastering Programming', 'views': 3}]
>>> b = [{'title': 'Learning How to Program', 'views': 7},
...      {'title': 'Mastering Programming', 'views': 2},
...      {'title': 'Programming Fundamentals', 'views': 1}]
>>> summed = sum((Counter({elem['title']: elem['views']}) for elem in a + b), Counter())
>>> summed
Counter({'Learning How to Program': 8, 'Mastering Programming': 5, 'Programming Fundamentals': 1})
>>> [{'title': title, 'views': counts} for title, counts in summed.items()]
[{'views': 8, 'title': 'Learning How to Program'}, {'views': 5, 'title': 'Mastering Programming'}, {'views': 1, 'title': 'Programming Fundamentals'}]

The goal here is to have a unique identifier per count. If your dictionaries are more complex, you either need to convert the whole dictionary (minus the count) to a unique identifier, or pick one of the values from the dictionary to be that identifier. Then sum the view counts per identifier.
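
As a rough sketch of the first option (turning the whole dictionary, minus the count, into the identifier), something like the following could work. The helper name `identity_key` and its `count_field` parameter are made up for illustration, and the approach assumes every value in the dictionaries is hashable:

```python
def identity_key(entry, count_field="views"):
    """Build a hashable identifier from every field except the count.

    Hypothetical helper; assumes all values in `entry` are hashable.
    """
    return frozenset((k, v) for k, v in entry.items() if k != count_field)


a = [{'title': 'Learning How to Program', 'views': 1, 'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 3, 'url': '/7XqR', 'slug': 'mastering-programming'}]
b = [{'title': 'Learning How to Program', 'views': 7, 'url': '/4XvR', 'slug': 'learning-how-to-program'},
     {'title': 'Mastering Programming', 'views': 2, 'url': '/7XqR', 'slug': 'mastering-programming'},
     {'title': 'Programming Fundamentals', 'views': 1, 'url': '/93hB', 'slug': 'programming-fundamentals'}]

merged = {}
for entry in a + b:
    key = identity_key(entry)
    if key not in merged:
        # First time this identifier is seen: keep a copy of the entry.
        merged[key] = dict(entry)
    else:
        # Seen before: only the view count changes.
        merged[key]['views'] += entry['views']

c = list(merged.values())
```

Two entries only merge if every field other than 'views' matches exactly, so this is stricter than keying on a single value such as the URL.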

From your updated example, the URL would be a good identifier. That'd let you collect the view count in place:

per_url = {}
for entry in a + b:
    key = entry['url']
    if key not in per_url:
        per_url[key] = entry.copy()
    else:
        per_url[key]['views'] += entry['views']

c = per_url.values()  # use list(per_url.values()) on Python 3

This simply uses the dictionaries themselves (or at least a copy of the first one encountered) to sum the view counts:

>>> from pprint import pprint
>>> a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]
>>> b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
...      {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]
>>> per_url = {}
>>> for entry in a + b:
...     key = entry['url']
...     if key not in per_url:
...         per_url[key] = entry.copy()
...     else:
...         per_url[key]['views'] += entry['views']
... 
>>> per_url
{'/93hB': {'url': '/93hB', 'title': 'Programming Fundamentals', 'slug': 'programming-fundamentals', 'views': 1}, '/4XvR': {'url': '/4XvR', 'title': 'Learning How to Program', 'slug': 'learning-how-to-program', 'views': 8}, '/7XqR': {'url': '/7XqR', 'title': 'Mastering Programming', 'slug': 'mastering-programming', 'views': 5}}
>>> pprint(per_url.values())
[{'slug': 'programming-fundamentals',
  'title': 'Programming Fundamentals',
  'url': '/93hB',
  'views': 1},
 {'slug': 'learning-how-to-program',
  'title': 'Learning How to Program',
  'url': '/4XvR',
  'views': 8},
 {'slug': 'mastering-programming',
  'title': 'Mastering Programming',
  'url': '/7XqR',
  'views': 5}]

#2

First, you need to convert each input list into a single dict, for example:

b = {'Learning How to Program': 7,
     'Mastering Programming': 2,
     'Programming Fundamentals': 1}

After that, apply the solution you found, then convert it back to list of dicts.
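
A minimal sketch of that round trip, assuming the title is the key and using Counter addition as the "solution you found" (the helper name `to_counts` is made up for illustration):

```python
from collections import Counter

a = [{'title': 'Learning How to Program', 'views': 1},
     {'title': 'Mastering Programming', 'views': 3}]
b = [{'title': 'Learning How to Program', 'views': 7},
     {'title': 'Mastering Programming', 'views': 2},
     {'title': 'Programming Fundamentals', 'views': 1}]


def to_counts(entries):
    """Collapse a list of dicts into a {title: views} Counter (hypothetical helper)."""
    counts = Counter()
    for entry in entries:
        counts[entry['title']] += entry['views']
    return counts


# Counter addition sums the values of keys that appear in both operands.
summed = to_counts(a) + to_counts(b)

# Convert back to the original list-of-dicts shape.
c = [{'title': title, 'views': views} for title, views in summed.items()]
```

Note that only 'title' and 'views' survive this conversion, and that Counter addition drops entries whose total is zero or negative.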

#3

Here's a simple one. It walks over all entries, copies an entry the first time it is encountered, and adds the views on subsequent encounters:

summary = {}
for entry in a + b:
    key = entry['url']
    if key not in summary:
        summary[key] = entry.copy()
    else:
        summary[key]['views'] += entry['views']
c = list(summary.values())

#4

It may not be the most pythonic solution:

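def coalesce(d1, d2):
    # Copy each dict so the input lists are not modified in place.
    combined = [dict(i) for i in d1]
    for d in d2:
        found = False
        for itr in combined:
            if itr['title'] == d['title']:
                # Same title already present: add the view counts.
                itr['views'] += d['views']
                found = True
                break
        if not found:
            # New title: append a copy to the result.
            combined.append(dict(d))
    return combined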

#5

Non-optimal, but works:

>>> from collections import Counter
>>> from pprint import pprint
>>> a = [{'title': 'Learning How to Program', 'views': 1,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 3,'url': '/7XqR', 'slug': 'mastering-programming'}]
>>> b = [{'title': 'Learning How to Program', 'views': 7,'url': '/4XvR', 'slug': 'learning-how-to-program'},
...      {'title': 'Mastering Programming', 'views': 2,'url': '/7XqR', 'slug': 'mastering-programming'},
...      {'title': 'Programming Fundamentals', 'views': 1,'url': '/93hB', 'slug': 'programming-fundamentals'}]
>>> summed = sum((Counter({x['slug']: x['views']}) for x in a+b), Counter())
>>> c = dict()
>>> _ = [c.update({x['slug']: x}) for x in a + b]
>>> _ = [c[x].update({'views': summed[x]}) for x in c.keys()]
>>> pprint(c.values())
[{'slug': 'mastering-programming',
  'title': 'Mastering Programming',
  'url': '/7XqR',
  'views': 5},
 {'slug': 'programming-fundamentals',
  'title': 'Programming Fundamentals',
  'url': '/93hB',
  'views': 1},
 {'slug': 'learning-how-to-program',
  'title': 'Learning How to Program',
  'url': '/4XvR',
  'views': 8}]

This builds on Martijn's Counter idea, with extra passes that attach the other attributes to the summed counts, assuming those attributes are identical for entries that share a slug.

Note that the generator expression and the list comprehensions hide several loops; the comprehensions are used only for their side effects.

#6

A simple function that does what you need for any given number of lists:

import itertools
from collections import Counter, OrderedDict

def sum_views(*lists):
    views = Counter()
    docs = OrderedDict()  # to preserve input order
    for doc in itertools.chain(*lists):
        slug = doc['slug']
        views[slug] += doc['views']
        docs[slug] = dict(doc)   # shallow copy of original dict
        docs[slug]['views'] = views[slug]
    return list(docs.values())  # a real list on both Python 2 and 3

#7

If you don't want to hard-code the key names "title" and "views", a more general way to write it is:

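def combing(x):
    result = {}
    for i in x:
        # Relies on dict insertion order (Python 3.7+):
        # the first value is the key, the second is the amount.
        h = list(i.values())
        result[h[0]] = result.get(h[0], 0) + h[1]
    return result

combing([{'item': 'item1', 'amount': 400},
         {'item': 'item2', 'amount': 300},
         {'item': 'item1', 'amount': 750}])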
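
If relying on the position of the values feels fragile, one possible variant passes the field names in explicitly. This is only a sketch; the function name `combine` and the parameters `key_field` and `value_field` are made up for illustration:

```python
def combine(entries, key_field, value_field):
    """Sum `value_field` per distinct `key_field` (hypothetical helper)."""
    totals = {}
    for entry in entries:
        key = entry[key_field]
        totals[key] = totals.get(key, 0) + entry[value_field]
    return totals


orders = [{'item': 'item1', 'amount': 400},
          {'item': 'item2', 'amount': 300},
          {'item': 'item1', 'amount': 750}]

totals = combine(orders, 'item', 'amount')
# totals == {'item1': 1150, 'item2': 300}

# Convert back to a list of dicts if that shape is needed.
as_list = [{'item': item, 'amount': amount} for item, amount in totals.items()]
```

Because the fields are looked up by name, this does not depend on the order in which the keys were written in each dict.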
