I have the following list:
[('mail', 167, datetime.datetime(2010, 9, 29)) ,
('name', 1317, datetime.datetime(2011, 12, 12)),
('mail', 1045, datetime.datetime(2010, 8, 13)),
('name', 3, datetime.datetime(2011, 11, 3))]
I want to remove every item whose first element matches that of another tuple but whose date is not the latest for that element. In other words, I need to get this:
[('mail', 167, datetime.datetime(2010, 9, 29)) ,
('name', 1317, datetime.datetime(2011, 12, 12))]
5 answers
#1
14
You can use a dictionary to store the highest value found for a given key so far:
temp = {}
for key, number, date in input_list:
    if key not in temp:  # we see this key for the first time
        temp[key] = (key, number, date)
    else:
        if temp[key][2] < date:  # the new date is larger than the old one
            temp[key] = (key, number, date)
result = list(temp.values())  # list() gives a real list on Python 3 as well
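For reference, here is a self-contained run of this approach on the list from the question. It is a minimal sketch: the function name `latest_per_key` is chosen here for illustration and is not part of the original answer.

```python
import datetime


def latest_per_key(items):
    """Return one tuple per first element, keeping the one with the latest date."""
    latest = {}
    for key, number, date in items:
        # Store the tuple if the key is new or this date is more recent.
        if key not in latest or latest[key][2] < date:
            latest[key] = (key, number, date)
    return list(latest.values())


input_list = [
    ('mail', 167, datetime.datetime(2010, 9, 29)),
    ('name', 1317, datetime.datetime(2011, 12, 12)),
    ('mail', 1045, datetime.datetime(2010, 8, 13)),
    ('name', 3, datetime.datetime(2011, 11, 3)),
]

result = latest_per_key(input_list)
print(result)
```

This makes a single pass over the list and does not need to sort it, so the input order does not matter.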
#2
2
The following approach uses a dictionary to overwrite entries with the same key. Since the list is sorted by the date, older entries get overwritten by newer ones.
temp = {}
for v in sorted(L, key=lambda t: t[2]):  # where L is your list
    temp[v[0]] = v  # later (newer) entries overwrite earlier ones
result = list(temp.values())
Or, for something a lot more compact (but much less readable):
result = list({v[0]: v for v in sorted(L, key=lambda t: t[2])}.values())
Update
This method would be reasonably quick if the list is already (or mostly) sorted by date. If it isn't, and especially if it is a large list, then this may not be the best approach.
For unsorted lists, you will likely get some performance improvement by sorting by the key first, then the date, i.e. sorted(L, key=lambda t: (t[0], t[2])).
Or, better yet, go for Space_C0wb0y's answer.
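Once the list is sorted by key and then by date, the tuples for each key sit next to each other with the newest one last. One way to exploit that, sketched here as an assumption and not taken from the answer itself, is `itertools.groupby`; the function name `latest_by_groupby` is hypothetical.

```python
import datetime
from itertools import groupby
from operator import itemgetter


def latest_by_groupby(items):
    # Sort by (key, date) so each key's tuples are adjacent, oldest to newest.
    ordered = sorted(items, key=itemgetter(0, 2))
    # Within each group the last tuple carries the latest date.
    return [list(group)[-1] for _, group in groupby(ordered, key=itemgetter(0))]


L = [
    ('mail', 167, datetime.datetime(2010, 9, 29)),
    ('name', 1317, datetime.datetime(2011, 12, 12)),
    ('mail', 1045, datetime.datetime(2010, 8, 13)),
    ('name', 3, datetime.datetime(2011, 11, 3)),
]

result = latest_by_groupby(L)
print(result)
```

The result comes out ordered by key. Like the dictionary version, this still pays for a full sort, so it is mainly a matter of taste.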
#3
0
d = {}
for item in input_list:
    # key on the first element only; keep the (number, date) pair with the latest date
    if item[0] not in d or item[2] > d[item[0]][1]:
        d[item[0]] = (item[1], item[2])
result = [(key, number, date) for key, (number, date) in d.items()]
#4
0
You can do it by sorting the list by date (the third element) in descending order and keeping the first item found for each key:
In [26]: d
Out[26]:
[('mail', 167, datetime.datetime(2010, 9, 29, 0, 0)),
('name', 1317, datetime.datetime(2011, 12, 12, 0, 0)),
('mail', 1045, datetime.datetime(2010, 8, 13, 0, 0)),
('name', 3, datetime.datetime(2011, 11, 3, 0, 0))]
In [27]: d.sort(key = lambda i: i[2], reverse=True)
In [28]: d
Out[28]:
[('name', 1317, datetime.datetime(2011, 12, 12, 0, 0)),
('name', 3, datetime.datetime(2011, 11, 3, 0, 0)),
('mail', 167, datetime.datetime(2010, 9, 29, 0, 0)),
('mail', 1045, datetime.datetime(2010, 8, 13, 0, 0))]
In [29]: [i for pos, i in enumerate(d) if i[0] not in [j[0] for j in d[:pos]]]
Out[29]:
[('name', 1317, datetime.datetime(2011, 12, 12, 0, 0)),
('mail', 167, datetime.datetime(2010, 9, 29, 0, 0))]
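Testing each item against a slice of the list rescans the list for every element. A variant of the same sort-descending idea, sketched here under the assumption that the first elements are hashable, tracks the keys seen so far in a set. The function name `latest_first_seen` is hypothetical.

```python
import datetime


def latest_first_seen(items):
    seen = set()
    result = []
    # Newest first, so the first tuple met for each key is the one to keep.
    for item in sorted(items, key=lambda t: t[2], reverse=True):
        if item[0] not in seen:
            seen.add(item[0])
            result.append(item)
    return result


d = [
    ('mail', 167, datetime.datetime(2010, 9, 29)),
    ('name', 1317, datetime.datetime(2011, 12, 12)),
    ('mail', 1045, datetime.datetime(2010, 8, 13)),
    ('name', 3, datetime.datetime(2011, 11, 3)),
]

print(latest_first_seen(d))
```

Unlike the in-place `d.sort(...)`, `sorted` leaves the original list untouched, and keys that occur only once are kept as well.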
#5
-1
Here you go.
#!/usr/bin/python2
from pprint import pprint
import datetime
ol = [('mail', 167, datetime.datetime(2010, 9, 29)) ,
('name', 1317, datetime.datetime(2011, 12, 12)),
('mail', 1045, datetime.datetime(2010, 8, 13)),
('name', 3, datetime.datetime(2011, 11, 3))]
d = {}
for t in sorted(ol, key=lambda t: (t[0], t[2])):
    d[t[0]] = t  # for each key, the newest tuple is written last
out = list(d.values())
pprint(out)
That sorts the list using the first and third tuple elements as keys, then removes duplicates by using a hash table.