I have two lists of elements that look like this:
a=[['10', 'name_1'],['50','name_2'],['40','name_3'], ..., ['80', 'name_N']]
b=[(10,40),(40,60),(60,90),(90,100)]
a contains a set of data, and b defines some intervals. My aim is to create a list c with as many lists as there are intervals in b. Each list in c contains all the elements x of a for which x[0] falls within the corresponding interval. For example:
c=[
[['10', 'name_1']],
[['50','name_2'],['40','name_3']],
[...,['80', 'name_N']]
]
4 solutions
#1
1
You can use collections.defaultdict and the bisect module here.
As the ranges are contiguous, it is better to first convert the list b into something like this:
[10, 40, 60, 90, 100]
The advantage of this is that we can now use the bisect module to find the index where each item from a fits in. For example, 50 falls between 40 and 60, so bisect.bisect_right returns 2 in this case. Now we can use this 2 as a key and store the item in the list kept as its value. This way we group the items by the index returned from bisect.bisect_right.
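To make the index behaviour concrete, here is a small sketch of what bisect.bisect_right returns for a few sample values, including ones that sit exactly on a boundary or outside every interval. The name boundaries and the sample values are chosen only for illustration.

```python
import bisect

# Boundaries taken from the question's list b, flattened and sorted.
boundaries = [10, 40, 60, 90, 100]

# bisect_right returns the insertion point to the right of any equal entry,
# so a value sitting exactly on a boundary lands in the interval that starts there.
indices = {value: bisect.bisect_right(boundaries, value)
           for value in (5, 10, 40, 50, 80, 100)}
print(indices)
```

Index 0 means the value is below the first boundary and index len(boundaries) means it is at or above the last one, so both fall outside every interval. Indices 1 to 4 correspond to the half-open intervals [10, 40), [40, 60), [60, 90) and [90, 100).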
With
L_b = 2 * len(b)
L_a = len(a)
L_b1 = len(b1)
the overall complexity is going to be max(L_b log L_b, L_a log L_b1): sorting the boundaries, plus one binary search per item in a.
>>> import bisect
>>> from collections import defaultdict
>>> a = [['10', 'name_1'], ['50', 'name_2'], ['40', 'name_3'], ['80', 'name_N']]
>>> b = [(10, 40), (40, 60), (60, 90), (90, 100)]
>>> b1 = sorted(set(z for x in b for z in x))
>>> b1
[10, 40, 60, 90, 100]
>>> dic = defaultdict(list)
>>> for x, y in a:
...     # Find the index where the value fits in b1. bisect uses
...     # binary search, so this is an O(log n) step.
...     # Use the returned index as the key and append the item to that key.
...     ind = bisect.bisect_right(b1, int(x))
...     dic[ind].append([x, y])
...
>>> list(dic.values())
[[['10', 'name_1']], [['50', 'name_2'], ['40', 'name_3']], [['80', 'name_N']]]
As dicts are not ordered by key, sort the keys to get sorted output:
>>> [dic[k] for k in sorted(dic)]
[[['10', 'name_1']], [['50', 'name_2'], ['40', 'name_3']], [['80', 'name_N']]]
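The question asks for as many lists in c as there are intervals in b, whereas the dictionary above only has entries for intervals that received at least one item. A minimal sketch of one way to keep the empty intervals too, assuming the intervals are sorted, contiguous and half-open; group_by_interval is a hypothetical helper name, not part of the answer above.

```python
import bisect

def group_by_interval(items, intervals):
    """Return one list per interval, in order, keeping empty intervals.

    Assumes the intervals are non-empty, sorted, contiguous and
    half-open: [low, high).
    """
    # Flatten [(10, 40), (40, 60), ...] into [10, 40, 60, ...].
    boundaries = [intervals[0][0]] + [high for _, high in intervals]
    groups = [[] for _ in intervals]
    for item in items:
        # Position 0 means "below the first boundary"; len(intervals) + 1
        # means "at or above the last boundary". Both are outside every interval.
        position = bisect.bisect_right(boundaries, int(item[0]))
        if 1 <= position <= len(intervals):
            groups[position - 1].append(item)
    return groups

a = [['10', 'name_1'], ['50', 'name_2'], ['40', 'name_3'], ['80', 'name_N']]
b = [(10, 40), (40, 60), (60, 90), (90, 100)]
c = group_by_interval(a, b)
print(c)
```

With the sample data, the last interval (90, 100) has no matching items, so c ends with an empty list instead of dropping that interval.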
#2
1
c = []
for r in b:
    l = []
    rn = range(*r)
    for element in a:
        if int(element[0]) in rn:
            l.append(element)
    c.append(l)
If your intervals are extremely large, consider using xrange instead of range (Python 2). Actually, if your intervals are even moderately large, consider the following, which compares against the endpoints directly instead of building a range.
c = []
for r in b:
    l = []
    for element in a:
        if r[0] <= int(element[0]) < r[1]:
            l.append(element)
    c.append(l)
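The same endpoint comparison can also be written as a nested list comprehension. This is only a sketch of an equivalent form, using the sample data from the question; it still checks every element against every interval, so the work grows with len(a) * len(b).

```python
a = [['10', 'name_1'], ['50', 'name_2'], ['40', 'name_3'], ['80', 'name_N']]
b = [(10, 40), (40, 60), (60, 90), (90, 100)]

# One inner list per interval; the chained comparison treats each
# interval as half-open, [low, high), just like the loop version.
c = [[element for element in a if low <= int(element[0]) < high]
     for low, high in b]
print(c)
```

Like the loops, this keeps an empty list for any interval that matches nothing, so c always has as many lists as b has intervals.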
#3
0
You could do this:
>>> a=[['10', 'name_1'],['50','name_2'],['40','name_3'], ['80', 'name_N']]
>>> b=[(10,40),(40,60),(60,90),(90,100)]
>>> c=[]
>>> for t in b:
...     f = list(filter(lambda l: t[0] <= int(l[0]) < t[1], a))
...     if f: c.append(f)
...
>>> c
[[['10', 'name_1']], [['50', 'name_2'], ['40', 'name_3']], [['80', 'name_N']]]
#4
0
Or you could do this:
>>> a=[['10', 'name_1'],['50','name_2'],['40','name_3'], ['80', 'name_N']]
>>> b=[(10,40),(40,60),(60,90),(90,100)]
>>> [f for f in (list(filter(lambda l: t[0] <= int(l[0]) < t[1], a)) for t in b) if f]
[[['10', 'name_1']], [['50', 'name_2'], ['40', 'name_3']], [['80', 'name_N']]]