I am a Python newbie, and I have been stuck on a rather simple problem. I am looking for the most efficient way to solve it. I have 5 lists, as follows:
a,b,c,d,score
where all of the lists have the same size (500 in my case). a, b, c, d are lists of strings and score is a list of ints.
What I would like to do is sort a, b, c, d based on an ascending or descending sort of score. So I would first sort score in descending order, and then reorder the corresponding elements of a, b, c, d to match the sorted score list.
I was thinking of using enumerate to achieve this, but I am wondering whether itertools could be used here to make it faster and more efficient.
Any guidance on how this can be achieved would be much appreciated, and sorry if this is a 101 question.
2 answers
#1
7
# zip is the Python 3 built-in; on Python 2, itertools.izip is the lazy equivalent
sorted_lists = sorted(zip(a, b, c, d, score), reverse=True, key=lambda x: x[4])
a, b, c, d, score = [[x[i] for x in sorted_lists] for i in range(5)]
In the first step, zip the lists together. This takes the first element from every list and puts them into a tuple, then does the same for the second element in every list, and so on, producing one tuple per position. We then sort these tuples by their fifth element (index 4), which is what the anonymous function passed as the key argument selects. We set reverse=True so that the result is in descending order.
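To make the first step concrete, here is a minimal sketch on made-up data. The three-element lists and the names rows and sorted_rows are assumptions for illustration only; the question's real lists hold 500 items each.

```python
# Hypothetical sample data, much smaller than the lists in the question.
a = ["a0", "a1", "a2"]
b = ["b0", "b1", "b2"]
c = ["c0", "c1", "c2"]
d = ["d0", "d1", "d2"]
score = [17, 42, 5]

# Zipping pairs up the elements at each position into one tuple per position.
rows = list(zip(a, b, c, d, score))

# Sort the tuples by their fifth element (the score), largest first.
sorted_rows = sorted(rows, key=lambda row: row[4], reverse=True)

for row in sorted_rows:
    print(row)
```

Each tuple keeps its a, b, c, d values attached to its score, which is why sorting the tuples keeps all five lists in step.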
In the second step, we split the lists back out using a nested list comprehension and tuple unpacking. We make a new list of lists, where inner list i holds element i of every tuple in sorted_lists. You could do this in one line as below, but I think splitting it into two pieces may be a bit clearer:
# note: a, b, c, d and score end up as tuples here, not lists
a, b, c, d, score = zip(*sorted(zip(a, b, c, d, score), reverse=True,
                                key=lambda x: x[4]))
Here is a generic function that returns the sorted lists as tuples, in the same order as the input lists:
def sort_lists_by(lists, key_list=0, desc=False):
    # key_list is the index of the list to sort by; desc=True sorts descending
    return list(zip(*sorted(zip(*lists), reverse=desc,
                            key=lambda x: x[key_list])))
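As a usage sketch, the variant below assumes Python 3 and returns real lists rather than tuples. The sample names and scores are made up for illustration, and converting each column to a list is a choice of this sketch rather than part of the original answer.

```python
def sort_lists_by(lists, key_list=0, desc=False):
    """Sort several parallel lists by the list at index key_list."""
    # Group the elements by position, then sort the groups by the chosen column.
    rows = sorted(zip(*lists), key=lambda row: row[key_list], reverse=desc)
    # Transpose back so there is one list per original input list.
    return [list(column) for column in zip(*rows)]


# Hypothetical sample data.
names = ["alice", "bob", "carol"]
score = [17, 42, 5]

names_sorted, score_sorted = sort_lists_by([names, score], key_list=1, desc=True)
print(names_sorted)
print(score_sorted)
```

Note that with empty input lists the function returns an empty list, so unpacking the result into several names would fail in that case.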
#2
2
If you are doing a lot of numerical work or array manipulation, it might be worth looking into numpy. This problem is easily solved with a numpy structured array:
In [1]: import numpy as np
In [2]: a = ['hi','hello']
In [3]: b = ['alice','bob']
In [4]: c = ['foo','bar']
In [5]: d = ['spam','eggs']
In [6]: score = [42,17]
From this, make a list of tuples in the format (a, b, c, d, score) and store it with the dtype (str, str, str, str, int). You can even give the fields names ('a', 'b', 'c', 'd', 'score') to access them later. Here 'S5' is a fixed-width byte string of at most 5 bytes, so longer values would be truncated. The output shown appears to come from Python 2; on Python 3 the 'S5' fields are displayed as bytes, for example b'hi'.
In [7]: data = np.array(list(zip(a,b,c,d,score)),  # list() is required on Python 3
   ...:                 dtype = [('a','S5'),('b','S5'),('c','S5'),('d','S5'),('score',int)]
   ...:                 )
In [8]: data
Out[8]:
array([('hi', 'alice', 'foo', 'spam', 42),
('hello', 'bob', 'bar', 'eggs', 17)],
dtype=[('a', 'S5'), ('b', 'S5'), ('c', 'S5'), ('d', 'S5'), ('score', '<i8')])
The advantage of this array is that you can access all the 'lists' (fields) by name:
In [9]: data['a']
Out[9]:
array(['hi', 'hello'],
dtype='|S5')
In [10]: data['score']
Out[10]: array([42, 17])
To sort, just give the name of the field you want to sort by (the result is in ascending order):
In [11]: sdata = np.sort(data, order='score')
In [12]: sdata
Out[12]:
array([('hello', 'bob', 'bar', 'eggs', 17),
('hi', 'alice', 'foo', 'spam', 42)],
dtype=[('a', 'S5'), ('b', 'S5'), ('c', 'S5'), ('d', 'S5'), ('score', '<i8')])
In [13]: sdata['b']
Out[13]:
array(['bob', 'alice'],
dtype='|S5')
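The question asks for descending order, while np.sort only sorts ascending. A minimal sketch of two ways to get a descending result follows, assuming Python 3 and using 'U5' (unicode) fields instead of 'S5' so the strings print without a b prefix. The variable names ascending, descending and descending_alt are made up for illustration.

```python
import numpy as np

a = ["hi", "hello"]
b = ["alice", "bob"]
c = ["foo", "bar"]
d = ["spam", "eggs"]
score = [42, 17]

# Build the structured array; list() is needed because zip is lazy on Python 3.
data = np.array(
    list(zip(a, b, c, d, score)),
    dtype=[("a", "U5"), ("b", "U5"), ("c", "U5"), ("d", "U5"), ("score", int)],
)

# Option 1: sort ascending by score, then reverse the result.
ascending = np.sort(data, order="score")
descending = ascending[::-1]

# Option 2: argsort the negated scores and index the array with that order.
order = np.argsort(-data["score"], kind="stable")
descending_alt = data[order]

print(descending["b"])
print(descending_alt["score"])
```

The two options can differ when scores are tied: reversing also reverses the order of tied rows, while the stable argsort on negated scores keeps tied rows in their original order.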