Python built-in functions: the key parameter of sorted

Date: 2022-07-29 19:18:33

Using the key parameter in x.sort and sorted

Introduction

In Python, lists come with their own sorting method, sort, which sorts the list in place.

>>> l = [1, 3, 2]
>>> l.sort()
>>> l
[1, 2, 3]

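For other containers such as dicts, tuples, and sets, use the built-in function sorted. Note that the result is always a list, and that a dict is sorted by its keys by default.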

>>> # tuple sort
>>> t = (1, 3, 2)
>>> sorted(t)
[1, 2, 3]
>>>
>>> # set sort
>>> s = {1, 3, 2}
>>> sorted(s)
[1, 2, 3]
>>>
>>> # dict sort
>>> d = {1:100, 3:200, 2: 0}
>>> sorted(d)
[1, 2, 3]
>>> sorted(d.values())
[0, 100, 200]
>>> sorted(d.items())
[(1, 100), (2, 0), (3, 200)]
>>>
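The difference between the two can be sketched as follows: list.sort() reorders the list in place and returns None, while sorted() accepts any iterable, leaves it untouched, and returns a new list. The variable names here are illustrative only.

```python
# list.sort() mutates the list and returns None.
numbers = [1, 3, 2]
result = numbers.sort()
print(result)   # None
print(numbers)  # [1, 2, 3]

# sorted() accepts any iterable and returns a new list;
# the original dict keeps its insertion order.
scores = {1: 100, 3: 200, 2: 0}
keys_sorted = sorted(scores)
print(keys_sorted)   # [1, 2, 3]
print(list(scores))  # [1, 3, 2]
```

A common slip is writing `numbers = numbers.sort()`, which rebinds the name to None.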

Using the key parameter

First, look at the documentation for sorted:

>>> help(sorted)
Help on built-in function sorted in module builtins:

sorted(iterable, /, *, key=None, reverse=False)
    Return a new list containing all items from the iterable in ascending order.

    A custom key function can be supplied to customize the sort order, and the
    reverse flag can be set to request the result in descending order.
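As a minimal sketch of the two keyword-only parameters in that signature, the example below sorts a small, made-up list of words three ways. The list and variable names are assumptions for illustration.

```python
words = ["banana", "Fig", "apple"]

# Default order compares the strings directly, so uppercase letters sort first.
default_order = sorted(words)
print(default_order)     # ['Fig', 'apple', 'banana']

# key=str.lower compares lowercased copies; the original strings are returned.
case_insensitive = sorted(words, key=str.lower)
print(case_insensitive)  # ['apple', 'banana', 'Fig']

# key=len compares lengths, and reverse=True requests descending order.
longest_first = sorted(words, key=len, reverse=True)
print(longest_first)     # ['banana', 'apple', 'Fig']
```

In each case the key function only decides the order; the elements in the result are the original objects.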

The key parameter takes a function and enables custom sort orders. Let's look at a scenario that uses key: a payroll for a group of employees.

Name    Salary  Age
Tom     4000    20
Jerry   4000    24
Bob     5000    28

We want to sort by salary from highest to lowest and, when salaries are equal, by age from youngest to oldest.

>>> salary_list = [
... ("Tom", 4000, 20),
... ("Jerry", 4000, 24),
... ("Bob", 5000, 28),
... ]
>>> # Custom key function; returns the tuple (-salary, age)
>>> def mycmp(salary_item: tuple) -> (int, int):
... return (-salary_item[1], salary_item[2])
...
>>> sorted(salary_list, key=mycmp)
[('Bob', 5000, 28), ('Tom', 4000, 20), ('Jerry', 4000, 24)]
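Negating a field only works when that field is numeric. One alternative, sketched here under the assumption that two passes are acceptable, relies on Python's sort being stable: sort by the secondary criterion first, then by the primary one. The intermediate name by_age is illustrative.

```python
from operator import itemgetter

salary_list = [
    ("Tom", 4000, 20),
    ("Jerry", 4000, 24),
    ("Bob", 5000, 28),
]

# Pass 1: sort by the secondary criterion (age, ascending).
by_age = sorted(salary_list, key=itemgetter(2))

# Pass 2: sort by the primary criterion (salary, descending).
# The sort is stable, so equal salaries keep their age order from pass 1.
by_salary_then_age = sorted(by_age, key=itemgetter(1), reverse=True)

print(by_salary_then_age)
# [('Bob', 5000, 28), ('Tom', 4000, 20), ('Jerry', 4000, 24)]
```

The result matches the single-pass version with the (-salary, age) key, at the cost of sorting twice.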

So why does this work?

Let's implement a class, CmpObject, with debug output in some of its magic methods, to see how sorted performs its comparisons.

class CmpObject:
    def __init__(self, name, val):
        self.name = name
        self.val = val

    def __neg__(self):
        print("called __neg__", self.name, self.val)
        return CmpObject(self.name, -self.val)

    def __eq__(self, other):
        print("called __eq__")
        return self.get_val() == other.get_val()

    def __lt__(self, other):
        print("called __lt__")
        return self.get_val() < other.get_val()

    def __gt__(self, other):
        print("called __gt__")
        return self.get_val() > other.get_val()

    def get_val(self):
        print("called get_val", self.name, self.val)
        return self.val

    def __repr__(self):
        return f"{self.val}"

>>> # Initialize the payroll
>>> salary_list = [
... ("Tom", CmpObject("Tom", 4000), CmpObject("Tom", 20)),
... ("Jerry", CmpObject("Jerry", 4000), CmpObject("Jerry", 24)),
... ("Bob", CmpObject("Bob", 5000), CmpObject("Bob", 28)),
... ]
>>> salary_list
[('Tom', 4000, 20), ('Jerry', 4000, 24), ('Bob', 5000, 28)]
>>> # Use the same mycmp key function as before
>>> def mycmp(salary_item: tuple) -> (int, int):
... return (-salary_item[1], salary_item[2])
...
>>> # Run the sort
>>> sorted(salary_list, key=mycmp)
# Walk through the comparison order
# sorted first computes and stores the comparison key for every element
called __neg__ Tom 4000
called __neg__ Jerry 4000
called __neg__ Bob 5000
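# First compare -4000 and -4000, i.e. Jerry's and Tom's salaries. Because we sort by salary in descending order (highest first), the negated salaries are compared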
called __eq__
called get_val Jerry -4000
called get_val Tom -4000
# Jerry's and Tom's salaries are equal, so their ages are compared next
called __eq__
called get_val Jerry 24
called get_val Tom 20
# Jerry's age (24) is greater than Tom's (20), so Jerry goes after Tom
called __lt__
called get_val Jerry 24
called get_val Tom 20
# Next compare Bob's and Jerry's salaries: Bob earns more than Jerry, so Jerry goes after Bob
called __eq__
called get_val Bob -5000
called get_val Jerry -4000
called __lt__
called get_val Bob -5000
called get_val Jerry -4000
# Bob and Jerry are compared once more
called __eq__
called get_val Bob -5000
called get_val Jerry -4000
called __lt__
called get_val Bob -5000
called get_val Jerry -4000
# Finally compare Bob and Tom: Bob earns more than Tom, so Tom goes after Bob
called __eq__
called get_val Bob -5000
called get_val Tom -4000
called __lt__
called get_val Bob -5000
called get_val Tom -4000
# Final result: Bob, Tom, Jerry
[('Bob', 5000, 28), ('Tom', 4000, 20), ('Jerry', 4000, 24)]

Conclusion

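1. sorted orders the elements by comparing their keys. When the key is a tuple, its elements carry decreasing weight from left to right: with (-salary, age), age is compared only when -salary is equal.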

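2. Comparisons use __eq__ and __lt__. As the trace shows, __eq__ is called first at each position of the tuple; if it returns true, the next position is compared. At the first position where __eq__ returns false, __lt__ is called: if it returns true, the two items swap order; otherwise they stay as they are.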
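Both conclusions can be sketched in plain Python. The helper tuple_less_than below is a hypothetical, simplified model of how two tuple keys are compared, not the interpreter's actual implementation, and logged_key is an illustrative key function that records each call.

```python
def tuple_less_than(a, b):
    """Simplified model of `a < b` for two tuples (illustration only)."""
    for x, y in zip(a, b):
        if x == y:          # equal at this position: move on to the next one
            continue
        return x < y        # the first differing position decides the order
    return len(a) < len(b)  # all shared positions equal: shorter tuple is smaller


tom_key = (-4000, 20)
jerry_key = (-4000, 24)
bob_key = (-5000, 28)

print(tuple_less_than(tom_key, jerry_key))  # True: salaries tie, 20 < 24
print(tuple_less_than(bob_key, tom_key))    # True: -5000 < -4000

# The key function runs once per element, before any comparison happens.
salary_list = [("Tom", 4000, 20), ("Jerry", 4000, 24), ("Bob", 5000, 28)]
calls = []


def logged_key(item):
    calls.append(item[0])
    return (-item[1], item[2])


ranked = sorted(salary_list, key=logged_key)
print(len(calls))  # 3
print(ranked)
# [('Bob', 5000, 28), ('Tom', 4000, 20), ('Jerry', 4000, 24)]
```

The three key calls correspond to the three __neg__ lines at the top of the trace.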