Why are default arguments evaluated at definition time in Python?

Time: 2021-06-28 05:27:42

I had a very difficult time understanding the root cause of a problem in an algorithm. Then, by simplifying the functions step by step, I found out that the evaluation of default arguments in Python doesn't behave as I expected.

The code is as follows:

class Node(object):
    def __init__(self, children = []):
        self.children = children

The problem is that every instance of the Node class shares the same children list if the argument is not given explicitly. For example:

>>> n0 = Node()
>>> n1 = Node()
>>> id(n1.children)
Out[0]: 25000176
>>> id(n0.children)
Out[0]: 25000176
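
The identical ids mean that both instances hold the very same list object, so a change made through one instance is visible through the other. A minimal sketch of that effect, reusing the Node class from the question (the appended string is only illustrative):

```python
class Node:
    def __init__(self, children=[]):
        # The default list was created once, when the def statement ran.
        self.children = children

n0 = Node()
n1 = Node()

n0.children.append("child of n0")

print(n1.children)                 # ['child of n0']
print(n0.children is n1.children)  # True
```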

I don't understand the logic of this design decision. Why did the Python designers decide that default arguments are to be evaluated at definition time? This seems very counter-intuitive to me.

8 answers

#1


38  

The alternative would be quite heavyweight -- storing "default argument values" in the function object as "thunks" of code to be executed over and over again every time the function is called without a specified value for that argument -- and would make it much harder to get early binding (binding at def time), which is often what you want. For example, in Python as it exists:

def ack(m, n, _memo={}):
  key = m, n
  if key not in _memo:
    if m==0: v = n + 1
    elif n==0: v = ack(m-1, 1)
    else: v = ack(m-1, ack(m, n-1))
    _memo[key] = v
  return _memo[key]

...writing a memoized function like the above is quite an elementary task. Similarly:

for i in range(len(buttons)):
  buttons[i].onclick(lambda i=i: say('button %s', i))

...the simple i=i, relying on the early-binding (definition time) of default arg values, is a trivially simple way to get early binding. So, the current rule is simple, straightforward, and lets you do all you want in a way that's extremely easy to explain and understand: if you want late binding of an expression's value, evaluate that expression in the function body; if you want early binding, evaluate it as the default value of an arg.
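
The two binding times can be compared side by side. In this minimal sketch the names late and early are illustrative and not part of the answer; the loop variable is read inside the body in one case and captured as a default value in the other:

```python
# A closure looks up the variable when it is called (late binding).
late = [lambda: i for i in range(3)]

# A default argument captures the value when the lambda is created (early binding).
early = [lambda i=i: i for i in range(3)]

print([f() for f in late])   # [2, 2, 2]
print([f() for f in early])  # [0, 1, 2]
```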

The alternative, forcing late binding in both situations, would not offer this flexibility, and would force you to jump through hoops (such as wrapping your function into a closure factory) every time you needed early binding, as in the above examples -- yet more heavyweight boilerplate forced on the programmer by this hypothetical design decision (beyond the "invisible" costs of generating and repeatedly evaluating thunks all over the place).
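
For comparison, this is roughly what the closure-factory route looks like. It is only a sketch: make_handler and handler are hypothetical names, and the handlers return a string instead of calling the answer's say function so that the example is self-contained.

```python
def make_handler(i):
    # i is bound when make_handler is called, so each handler keeps its own value.
    def handler():
        return "button %s" % i
    return handler

handlers = [make_handler(i) for i in range(3)]

print([h() for h in handlers])  # ['button 0', 'button 1', 'button 2']
```

It works, but it takes an extra function and an extra level of nesting to get what i=i expresses in three characters.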

In other words, "There should be one, and preferably only one, obvious way to do it [1]": when you want late binding, there's already a perfectly obvious way to achieve it (since all of the function's code is only executed at call time, obviously everything evaluated there is late-bound); having default-arg evaluation produce early binding gives you an obvious way to achieve early binding as well (a plus!-) rather than giving TWO obvious ways to get late binding and no obvious way to get early binding (a minus!-).

[1]: "Although that way may not be obvious at first unless you're Dutch."

#2


10  

The issue is this.

It's too expensive to evaluate a function as an initializer every time the function is called.

  • 0 is a simple literal. Evaluate it once, use it forever.

  • int is a function (like list) that would have to be evaluated each time it's required as an initializer.

The construct [] is a literal, like 0, that means "this exact object".

The problem is that some people hope that it means list, as in "evaluate this function for me, please, to get the object that is the initializer".

It would be a crushing burden to add the necessary if statement to do this evaluation all the time. It's better to take all arguments as literals and not do any additional function evaluation as part of trying to do a function evaluation.

Also, more fundamentally, it's technically impossible to implement argument defaults as function evaluations.

Consider for a moment the recursive horror of this kind of circularity. Let's say that instead of default values being literals, we allow them to be functions which are evaluated each time a parameter's default value is required.

[This would parallel the way collections.defaultdict works.]

def aFunc( a=another_func ):
    return a*2

def another_func( b=aFunc ):
    return b*3

What is the value of another_func()? To get the default for b, it must evaluate aFunc, which requires an eval of another_func. Oops.

#3


7  

Of course, in your situation it is difficult to understand. But you must see that evaluating default args every time would lay a heavy runtime burden on the system.

You should also know that this problem can occur with container types -- but you can circumvent it by making the thing explicit:

def __init__(self, children=None):
    if children is None:
        children = []
    self.children = children

#4


7  

The workaround for this, discussed here (and very solid), is:

class Node(object):
    def __init__(self, children = None):
        self.children = [] if children is None else children
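
A quick check of the workaround, as a sketch: with None as the default, every Node built without an argument gets its own fresh list. The variable shared is illustrative and shows that a list passed in explicitly is still stored as-is, not copied.

```python
class Node:
    def __init__(self, children=None):
        # A new list is created on every call that omits the argument.
        self.children = [] if children is None else children

n0 = Node()
n1 = Node()
n0.children.append("x")

print(n1.children)                 # []
print(n0.children is n1.children)  # False

shared = ["a"]
n2 = Node(shared)
print(n2.children is shared)       # True
```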

As for why, look for an answer from von Löwis, but it's likely because the function definition makes a code object due to the architecture of Python, and there might not be a facility for working with reference types like this in default arguments.

#5


5  

I thought this was counterintuitive too, until I learned how Python implements default arguments.

A function is an object. At load time (that is, when the def statement is executed), Python creates the function object, evaluates the defaults in the def statement, puts them into a tuple, and adds that tuple as an attribute of the function named func_defaults (__defaults__ in Python 3). Then, when the function is called, if the call doesn't provide a value, Python grabs the default value out of that tuple.

For instance:

>>> class C:
...     pass
...
>>> def f(x=C()):
...     pass
...
>>> f.__defaults__   # named func_defaults in Python 2
(<__main__.C object at 0x0298D4B8>,)

So all calls to f that don't provide an argument will use the same instance of C, because that's the default value.
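
Because the default lives in that tuple on the function object, mutating it is visible from the outside. A small sketch in Python 3, where the attribute is __defaults__ (append_to and bucket are hypothetical names used only for illustration):

```python
def append_to(item, bucket=[]):
    # bucket is the single list stored in append_to.__defaults__
    bucket.append(item)
    return bucket

print(append_to.__defaults__)  # ([],)

append_to(1)
append_to(2)

print(append_to.__defaults__)  # ([1, 2],)
```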

As for why Python does it this way: well, that tuple could contain functions that would get called every time a default argument value was needed. Apart from the immediately obvious problem of performance, you start getting into a universe of special cases, like storing literal values instead of functions for immutable types to avoid unnecessary function calls. And of course there are performance implications galore.

The actual behavior is really simple. And there's a trivial workaround, in the case where you want a default value to be produced by a function call at runtime:

def f(x=None):
    if x is None:
        x = g()
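
A runnable version of this pattern, as a sketch. Here g is a hypothetical stand-in that returns a new number on each call, which makes it easy to see that the default is produced at call time and only when no argument is given:

```python
import itertools

_counter = itertools.count()

def g():
    # Hypothetical stand-in for any call whose result should be fresh per call.
    return next(_counter)

def f(x=None):
    if x is None:
        x = g()  # evaluated at call time, only when no argument was given
    return x

first = f()
second = f()
explicit = f(10)

print(first, second, explicit)  # 0 1 10
```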

#6


4  

This comes from Python's emphasis on syntax and execution simplicity. A def statement occurs at a certain point during execution. When the Python interpreter reaches that point, it evaluates the code in that line, and then creates a function object whose body will be run later, when you call the function.

It's a simple split between function declaration and function body. The declaration is executed when it is reached in the code. The body is executed at call time. Note that the declaration is executed every time it is reached, so you can create multiple functions by looping.

funcs = []
for x in range(5):
    # The def statement runs once per iteration, so each foo gets its own list.
    def foo(x=x, lst=[]):
        lst.append(x)
        return lst
    funcs.append(foo)
for func in funcs:
    print("1: ", func())
    print("2: ", func())

Five separate functions have been created, with a separate list created each time the function declaration was executed. On each pass through the loop over funcs, the same function is executed twice, using the same list both times. This gives the results:

1:  [0]
2:  [0, 0]
1:  [1]
2:  [1, 1]
1:  [2]
2:  [2, 2]
1:  [3]
2:  [3, 3]
1:  [4]
2:  [4, 4]

Others have given you the workaround of using param=None and assigning a list in the body if the value is None, which is fully idiomatic Python. It's a little ugly, but the simplicity is powerful, and the workaround is not too painful.

Edited to add: For more discussion on this, see effbot's article here: http://effbot.org/zone/default-values.htm, and the language reference, here: http://docs.python.org/reference/compound_stmts.html#function

#7


0  

Python function definitions are just code, like all the other code; they're not "magical" in the way that some languages are. For example, in Java you can refer "now" to something defined "later":

public static void foo() { bar(); }
public static void main(String[] args) { foo(); }
public static void bar() {}

but in Python

def foo(): bar()
foo()   # boom! "bar" has no binding yet
def bar(): pass
foo()   # ok

So, the default argument is evaluated at the moment that that line of code is evaluated!
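
One way to watch this happen is to use a default expression with a visible side effect. In the sketch below, make_default and evaluations are hypothetical names; the list records every time the default expression is evaluated:

```python
evaluations = []

def make_default():
    evaluations.append("evaluated")
    return []

def f(x=make_default()):  # make_default() runs here, when the def is executed
    return x

count_after_def = len(evaluations)

f()
f()
count_after_calls = len(evaluations)

print(count_after_def, count_after_calls)  # 1 1
```

The expression is evaluated once, before f is ever called, and calling f does not evaluate it again.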

#8


0  

Because if they had, then someone would post a question asking why it wasn't the other way around :-p

Suppose now that they had. How would you implement the current behaviour if needed? It's easy to create new objects inside a function, but you cannot "uncreate" them (you can delete them, but it's not the same).
