Subclassing Python dictionary to override __setitem__

Date: 2022-11-08 21:23:28

I am building a class which subclasses dict, and overrides __setitem__. I would like to be certain that my method will be called in all instances where dictionary items could possibly be set.

I have discovered three situations where Python (in this case, 2.6.4) does not call my overridden __setitem__ method when setting values, and instead calls PyDict_SetItem directly:

  1. In the constructor
  2. In the setdefault method
  3. In the update method

As a very simple test:

class MyDict(dict):
    def __setitem__(self, key, value):
        print "Here"
        super(MyDict, self).__setitem__(key, str(value).upper())

>>> a = MyDict(abc=123)
>>> a['def'] = 234
Here
>>> a.update({'ghi': 345})
>>> a.setdefault('jkl', 456)
456
>>> print a
{'jkl': 456, 'abc': 123, 'ghi': 345, 'def': '234'}

You can see that the overridden method is only called when setting the items explicitly. To get Python to always call my __setitem__ method, I have had to reimplement those three methods, like this:

class MyUpdateDict(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        print "Here"
        super(MyUpdateDict, self).__setitem__(key, value)

    def update(self, *args, **kwargs):
        if args:
            if len(args) > 1:
                raise TypeError("update expected at most 1 arguments, got %d" % len(args))
            other = dict(args[0])
            for key in other:
                self[key] = other[key]
        for key in kwargs:
            self[key] = kwargs[key]

    def setdefault(self, key, value=None):
        if key not in self:
            self[key] = value
        return self[key]

Are there any other methods which I need to override, in order to know that Python will always call my __setitem__ method?
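One way to look for further gaps is to probe each mutating operation and record which ones reach the override. The sketch below is not from the original post: it assumes CPython 3 rather than the 2.6.4 interpreter discussed here, and the names `ProbeDict` and `calls` are hypothetical. On 3.9 and later it also exercises the in-place merge operator `|=`, which did not exist in 2.6.

```python
import sys


class ProbeDict(dict):
    """Plain dict subclass that records every key passed to __setitem__."""

    def __init__(self, *args, **kwargs):
        self.calls = []  # hypothetical helper attribute: keys seen by the override
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)


d = ProbeDict(abc=123)        # constructor
d["def"] = 234                # explicit item assignment
d.update({"ghi": 345})        # update
d.setdefault("jkl", 456)      # setdefault
if sys.version_info >= (3, 9):
    d |= {"mno": 567}         # in-place merge

print(sorted(d))  # every key was stored
print(d.calls)    # ['def'] on CPython: only the explicit assignment reached the override
```

Adding a new operation to the probe and checking whether its key shows up in `calls` is a quick way to decide whether that method also needs an override.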

UPDATE

Per gs's suggestion, I've tried subclassing UserDict (actually, IterableUserDict, since I want to iterate over the keys) like this:

from UserDict import IterableUserDict, UserDict
class MyUserDict(IterableUserDict):
    def __init__(self, *args, **kwargs):
        UserDict.__init__(self,*args,**kwargs)

    def __setitem__(self, key, value):
        print "Here"
        UserDict.__setitem__(self,key, value)

This class seems to correctly call my __setitem__ on setdefault, but it doesn't call it on update, or when initial data is provided to the constructor.
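For comparison, here is a sketch of the same experiment on Python 3, where the class lives in `collections` and inherits `update` and `setdefault` from `MutableMapping`. This is outside the 2.6.4 setting of the question, and `RecordingUserDict` and `calls` are hypothetical names. Under those assumptions the constructor, `update` and `setdefault` all reach the override:

```python
from collections import UserDict


class RecordingUserDict(UserDict):
    """UserDict subclass that records every key passed to __setitem__."""

    def __init__(self, *args, **kwargs):
        # The list must exist before UserDict.__init__ starts storing items.
        self.calls = []
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)


d = RecordingUserDict({"abc": 123}, ghi=345)  # initial data via the constructor
d["def"] = 234
d.update({"jkl": 456})
d.setdefault("mno", 567)

print(d.calls)  # ['abc', 'ghi', 'def', 'jkl', 'mno']
```

The sketch covers only these three entry points; other operations would need the same kind of check before relying on them.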

UPDATE 2

Peter Hansen's suggestion got me to look more carefully at dictobject.c, and I realised that the update method could be simplified a bit, since the built-in dictionary constructor simply calls the built-in update method anyway. It now looks like this:

def update(self, *args, **kwargs):
    if len(args) > 1:
        raise TypeError("update expected at most 1 arguments, got %d" % len(args))
    other = dict(*args, **kwargs)
    for key in other:
        self[key] = other[key]

4 Answers

#1


43  

I'm answering my own question, since I eventually decided that I really do want to subclass dict, rather than creating a new mapping class, and UserDict still defers to the underlying dict object in some cases, rather than using the provided __setitem__.

After reading and re-reading the Python 2.6.4 source (mostly Objects/dictobject.c, but I grepped everywhere else to see where the various methods are used), my understanding is that the following code is sufficient to have my __setitem__ called every time that the object is changed, and to otherwise behave exactly as a Python dict:

Peter Hansen's suggestion got me to look more carefully at dictobject.c, and I realised that the update method in my original answer could be simplified a bit, since the built-in dictionary constructor simply calls the built-in update method anyway. So the second update in my answer has been added to the code below (by some helpful person ;-).

class MyUpdateDict(dict):
    def __init__(self, *args, **kwargs):
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        # optional processing here
        super(MyUpdateDict, self).__setitem__(key, value)

    def update(self, *args, **kwargs):
        if args:
            if len(args) > 1:
                raise TypeError("update expected at most 1 arguments, "
                                "got %d" % len(args))
            other = dict(args[0])
            for key in other:
                self[key] = other[key]
        for key in kwargs:
            self[key] = kwargs[key]

    def setdefault(self, key, value=None):
        if key not in self:
            self[key] = value
        return self[key]

I've tested it with this code:

def test_updates(dictish):
    dictish['abc'] = 123
    dictish.update({'def': 234})
    dictish.update(red=1, blue=2)
    dictish.update([('orange', 3), ('green',4)])
    dictish.update({'hello': 'kitty'}, black='white')
    dictish.update({'yellow': 5}, yellow=6)
    dictish.setdefault('brown',7)
    dictish.setdefault('pink')
    try:
        dictish.update({'gold': 8}, [('purple', 9)], silver=10)
    except TypeError:
        pass
    else:
        raise RuntimeError("Error did not occur as planned")

python_dict = dict([('b',2),('c',3)],a=1)
test_updates(python_dict)

my_dict = MyUpdateDict([('b',2),('c',3)],a=1)
test_updates(my_dict)

and it passes. All other implementations I've tried have failed at some point. I'll still accept any answers that show me that I've missed something, but otherwise, I'm ticking the checkmark beside this one in a couple of days, and calling it the right answer :)
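A stricter check than "no exception was raised" is to record which keys reach __setitem__ and compare the final contents against a plain dict. The following is a Python 3 sketch of that idea, not the code from this answer: `RecordingUpdateDict`, `calls` and `apply_updates` are hypothetical names, and `update` uses the shorter form that delegates argument handling to the built-in constructor.

```python
class RecordingUpdateDict(dict):
    """dict subclass whose constructor, update and setdefault use __setitem__."""

    def __init__(self, *args, **kwargs):
        self.calls = []  # keys seen by __setitem__, in call order
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        if len(args) > 1:
            raise TypeError("update expected at most 1 argument, got %d" % len(args))
        # The built-in constructor normalises mappings, pair sequences and keywords.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def setdefault(self, key, default=None):
        if key not in self:
            self[key] = default
        return self[key]


def apply_updates(dictish):
    """Run the same sequence of writes against any dict-like object."""
    dictish["abc"] = 123
    dictish.update({"def": 234})
    dictish.update(red=1, blue=2)
    dictish.update([("orange", 3), ("green", 4)])
    dictish.update({"yellow": 5}, yellow=6)
    dictish.setdefault("brown", 7)
    dictish.setdefault("pink")


plain = dict([("b", 2), ("c", 3)], a=1)
recorded = RecordingUpdateDict([("b", 2), ("c", 3)], a=1)
apply_updates(plain)
apply_updates(recorded)

print(recorded == plain)                  # True: same final contents
print(set(recorded.calls) == set(plain))  # True: every stored key went through __setitem__
```

One design consequence of building a temporary dict in `update`: when a key appears both in the positional argument and as a keyword, __setitem__ sees it once with the keyword's value rather than twice.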

#2


4  

What is your use-case for subclassing dict?

You don't need to do this to implement a dict-like object, and it might be simpler in your case to write an ordinary class, then add support for the required subset of the dict interface.

The best way to accomplish what you're after is probably the MutableMapping abstract base class. PEP 3119 -- Introducing Abstract Base Classes

This will also help you answer the question "Are there any other methods which I need to override?". You will need to override all the abstract methods. For MutableMapping, the abstract methods are __getitem__, __setitem__, __delitem__, __iter__ and __len__. The concrete (mixin) methods include pop, popitem, clear, update and setdefault.
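As a rough illustration of this approach, here is a minimal Python 3 sketch (the import path `collections.abc` is the Python 3 location; `RecordingMapping`, `_data` and `calls` are hypothetical names). Only the five abstract methods are written by hand, and the inherited `update` and `setdefault` are built on top of them, so they reach __setitem__ without further overrides:

```python
from collections.abc import MutableMapping


class RecordingMapping(MutableMapping):
    """Mapping backed by a private dict; all writes pass through __setitem__."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        self.calls = []  # keys seen by __setitem__, in call order
        self.update(*args, **kwargs)  # inherited mixin method, built on __setitem__

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self.calls.append(key)
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


m = RecordingMapping([("b", 2)], a=1)
m["abc"] = 123
m.update({"def": 234}, red=1)
m.setdefault("pink")

print(m.calls)  # ['b', 'a', 'abc', 'def', 'red', 'pink']
```

The trade-off is that such an object is not an instance of dict, so code that checks `isinstance(obj, dict)` will not accept it.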

#3


3  

I found Ian's answer and comments very helpful and clear. I would just point out that calling the superclass __init__ method first might be safer, even when it is not strictly necessary. I recently needed to implement a custom OrderedDict (I'm working with Python 2.7): after implementing and modifying my code according to the proposed MyUpdateDict implementation, I found out that by simply replacing

class MyUpdateDict(dict):

with:

from collections import OrderedDict
class MyUpdateDict(OrderedDict):

then the test code posted above failed:

Traceback (most recent call last):
  File "Desktop/test_updates.py", line 52, in <module>
    my_dict = MyUpdateDict([('b',2),('c',3)],a=1)
  File "Desktop/test_updates.py", line 5, in __init__
    self.update(*args, **kwargs)
  File "Desktop/test_updates.py", line 18, in update
    self[key] = other[key]
  File "Desktop/test_updates.py", line 9, in __setitem__
    super(MyUpdateDict, self).__setitem__(key, value)
  File "/usr/lib/python2.7/collections.py", line 59, in __setitem__
    root = self.__root
AttributeError: 'MyUpdateDict' object has no attribute '_OrderedDict__root'

Looking at the collections.py code, it turns out that OrderedDict needs its __init__ method to be called in order to initialize and set up the necessary custom attributes.

Therefore, by simply adding a first call to the super __init__ method,

from collections import OrderedDict
class MyUpdateDict(OrderedDict):
    def __init__(self, *args, **kwargs):
        super(MyUpdateDict, self).__init__()  # <-- HERE call to super __init__
        self.update(*args, **kwargs)

we have a more general solution which apparently works for both dict and OrderedDict.

I cannot state if this solution is generally valid, because I tested it with OrderedDict only. However, when trying to extend other dict subclasses, a call to the super __init__ method is more likely to be harmless or necessary than harmful.
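A Python 3 sketch of the same pattern follows, as an illustration rather than the code tested above (`RecordingOrderedDict` and `calls` are hypothetical names, and the check relies on dicts preserving insertion order, which holds from Python 3.7). The base initializer runs with no arguments first, and the data is then fed through the overridden `update`:

```python
from collections import OrderedDict


class RecordingOrderedDict(OrderedDict):
    """OrderedDict subclass that initialises the base class before storing data."""

    def __init__(self, *args, **kwargs):
        self.calls = []  # keys seen by __setitem__, in call order
        super().__init__()  # let the base class set up its own state first
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        if len(args) > 1:
            raise TypeError("update expected at most 1 argument, got %d" % len(args))
        # A temporary plain dict normalises the arguments and keeps their order.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = RecordingOrderedDict([("b", 2), ("c", 3)], a=1)
print(d.calls)  # ['b', 'c', 'a']

d.move_to_end("b")  # an OrderedDict-specific operation still works
print(list(d))      # ['c', 'a', 'b']
```

The `move_to_end` call is there only to show that the base class's ordering state was set up correctly.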

#4


0  

Use object.keyname = value instead of object["keyname"] = value
