Django Signals: Practice and Source Code Analysis

Date: 2022-11-13 19:17:58

Introduction

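Signals are a mechanism Django provides to improve code readability and reuse. Developers familiar with the publish/subscribe pattern can think of them in those terms: a Signal can have multiple subscribers, and when it is sent, every subscriber to that signal receives it and runs.
At first I assumed Django Signals were an asynchronous mechanism, but after reading the source I found that they are synchronous and that the dispatcher is thread-safe.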
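
The synchronous behaviour can be illustrated without Django. The sketch below is a deliberately simplified stand-in for a signal (the class name MiniSignal and its methods are hypothetical, not Django's API): send() calls each subscriber in turn and returns only after all of them have finished.

```python
import time


class MiniSignal:
    """Toy publish/subscribe object; a simplified stand-in, not Django's Signal."""

    def __init__(self):
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send(self, sender, **named):
        # Receivers run one after another in the caller's thread.
        return [
            (receiver, receiver(sender=sender, **named))
            for receiver in self.receivers
        ]


events = []


def slow_receiver(sender, **kwargs):
    time.sleep(0.2)
    events.append("receiver finished")
    return "done"


signal = MiniSignal()
signal.connect(slow_receiver)

start = time.monotonic()
responses = signal.send(sender="demo")
elapsed = time.monotonic() - start
events.append("send returned")
```

Because the subscriber runs inside send(), the caller pays for the subscriber's sleep before it can continue.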

Basic usage of Django Signals

I first noticed signals while reading the source of django.contrib.auth.authenticate, which contains the following line:

user_login_failed.send(sender=__name__, credentials=_clean_credentials(credentials))

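This is one of the default signals provided by Django's auth system. It is sent whenever a user's login attempt fails authentication, and it is defined in django.contrib.auth.signals. Beyond that, Django provides the following built-in signals for developers to use: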

  • django.db.models.signals.pre_save & django.db.models.signals.post_save

    Sent before or after a model’s save() method is called.

  • django.db.models.signals.pre_delete & django.db.models.signals.post_delete

    Sent before or after a model’s delete() method or queryset’s delete() method is called.

  • django.db.models.signals.m2m_changed

    Sent when a ManyToManyField on a model is changed.

  • django.core.signals.request_started & django.core.signals.request_finished

    Sent when Django starts or finishes an HTTP request.

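Developers can of course define their own signals and handlers. Here is a simple example that prints hello world on the server console whenever the signal is sent. It also shows that Signals are not asynchronous: the response is returned only after the handler's five-second sleep has finished: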

# views.py

import time

from django.dispatch import Signal, receiver
from django.http import HttpResponse

# providing_args is documentation only; Django 4.0 removed it, so use Signal() there.
hello_signal = Signal(providing_args=[])


@receiver(hello_signal)
def print_hello(sender, **kwargs):
    # Runs in the caller's thread, so the view blocks for 5 seconds.
    time.sleep(5)
    print("hello world")


def hello(request):
    hello_signal.send(sender=__name__)
    return HttpResponse("OK")

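The providing_args argument passed when instantiating Signal declares which keyword arguments the receivers of the signal will get. Django does not check it, however: passing send() a keyword that is not listed in providing_args raises no error. The argument is purely documentational; it was deprecated in Django 3.1 and removed in Django 4.0.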

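Django Signals offers two ways to subscribe a function or an instance method to a signal:

  • Use the receiver decorator
  • Use the connect method of the Signal instance

To prevent the same function or instance method from being subscribed to a signal more than once, pass a dispatch_uid argument that identifies it when using either of the two approaches above.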
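
To see why dispatch_uid helps, here is a minimal sketch of the duplicate check, assuming a simplified lookup key of the form (dispatch_uid or receiver identity, sender identity). DedupSignal, make_lookup_key and make_handler are hypothetical names for illustration; Django's real connect() handles more cases, such as bound methods and weak references.

```python
def make_lookup_key(receiver, sender=None, dispatch_uid=None):
    """Simplified model of the key used to detect duplicate subscriptions."""
    receiver_part = dispatch_uid if dispatch_uid else id(receiver)
    return (receiver_part, id(sender))


class DedupSignal:
    def __init__(self):
        self.receivers = []  # list of (lookup_key, receiver) pairs

    def connect(self, receiver, sender=None, dispatch_uid=None):
        key = make_lookup_key(receiver, sender, dispatch_uid)
        if all(existing != key for existing, _ in self.receivers):
            self.receivers.append((key, receiver))


def make_handler():
    # Each call builds a new function object, as can happen when the same
    # module ends up being imported twice under different names.
    def handler(sender, **kwargs):
        return "handled"
    return handler


without_uid = DedupSignal()
without_uid.connect(make_handler())
without_uid.connect(make_handler())   # different object, so it is added again

with_uid = DedupSignal()
with_uid.connect(make_handler(), dispatch_uid="hello_handler")
with_uid.connect(make_handler(), dispatch_uid="hello_handler")   # ignored
```

Without a dispatch_uid the two function objects have different identities and both are registered; with one, the second connect() is recognised as a duplicate.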

For the full Django Signals syntax, see the official documentation: Signals

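Where to define signals and handlers?

When connecting a receiver to a signal (for example with the receiver decorator), developers may worry that the receiver has not been defined yet or that its module has not been imported.
The official documentation explains: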

Where should this code live?

Strictly speaking, signal handling and registration code can live anywhere you like, although it’s recommended to avoid the application’s root module and its models module to minimize side-effects of importing code.

In practice, signal handlers are usually defined in a signals submodule of the application they relate to. Signal receivers are connected in the ready() method of your application configuration class. If you’re using the receiver() decorator, simply import the signals submodule inside ready().

The same question has also been asked in an online Q&A thread: Where should signal handlers live in a django project?

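During django.setup(), Django iterates over every entry in settings.INSTALLED_APPS and calls the ready method of the corresponding AppConfig. Connecting receivers to signals inside ready therefore guarantees that the code is executed.

Converting the hello example above to this practice gives the following directory structure: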

helloService
├── __init__.py
├── admin.py
├── apps.py
├── migrations
│   └── __init__.py
├── models.py
├── signals
│   ├── __init__.py
│   └── handlers.py
├── tests.py
├── urls.py
└── views.py

In the signals package:

# __init__.py
from django.dispatch import Signal

hello_signal = Signal(providing_args=[])

# handlers.py
import time


def print_hello(sender, **kwargs):
    time.sleep(5)
    print("hello world")

Finally, set up apps.py:

import hashlib

from django.apps import AppConfig

from helloService.signals import hello_signal
from helloService.signals.handlers import print_hello


class HelloserviceConfig(AppConfig):
    # Must match the app's package name.
    name = 'helloService'

    def ready(self):
        # A stable dispatch_uid prevents the handler from being connected twice.
        func_uid = hashlib.sha256('print_hello'.encode()).hexdigest()
        hello_signal.connect(print_hello, dispatch_uid=func_uid)

If the signals package uses the receiver decorator instead, importing handlers.py inside the ready method is enough.

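Django Signals source code analysis

Preparation

Before reading the Signals source, you first need to understand Python weak references and threading.

Weak references in Python

Python uses a garbage collector (GC) to reclaim objects that are no longer in use. It keeps a reference count for every object, and an object is destroyed when its count drops to 0. Reference counting alone cannot handle reference cycles: two objects that refer to each other would never be destroyed (Python's GC has an additional mechanism to deal with cycles). A weak reference refers to an object without increasing its reference count, so it does not keep the object alive.

For a more detailed introduction to weak references, see: Python 弱引用 学习

Django Signals uses weakref in two places: for its cache, and for the references it holds to subscribers when they are added to a signal. The latter ensures that a subscriber that has already been garbage collected no longer receives the signal.

In Signal.__init__, if use_caching is set to True, the cache is created as a weakref.WeakKeyDictionary: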
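
As a minimal illustration of this behaviour using only the standard library (the handler name is reused from the earlier example purely for readability), a weak reference returns the object while it is alive and None once it has been collected:

```python
import gc
import weakref


def print_hello(sender, **kwargs):
    return "hello world"


receiver_ref = weakref.ref(print_hello)    # does not keep print_hello alive
alive_before = receiver_ref() is print_hello

del print_hello    # drop the only strong reference
gc.collect()       # make collection explicit on interpreters without refcounting

alive_after = receiver_ref() is not None
```

A dispatcher that stores only receiver_ref can therefore detect a vanished subscriber by checking for None before calling it.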

    def __init__(self, providing_args=None, use_caching=False):
        """
        Create a new signal.

        providing_args
            A list of the arguments this signal can pass along in a send() call.
        """
        self.receivers = []
        if providing_args is None:
            providing_args = []
        self.providing_args = set(providing_args)
        self.lock = threading.Lock()
        self.use_caching = use_caching
        # For convenience we create empty caches even if they are not used.
        # A note about caching: if use_caching is defined, then for each
        # distinct sender we cache the receivers that sender has in
        # 'sender_receivers_cache'. The cache is cleaned when .connect() or
        # .disconnect() is called and populated on send().
        self.sender_receivers_cache = weakref.WeakKeyDictionary() if use_caching else {}
        self._dead_receivers = False
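
The effect of choosing a WeakKeyDictionary for the cache can be sketched with the standard library alone. The Sender class and the cached values below are hypothetical placeholders; the point is only that an entry disappears once its key (the sender) is no longer referenced anywhere else, so the cache never keeps a sender alive.

```python
import gc
import weakref


class Sender:
    """Hypothetical sender class used only for this demonstration."""


cache = weakref.WeakKeyDictionary()

sender = Sender()
cache[sender] = ["receiver_a", "receiver_b"]   # cached receivers for this sender
size_while_alive = len(cache)

del sender       # the cache holds the key weakly, so the sender can be freed
gc.collect()
size_after_collection = len(cache)
```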

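The connect method of Signal likewise stores the subscribing function or instance method as a weak reference (when weak is true, which is the default):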

# Signal.connect()
if weak:
    ref = weakref.ref
    receiver_object = receiver
    # Check for bound methods
    if hasattr(receiver, '__self__') and hasattr(receiver, '__func__'):
        ref = WeakMethod
        receiver_object = receiver.__self__
    if six.PY3:
        receiver = ref(receiver)
        weakref.finalize(receiver_object, self._remove_receiver)
    else:
        receiver = ref(receiver, self._remove_receiver)

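When the function, or the instance a method is bound to, is garbage collected, self._remove_receiver is triggered and sets self._dead_receivers = True. The connect, disconnect and send methods all check this flag before doing their work and, if it is set, remove the weak references that are no longer valid.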
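
The "mark now, clean up later" idea can be sketched as follows. Registry is a hypothetical stand-in that borrows the attribute names from the snippet above; it is a simplified model rather than Django's implementation.

```python
import gc
import weakref


class Registry:
    """Hypothetical stand-in for the bookkeeping part of a signal."""

    def __init__(self):
        self.receivers = []            # weak references to subscribers
        self._dead_receivers = False

    def connect(self, receiver):
        self.receivers.append(weakref.ref(receiver))
        # Run _remove_receiver once `receiver` has been garbage collected.
        weakref.finalize(receiver, self._remove_receiver)

    def _remove_receiver(self):
        # Only mark the list as dirty; the cleanup itself is deferred.
        self._dead_receivers = True

    def _clear_dead_receivers(self):
        if self._dead_receivers:
            self._dead_receivers = False
            self.receivers = [r for r in self.receivers if r() is not None]


def handler(sender, **kwargs):
    return None


registry = Registry()
registry.connect(handler)

del handler
gc.collect()
flag_after_collection = registry._dead_receivers
registry._clear_dead_receivers()
```

Deferring the cleanup keeps the finalizer callback trivial; the list itself is only rewritten later, from a method that can take whatever precautions it needs.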

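Note that __self__ and __func__ are used to tell instance methods apart from plain functions. For details, see: Python (类)实例方法的特殊属性

Thread lock

Signal.__init__ shows that subscribers are stored in the self.receivers list, so access to this list has to be thread-safe. A lock is taken whenever the list is modified or read, so that a subscriber cannot be removed halfway through collecting the receivers to notify.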
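
The following standard-library sketch shows both the attribute check and why bound methods need weakref.WeakMethod rather than weakref.ref. Greeter and plain_function are hypothetical names used only for illustration.

```python
import gc
import weakref


class Greeter:
    def hello(self, sender, **kwargs):
        return "hello world"


def plain_function(sender, **kwargs):
    return "hello world"


greeter = Greeter()

# A bound method carries both attributes; a plain function has neither.
method_is_bound = hasattr(greeter.hello, "__self__") and hasattr(greeter.hello, "__func__")
function_is_bound = hasattr(plain_function, "__self__") and hasattr(plain_function, "__func__")

# Each attribute access creates a fresh bound-method object, so a plain weak
# reference to it is dead as soon as that temporary object is released.
naive_ref = weakref.ref(greeter.hello)
gc.collect()
naive_ref_alive = naive_ref() is not None

# WeakMethod tracks the instance and the function instead, so it stays valid
# for as long as the instance itself is alive.
method_ref = weakref.WeakMethod(greeter.hello)
method_ref_alive = method_ref() is not None
```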

# Signal.connect()
with self.lock:
    self._clear_dead_receivers()
    for r_key, _ in self.receivers:
        if r_key == lookup_key:
            break
    else:
        self.receivers.append((lookup_key, receiver))
    self.sender_receivers_cache.clear()
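
The reason the duplicate check and the append sit inside one locked block can be shown with a small threaded sketch. LockedSignal is a hypothetical, simplified model that keys receivers by dispatch_uid only; it is not Django's class.

```python
import threading


class LockedSignal:
    """Simplified model of the locking in connect(); not Django's actual class."""

    def __init__(self):
        self.receivers = []              # list of (lookup_key, receiver) pairs
        self.lock = threading.Lock()

    def connect(self, receiver, dispatch_uid):
        # The duplicate check and the append must happen atomically,
        # otherwise two threads could both pass the check and both append.
        with self.lock:
            for key, _ in self.receivers:
                if key == dispatch_uid:
                    break
            else:
                self.receivers.append((dispatch_uid, receiver))


def handler(sender, **kwargs):
    return None


signal = LockedSignal()
threads = [
    threading.Thread(target=signal.connect, args=(handler, "hello_handler"))
    for _ in range(20)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

registered = len(signal.receivers)
```

However the twenty threads interleave, only one of them finds the key missing while holding the lock, so the handler is registered exactly once.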

When send is called, Django Signals obtains the current receivers through self._live_receivers, so this method has to be thread-safe as well:

    def _live_receivers(self, sender):
        """
        Filter sequence of receivers to get resolved, live receivers.

        This checks for weak references and resolves them, then returning only
        live receivers.
        """
        receivers = None
        if self.use_caching and not self._dead_receivers:
            receivers = self.sender_receivers_cache.get(sender)
            # We could end up here with NO_RECEIVERS even if we do check this case in
            # .send() prior to calling _live_receivers() due to concurrent .send() call.
            if receivers is NO_RECEIVERS:
                return []
        if receivers is None:
            with self.lock:
                self._clear_dead_receivers()
                senderkey = _make_id(sender)
                receivers = []
                for (receiverkey, r_senderkey), receiver in self.receivers:
                    if r_senderkey == NONE_ID or r_senderkey == senderkey:
                        receivers.append(receiver)
                if self.use_caching:
                    if not receivers:
                        self.sender_receivers_cache[sender] = NO_RECEIVERS
                    else:
                        # Note, we must cache the weakref versions.
                        self.sender_receivers_cache[sender] = receivers
        non_weak_receivers = []
        for receiver in receivers:
            if isinstance(receiver, weakref.ReferenceType):
                # Dereference the weak reference.
                receiver = receiver()
                if receiver is not None:
                    non_weak_receivers.append(receiver)
            else:
                non_weak_receivers.append(receiver)
        return non_weak_receivers

References:

  • weakref: Weak references
  • gc: Garbage Collector interface
  • Signals