Scrapy Source Code Analysis (3): The Signal Manager, SignalManager

Date: 2021-01-26 00:10:49

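The class lives at scrapy.signalmanager.SignalManager. It is mainly a thin wrapper around pydispatch.dispatcher.

First, let's look at what pydispatch.dispatcher provides (see the project homepage).

The module mainly provides a way to send and receive messages (signals). The example from the homepage: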

To set up a function to receive signals:

from pydispatch import dispatcher
SIGNAL = 'my-first-signal'

def handle_event( sender ):
    """Simple event handler"""
    print('Signal was sent by', sender)
dispatcher.connect( handle_event, signal=SIGNAL, sender=dispatcher.Any )

The use of the Any object allows the handler to listen for messages from any Sender or to listen to Any message being sent.  To send messages:

first_sender = object()
second_sender = {}
def main( ):
    dispatcher.send( signal=SIGNAL, sender=first_sender )
    dispatcher.send( signal=SIGNAL, sender=second_sender )

Which causes the following to be printed:

Signal was sent by <object object at 0x196a090>
Signal was sent by {}
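To make the connect/send mechanics concrete, here is a toy registry written with only the standard library. It is a sketch of the idea, not PyDispatcher's implementation: MiniDispatcher and ANY are hypothetical names, and features such as weak references and the Anonymous sender are left out.

```python
# Toy model of a signal registry; NOT PyDispatcher's actual implementation.
ANY = object()  # hypothetical stand-in for dispatcher.Any


class MiniDispatcher:
    def __init__(self):
        # Maps (signal, id of sender) to the list of connected receivers.
        self._receivers = {}

    def connect(self, receiver, signal, sender=ANY):
        self._receivers.setdefault((signal, id(sender)), []).append(receiver)

    def send(self, signal, sender):
        # Receivers bound to this exact sender, followed by those bound to ANY.
        targets = list(self._receivers.get((signal, id(sender)), []))
        if sender is not ANY:
            targets += self._receivers.get((signal, id(ANY)), [])
        # Dispatch is synchronous: every receiver runs before send() returns.
        return [(receiver, receiver(sender)) for receiver in targets]


def handle_event(sender):
    return 'Signal was sent by %r' % (sender,)


mini = MiniDispatcher()
mini.connect(handle_event, signal='my-first-signal')
responses = mini.send('my-first-signal', sender='first_sender')
print(responses[0][1])
```

Because the handler was connected with the ANY sender, it receives the signal no matter which object sends it; a signal with no connected receivers simply yields an empty list.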

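At first glance the module looks like an asynchronous event system. It is event-driven, but dispatch itself is synchronous: dispatcher.send calls each connected receiver in turn before returning. Now let's see how SignalManager wraps it: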

from pydispatch import dispatcher
from scrapy.utils import signal as _signal


class SignalManager(object):

    def __init__(self, sender=dispatcher.Anonymous):
        self.sender = sender

    def connect(self, receiver, signal, **kwargs):
        """
        Connect a receiver function to a signal.

        The signal can be any object, although Scrapy comes with some
        predefined signals that are documented in the :ref:`topics-signals`
        section.

        :param receiver: the function to be connected
        :type receiver: callable

        :param signal: the signal to connect to
        :type signal: object
        """
        kwargs.setdefault('sender', self.sender)
        return dispatcher.connect(receiver, signal, **kwargs)

    def disconnect(self, receiver, signal, **kwargs):
        """
        Disconnect a receiver function from a signal. This has the
        opposite effect of the :meth:`connect` method, and the arguments
        are the same.
        """
        kwargs.setdefault('sender', self.sender)
        return dispatcher.disconnect(receiver, signal, **kwargs)

    def send_catch_log(self, signal, **kwargs):
        """
        Send a signal, catch exceptions and log them.

        The keyword arguments are passed to the signal handlers (connected
        through the :meth:`connect` method).
        """
        kwargs.setdefault('sender', self.sender)
        return _signal.send_catch_log(signal, **kwargs)

    def send_catch_log_deferred(self, signal, **kwargs):
        """
        Like :meth:`send_catch_log` but supports returning `deferreds`_ from
        signal handlers.

        Returns a Deferred that gets fired once all signal handlers
        deferreds were fired. Send a signal, catch exceptions and log them.

        The keyword arguments are passed to the signal handlers (connected
        through the :meth:`connect` method).

        .. _deferreds: http://twistedmatrix.com/documents/current/core/howto/defer.html
        """
        kwargs.setdefault('sender', self.sender)
        return _signal.send_catch_log_deferred(signal, **kwargs)

    def disconnect_all(self, signal, **kwargs):
        """
        Disconnect all receivers from the given signal.

        :param signal: the signal to disconnect from
        :type signal: object
        """
        kwargs.setdefault('sender', self.sender)
        return _signal.disconnect_all(signal, **kwargs)

1. __init__

Stores the sender in self.sender. If no sender is passed, it defaults to the dispatcher.Anonymous object.


2. connect(self, receiver, signal, **kwargs)

A wrapper around dispatcher.connect(receiver, signal, **kwargs). If the caller does not pass sender explicitly, self.sender is used, which defaults to dispatcher.Anonymous.
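The sender-defaulting step relies on dict.setdefault, which only fills in a key that the caller left out. A minimal sketch of just that step, assuming hypothetical names (SenderDefaults, ANONYMOUS) and leaving the real dispatcher out entirely:

```python
ANONYMOUS = object()  # hypothetical stand-in for dispatcher.Anonymous


class SenderDefaults:
    """Mimics only the sender-defaulting step of SignalManager.connect."""

    def __init__(self, sender=ANONYMOUS):
        self.sender = sender

    def resolve(self, **kwargs):
        # setdefault only writes the key when the caller did not pass it.
        kwargs.setdefault('sender', self.sender)
        return kwargs


manager = SenderDefaults()
implicit = manager.resolve()                     # falls back to ANONYMOUS
explicit = manager.resolve(sender='my-crawler')  # caller's value is kept
```

Every method of SignalManager repeats this same line, so an explicitly passed sender always takes precedence over the manager's default.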


3. disconnect(self, receiver, signal, **kwargs)

Disconnects a receiver from a signal. The logic mirrors connect.


4. send_catch_log(self, signal, **kwargs)

A wrapper around scrapy.utils.signal.send_catch_log(signal, **kwargs).

def send_catch_log(signal=Any, sender=Anonymous, *arguments, **named):
    """Like pydispatcher.robust.sendRobust but it also logs errors and returns
    Failures instead of exceptions.
    """
    dont_log = named.pop('dont_log', _IgnoredException)
    spider = named.get('spider', None)
    responses = []
    for receiver in liveReceivers(getAllReceivers(sender, signal)):
        try:
            response = robustApply(receiver, signal=signal, sender=sender,
                                   *arguments, **named)
            if isinstance(response, Deferred):
                logger.error("Cannot return deferreds from signal handler: %(receiver)s",
                             {'receiver': receiver}, extra={'spider': spider})
        except dont_log:
            result = Failure()
        except Exception:
            result = Failure()
            logger.error("Error caught on signal handler: %(receiver)s",
                         {'receiver': receiver},
                         exc_info=True, extra={'spider': spider})
        else:
            result = response
        responses.append((receiver, result))
    return responses

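This function wraps pydispatch.robustapply.robustApply. It calls every live receiver, logs any exception through the logger, and stores a twisted.python.failure.Failure in place of that receiver's result, so one failing handler does not stop the others. Exceptions of the dont_log type are captured as a Failure without being logged.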
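The catch-and-log pattern can be sketched with the standard library alone. This is a simplified model under stated assumptions, not Scrapy's code: send_catch_log_sketch is a hypothetical name, receivers are passed in as a plain list, and the raw exception is stored where the real function stores a Twisted Failure.

```python
import logging

logger = logging.getLogger("signal_sketch")


def send_catch_log_sketch(receivers, **named):
    """Call every receiver; log exceptions and return them in place of results."""
    responses = []
    for receiver in receivers:
        try:
            result = receiver(**named)
        except Exception as exc:
            # The real code wraps the error in a Failure; the sketch keeps
            # the raw exception object instead.
            logger.error("Error caught on signal handler: %s", receiver,
                         exc_info=True)
            result = exc
        responses.append((receiver, result))
    return responses


def good_handler(item):
    return item * 2


def bad_handler(item):
    raise ValueError("boom")


out = send_catch_log_sketch([good_handler, bad_handler], item=21)
```

The caller gets one (receiver, result) pair per handler and can inspect each result to see whether that handler succeeded or raised.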


5. send_catch_log_deferred(self, signal, **kwargs)

A wrapper around scrapy.utils.signal.send_catch_log_deferred(signal, **kwargs).

def send_catch_log_deferred(signal=Any, sender=Anonymous, *arguments, **named):
    """Like send_catch_log but supports returning deferreds on signal handlers.
    Returns a deferred that gets fired once all signal handlers deferreds were
    fired.
    """
    def logerror(failure, recv):
        if dont_log is None or not isinstance(failure.value, dont_log):
            logger.error("Error caught on signal handler: %(receiver)s",
                         {'receiver': recv},
                         exc_info=failure_to_exc_info(failure),
                         extra={'spider': spider})
        return failure

    dont_log = named.pop('dont_log', None)
    spider = named.get('spider', None)
    dfds = []
    for receiver in liveReceivers(getAllReceivers(sender, signal)):
        d = maybeDeferred(robustApply, receiver, signal=signal, sender=sender,
                          *arguments, **named)
        d.addErrback(logerror, receiver)
        d.addBoth(lambda result: (receiver, result))
        dfds.append(d)
    d = DeferredList(dfds)
    d.addCallback(lambda out: [x[1] for x in out])
    return d

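A Deferred is essentially a placeholder for a result that is not available yet, similar to Tornado's Future. Here each handler's result is wrapped in a Deferred via maybeDeferred, and DeferredList produces one Deferred that fires once all of them have fired.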
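The placeholder idea can be illustrated without Twisted. The sketch below uses concurrent.futures.Future from the standard library as a stand-in for Deferred, and gather is a hypothetical helper that plays the role DeferredList plays above; it shows the pattern only and is not how Twisted implements it.

```python
from concurrent.futures import Future


def gather(futures):
    """Return a Future that completes once every input Future is done."""
    combined = Future()
    remaining = [len(futures)]

    def on_done(_finished):
        remaining[0] -= 1
        if remaining[0] == 0:
            # Keep the exception in place of the result for failed inputs.
            combined.set_result([f.exception() or f.result() for f in futures])

    for f in futures:
        f.add_done_callback(on_done)
    if not futures:
        combined.set_result([])
    return combined


first, second = Future(), Future()   # placeholders: no results yet
combined = gather([first, second])

first.set_result('handler A done')
pending_after_first = not combined.done()   # still waiting for second

second.set_result('handler B done')
all_results = combined.result()
```

The combined placeholder stays pending until the last handler's placeholder is filled in, which is the same contract send_catch_log_deferred offers to its caller.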


6. disconnect_all(self, signal, **kwargs)

A wrapper around scrapy.utils.signal.disconnect_all(signal, **kwargs).

def disconnect_all(signal=Any, sender=Any):
    """Disconnect all signal handlers. Useful for cleaning up after running
    tests
    """
    for receiver in liveReceivers(getAllReceivers(sender, signal)):
        disconnect(receiver, signal=signal, sender=sender)

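This relies on three functions from pydispatch.dispatcher: getAllReceivers, liveReceivers, and disconnect. It collects all receivers registered for the given sender and signal, keeps only those that are still alive, and disconnects them one by one.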
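The "live" check exists because receivers are normally held through weak references, so a receiver can disappear once nothing else refers to it. A minimal sketch of that idea using the standard weakref module, with a hypothetical Registry class that is not PyDispatcher's implementation:

```python
import gc
import weakref


class Registry:
    def __init__(self):
        self._refs = []  # weak references to connected receivers

    def connect(self, receiver):
        self._refs.append(weakref.ref(receiver))

    def live_receivers(self):
        # Dereference each weak ref; skip targets that were garbage-collected.
        for ref in self._refs:
            receiver = ref()
            if receiver is not None:
                yield receiver

    def disconnect(self, receiver):
        # Drop the given receiver and any dead references along the way.
        kept = []
        for ref in self._refs:
            target = ref()
            if target is not None and target is not receiver:
                kept.append(ref)
        self._refs = kept

    def disconnect_all(self):
        # Materialize the list first, since disconnect() rebuilds self._refs.
        for receiver in list(self.live_receivers()):
            self.disconnect(receiver)


def keep_me():
    pass


def drop_me():
    pass


registry = Registry()
registry.connect(keep_me)
registry.connect(drop_me)
del drop_me          # no strong reference left, so this receiver dies
gc.collect()
live_before = list(registry.live_receivers())
registry.disconnect_all()
live_after = list(registry.live_receivers())
```

Only the receiver that is still referenced elsewhere shows up as live, and after disconnect_all nothing remains connected.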