I'm writing a web app that uses several third-party web APIs, and I want to keep track of the low-level requests and responses for ad hoc analysis. So I'm looking for a recipe that will get Python's urllib2 to log all bytes transferred via HTTP. Maybe a subclassed Handler?
2 solutions
#1
Well, I've found how to set up the library's built-in debugging mechanism:
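import logging
import sys
import urllib2

# Debug level 1 makes httplib print the outgoing request plus the response
# status line and headers to stdout (it uses print, not the logging module).
hh = urllib2.HTTPHandler()
hsh = urllib2.HTTPSHandler()
hh.set_http_debuglevel(1)
hsh.set_http_debuglevel(1)
opener = urllib2.build_opener(hh, hsh)
urllib2.install_opener(opener)  # so that urllib2.urlopen() uses these handlers

# Send records from the root logger to stdout as well.
logger = logging.getLogger()
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.setLevel(logging.NOTSET)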
But I'm still looking for a way to dump all the information transferred.
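To show what debug level 1 does and does not print, here is a minimal sketch. It assumes Python 3, where urllib2 is urllib.request and httplib is http.client, and it talks to a throwaway local server so nothing leaves the machine. The names BODY, DemoHandler and fetch_with_debug are made up for the illustration.

```python
import contextlib
import io
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello-body"  # hypothetical payload served by the demo server


class DemoHandler(BaseHTTPRequestHandler):
    """Tiny local server so the sketch needs no external network."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo server quiet


def fetch_with_debug():
    """Return (response_body, debug_text) for one GET with debug level 1."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        handler = urllib.request.HTTPHandler()
        handler.set_http_debuglevel(1)
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}),  # ignore any proxy settings
            handler,
        )
        captured = io.StringIO()
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        # The debug output is written with print(), so redirect stdout.
        with contextlib.redirect_stdout(captured):
            body = opener.open(url, timeout=5).read()
        return body, captured.getvalue()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    body, debug_text = fetch_with_debug()
    print(debug_text)
```

The captured text contains "send:", "reply:" and "header:" lines, but the response body is not part of it, which is why this mechanism alone does not dump everything transferred.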
#2
This looks pretty tricky to do. There are no hooks in urllib2, urllib, or httplib (which urllib2 builds on) for intercepting either input or output data.
The only thing that occurs to me, other than switching tactics and using an external tool (there are many, and that is what most people do), would be to write a subclass of socket.socket in a new module of your own (say, "capture_socket") and then insert it into httplib with "import capture_socket; import httplib; httplib.socket = capture_socket". You'd have to copy every name httplib needs (anything of the form "socket.foo" that httplib uses) into your own module, but then you could override methods like recv() and sendall() in your subclass to do what you like with the data.
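The subclass part of that idea could look roughly like this. It is only a sketch, assuming Python 3.7 or later (where socket.socket(fileno=...) detects the address family by itself). The names CaptureSocket and demo are invented, and the module swap into httplib is not shown. The demo exercises the subclass over a local socket pair instead of a real HTTP connection.

```python
import socket


class CaptureSocket(socket.socket):
    """socket.socket subclass that keeps a copy of the bytes passing through."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sent = bytearray()      # everything handed to sendall()
        self.received = bytearray()  # everything returned by recv()

    def sendall(self, data, *args):
        self.sent += bytes(data)
        return super().sendall(data, *args)

    def recv(self, bufsize, *args):
        data = super().recv(bufsize, *args)
        self.received += data
        return data


def demo():
    """Push a few bytes each way through a local socket pair."""
    left, right = socket.socketpair()
    # Re-wrap one end of the pair in the capturing subclass.
    captured = CaptureSocket(fileno=left.detach())
    try:
        captured.sendall(b"ping")
        right.recv(4)
        right.sendall(b"pong")
        captured.recv(4)
        return bytes(captured.sent), bytes(captured.received)
    finally:
        captured.close()
        right.close()


if __name__ == "__main__":
    print(demo())
```

Here the bytes are only stored on the socket object; writing them to a file or a logger inside the overridden methods would work the same way.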
Complications would likely arise if you were using SSL, and I'm not sure whether this would be sufficient or whether you'd also have to write your own socket._fileobject. It appears doable though, and reading httplib.py and socket.py in the standard library would tell you more.
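On the question of whether overriding recv() is enough: in Python 3 (where httplib is http.client and socket._fileobject no longer exists) the response is read through the object returned by makefile(), and that reader calls recv_into() on the socket, so that method has to be overridden too. The sketch below is one way to wire this up for plain HTTP only, not SSL. Instead of replacing the socket module inside httplib, it subclasses HTTPConnection and swaps the socket after connect(), which is a different hook from the one described above. CaptureSocket, CaptureHTTPConnection, DemoHandler, BODY and fetch_and_capture are all invented names, and Python 3.7+ is assumed.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello-body"  # hypothetical payload served by the demo server


class CaptureSocket(socket.socket):
    """Records raw bytes; recv_into() is what makefile() readers call."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sent = bytearray()
        self.received = bytearray()

    def sendall(self, data, *args):
        self.sent += bytes(data)
        return super().sendall(data, *args)

    def recv(self, bufsize, *args):
        data = super().recv(bufsize, *args)
        self.received += data
        return data

    def recv_into(self, buffer, *args):
        count = super().recv_into(buffer, *args)
        self.received += memoryview(buffer).cast("B")[:count].tobytes()
        return count


class CaptureHTTPConnection(http.client.HTTPConnection):
    """Plain-HTTP connection whose socket is swapped for a CaptureSocket."""

    def connect(self):
        super().connect()
        plain = self.sock
        timeout = plain.gettimeout()
        # Re-wrap the already connected file descriptor in the subclass.
        self.sock = CaptureSocket(fileno=plain.detach())
        self.sock.settimeout(timeout)


class DemoHandler(BaseHTTPRequestHandler):
    """Tiny local server so the sketch needs no external network."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo server quiet


def fetch_and_capture():
    """Return (body, bytes_sent, bytes_received) for one local GET."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = CaptureHTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.connect()
        sock = conn.sock  # keep a reference; conn.sock may be dropped later
        conn.request("GET", "/")
        body = conn.getresponse().read()
        conn.close()
        return body, bytes(sock.sent), bytes(sock.received)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    body, sent, received = fetch_and_capture()
    print(sent.decode("latin-1"))
    print(received.decode("latin-1"))
```

To use a connection class like this from urllib.request, a handler subclass can pass it to do_open() in its http_open() method, which is close to the "subclassed Handler" the question asks about.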