How do I send UTF-8 content in a urllib2 request?

Time: 2023-01-05 16:20:43

I've been struggling with the following problem for the past half day, and although I've found some information about similar problems, nothing quite hits the spot.

I'm trying to send a PUT request using urllib2 with data that contains some Unicode characters:

body = u'{ "bbb" : "asdf\xd7\xa9\xd7\x93\xd7\x92"}'
conn = urllib2.Request(request_url, body, headers)
conn.get_method = lambda: 'PUT'
response = urllib2.urlopen(conn)

I've tried to use body = body.encode('utf-8') and other variations, but whatever I do I get the following error:

UnicodeEncodeError at ...
'ascii' codec can't decode byte 0xc3 in position 15: ordinal not in range(128)

With one of the following call stacks:

File "..." in ...
  195.         response = urllib2.urlopen(conn)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in urlopen
  126.     return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in open
  394.         response = self._open(req, data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in _open
  412.                                   '_open', req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in _call_chain
  372.             result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in http_open
  1199.         return self.do_open(httplib.HTTPConnection, req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in do_open
  1168.             h.request(req.get_method(), req.get_selector(), req.data, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in request
  955.         self._send_request(method, url, body, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in _send_request
  989.         self.endheaders(body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in endheaders
  951.         self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in _send_output
  815.             self.send(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in send
  787.             self.sock.sendall(data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py" in meth
  224.     return getattr(self._sock,name)(*args)

Or the following call stack (for when I do body = body.encode('utf-8')):

File "..." in ...
  195.         response = urllib2.urlopen(conn)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in urlopen
  126.     return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in open
  394.         response = self._open(req, data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in _open
  412.                                   '_open', req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in _call_chain
  372.             result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in http_open
  1199.         return self.do_open(httplib.HTTPConnection, req)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py" in do_open
  1168.             h.request(req.get_method(), req.get_selector(), req.data, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in request
  955.         self._send_request(method, url, body, headers)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in _send_request
  989.         self.endheaders(body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in endheaders
  951.         self._send_output(message_body)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py" in _send_output
  809.             msg += message_body
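
The position and byte value in the error message are consistent with the UTF-8 encoded body being implicitly decoded as ASCII at the `msg += message_body` line in the second stack. The sketch below reproduces that step without any network access. The explicit `decode('ascii')` call stands in for the implicit conversion Python 2 performs when a unicode value is concatenated with a byte string, and the names `encoded`, `first_non_ascii`, and `error` are illustrative only.

```python
# The body literal from the question, encoded the way the question describes.
body = u'{ "bbb" : "asdf\xd7\xa9\xd7\x93\xd7\x92"}'
encoded = body.encode('utf-8')

# Index of the first byte outside the ASCII range in the encoded body.
first_non_ascii = next(
    index for index, byte in enumerate(bytearray(encoded)) if byte > 127
)

# Python 2 does this decode implicitly when unicode and byte strings are
# concatenated; it is written out explicitly so the sketch also runs on
# Python 3, where the implicit conversion no longer exists.
error = None
try:
    encoded.decode('ascii')
except UnicodeDecodeError as exc:
    error = exc

print(first_non_ascii)
print(error)
```

If this is indeed what happens, `msg` was already a unicode value before the body was appended, which would point at a unicode URL or header rather than at the body itself.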

What am I doing wrong? How can I send a body with Unicode characters via urllib2? If there are no Unicode characters, everything works fine.

Also note that my Content-Type header is set to application/json;charset=utf-8.

If it's relevant in any way, here is the context: a request arrives at my Django server, and I delegate it to another Django server. I don't redirect; I send the request from my own server, get the response, and send it back. So body is the request.body from the Django view.

Edit:

My headers are:

{
'Origin': 'http://10.0.0.146:8000', 
'Accept-Language': 'en-US,en;q=0.8', 
'Accept-Encoding': 'gzip,deflate,sdch', 
'Host': 'localhost:5000', 
'Accept': 'application/json, text/plain, */*', 
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.65 Safari/537.31', 
'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3', 
'Connection': 'keep-alive', 
'X-Requested-With': 'XMLHttpRequest', 
'Pragma': 'no-cache', 
'Cache-Control': 'no-cache', 
'Referer': 'http://localhost:5000/', 
'Content-Type': 'application/json;charset=UTF-8', 
'Authorization': 'ApiKey ogkLPgSESNyTOgIdbSLDhJjvyVJcbg:0d5897b5204c2f2527f532c6a97ba18a7f06acdc', 
'Cookie': 'username=ogkLPgSESNyTOgIdbSLDhJjvyVJcbg; _we_wk_ls_=%7B%22time%22%3A1369123506709%7D; __jwpusr=39e63770-ec5c-4b96-9f7f-b199703d0d36; sessionid=0d741a7560258b301979a1c853b42a81; api_key=0d5897b5204c2f2527f532c6a97ba18a7f06acdc'
}
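
Since the printed dictionary alone does not show which values are unicode objects at runtime, a small check can help narrow things down. This is only a diagnostic sketch: `find_text_inputs` is a hypothetical helper, not part of urllib2 or Django, and it simply reports which of the URL, body, and headers are text rather than byte strings.

```python
text_type = type(u'')  # `unicode` on Python 2, `str` on Python 3


def find_text_inputs(url, body, headers):
    """Return labels for the inputs that are text instead of byte strings."""
    offenders = []
    if isinstance(url, text_type):
        offenders.append('url')
    if isinstance(body, text_type):
        offenders.append('body')
    for name, value in headers.items():
        # A header counts as an offender if its name or its value is text.
        if isinstance(name, text_type) or isinstance(value, text_type):
            offenders.append('header %r' % (name,))
    return offenders
```

Calling it with the same `request_url`, `body`, and `headers` that are passed to `urllib2.Request` should return an empty list when everything is a byte string.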

1 solution

#1


You must pass only byte strings to Request. This applies to the headers, the URL, and the body.

If any of those three inputs contains Unicode values, Python 2 will automatically convert between Unicode and byte strings when concatenating them, using the ASCII codec, which will invariably lead to grief.
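
One way to apply this is to coerce all three inputs to byte strings before building the request. The sketch below is a minimal illustration, not a definitive implementation: `to_bytes`, `byte_string_parts`, and `put` are hypothetical helper names, and it assumes the URL and headers contain only ASCII characters (a non-ASCII URL would need percent-encoding instead). Only `put` touches urllib2, so it runs on Python 2 only.

```python
text_type = type(u'')  # `unicode` on Python 2


def to_bytes(value, encoding='utf-8'):
    """Return value as a byte string, encoding it only if it is text."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value  # already a byte string, e.g. Django's request.body


def byte_string_parts(url, body, headers):
    """Coerce the URL, the body, and every header name/value to bytes."""
    clean_headers = dict(
        (to_bytes(name), to_bytes(value)) for name, value in headers.items()
    )
    return to_bytes(url), to_bytes(body), clean_headers


def put(url, body, headers):
    """Send a PUT request built from byte strings only (Python 2)."""
    import urllib2  # imported here because the module is Python 2 only

    url, body, headers = byte_string_parts(url, body, headers)
    request = urllib2.Request(url, body, headers)
    request.get_method = lambda: 'PUT'
    return urllib2.urlopen(request)
```

Because `to_bytes` leaves byte strings untouched, it avoids calling `.encode('utf-8')` on a body that is already encoded, which in Python 2 would itself trigger an implicit ASCII decode.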
