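Official documentation: https://2.python-requests.org//en/master/
At work I needed to upload an attachment to an HTTP endpoint. The interface is specified as follows:
Submit the attachment with an HTTP POST in multipart/form-data format. URL: http://test.com/flow/upload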
    Fields:
    md5:      // MD5 hash of the string "<random value>_<current timestamp>"
    filesize: // file size
    file:     // file content (must include the file name)
    Return value:
    { "success" : true , "uploadName" : "tmp.xml" , "uploadPath" : "uploads\/201311\/758e875fb7c7a508feef6b5036119b9f" }
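As an illustration of the field list above, here is a minimal sketch of how the md5 value could be built and how the JSON reply could be read. The helper name make_md5_token, the use of random.randint for the random value, and the integer Unix timestamp are assumptions; the interface description does not pin these details down.

```python
import hashlib
import json
import random
import time


def make_md5_token():
    """Return the hex MD5 of "<random value>_<current timestamp>".

    Hypothetical helper: the random range and the second-resolution
    timestamp are assumptions, not part of the interface description.
    """
    raw = '%d_%d' % (random.randint(0, 999999), int(time.time()))
    return hashlib.md5(raw.encode('utf-8')).hexdigest()


# The sample reply from the interface description. JSON allows "\/" as an
# escaped "/", so the decoded path contains plain slashes.
sample_reply = r'{"success": true, "uploadName": "tmp.xml", "uploadPath": "uploads\/201311\/758e875fb7c7a508feef6b5036119b9f"}'
reply = json.loads(sample_reply)

token = make_md5_token()
print(len(token))
print(reply['success'], reply['uploadPath'])
```

The digest is always 32 hexadecimal characters, so it fits in a plain form field.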
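Since I mostly use Python at work and the project already used the requests library in places, I planned to implement this with requests. I expected it to be a trivial feature, but it ended up taking a lot of time: the official requests example only covers uploading a file and does not show how to pass additional parameters:
https://2.python-requests.org//en/master/user/quickstart/#post-a-multipart-encoded-file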
    >>> url = 'https://httpbin.org/post'
    >>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}
    >>> r = requests.post(url, files=files)
    >>> r.text
    {
      ...
      "files": {
        "file": "<censored...binary...data>"
      },
      ...
    }
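When extra parameters have to be sent as well, two arguments of requests come into play: data and files. Pass the file to upload in files and the other parameters in data; the requests library merges the two into a single multipart body and sends it to the server.
The final implementation looks like this: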
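    import hashlib
    import os
    import time
    import requests

    # file_name, request_url and MyLogger are defined elsewhere in the project.
    # hashlib replaces the Python 2-only md5 module used originally.
    token = '%d_%d' % (0, int(time.time()))
    request_data = {
        'md5': hashlib.md5(token.encode('utf-8')).hexdigest(),
        'filesize': os.path.getsize(file_name),  # size in bytes
    }
    MyLogger().getlogger().info('url:%s' % request_url)
    # Open in binary mode and close the handle once the upload finishes.
    with open(file_name, 'rb') as f:
        files = {'file': (file_name, f)}
        resp = requests.post(request_url, data=request_data, files=files)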
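One way to check the merged result without contacting the server is to build the request and stop before sending. The sketch below relies on requests.Request(...).prepare(), which is part of the public requests API; the field values and the in-memory file content are made-up placeholders.

```python
import requests

# Placeholder values for illustration only.
request_data = {'md5': '0123456789abcdef0123456789abcdef', 'filesize': 6}
files = {'file': ('tmp.xml', b'<xml/>')}  # bytes are accepted in place of a file object

prepared = requests.Request(
    'POST', 'http://test.com/flow/upload', data=request_data, files=files
).prepare()

# Nothing has been sent; we only look at what would go on the wire.
content_type = prepared.headers['Content-Type']
body = prepared.body
print(content_type)
print(body.decode('utf-8'))
```

The printed body should show three parts under one boundary: md5, filesize, and the file with its filename.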
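The final code may look simple, but it took me considerable effort to confirm that this approach works, and along the way I read through the requests source. Here is a record of that reading.
First, find the implementation of the post method, in requests/api.py: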
    def post(url, data=None, json=None, **kwargs):
        r"""Sends a POST request.

        :param url: URL for the new :class:`Request` object.
        :param data: (optional) Dictionary, list of tuples, bytes, or file-like
            object to send in the body of the :class:`Request`.
        :param json: (optional) json data to send in the body of the :class:`Request`.
        :param \*\*kwargs: Optional arguments that ``request`` takes.
        :return: :class:`Response <Response>` object
        :rtype: requests.Response
        """
        return request('post', url, data=data, json=json, **kwargs)
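Note that files does not appear in the signature of post; it travels inside **kwargs. The toy stand-ins below (not the real requests functions) only illustrate that forwarding pattern.

```python
def request(method, url, **kwargs):
    # Stand-in for requests.api.request: just report what it received.
    return {'method': method, 'url': url, 'kwargs': kwargs}


def post(url, data=None, json=None, **kwargs):
    # Same shape as requests.api.post: data and json are named,
    # everything else (files, headers, timeout, ...) rides in **kwargs.
    return request('post', url, data=data, json=json, **kwargs)


received = post('http://test.com/flow/upload',
                data={'filesize': 6},
                files={'file': ('tmp.xml', b'<xml/>')})
print(sorted(received['kwargs']))
```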
Here we can see that it calls the request method. Let's follow it, also in requests/api.py:
    def request(method, url, **kwargs):
        """Constructs and sends a :class:`Request <Request>`.

        :param method: method for the new :class:`Request` object: ``GET``, ``OPTIONS``, ``HEAD``, ``POST``, ``PUT``, ``PATCH``, or ``DELETE``.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary, list of tuples or bytes to send
            in the query string for the :class:`Request`.
        :param data: (optional) Dictionary, list of tuples, bytes, or file-like
            object to send in the body of the :class:`Request`.
        :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
        :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
            ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
            or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
            defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
            to add for the file.
        :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) How many seconds to wait for the server to send data
            before giving up, as a float, or a :ref:`(connect timeout, read
            timeout) <timeouts>` tuple.
        :type timeout: float or tuple
        :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
        :type allow_redirects: bool
        :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
        :param verify: (optional) Either a boolean, in which case it controls whether we verify
                the server's TLS certificate, or a string, in which case it must be a path
                to a CA bundle to use. Defaults to ``True``.
        :param stream: (optional) if ``False``, the response content will be immediately downloaded.
        :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
        :return: :class:`Response <Response>` object
        :rtype: requests.Response

        Usage::

          >>> import requests
          >>> req = requests.request('GET', 'https://httpbin.org/get')
          <Response [200]>
        """

        # By using the 'with' statement we are sure the session is closed, thus we
        # avoid leaving sockets open which can trigger a ResourceWarning in some
        # cases, and look like a memory leak in others.
        with sessions.Session() as session:
            return session.request(method=method, url=url, **kwargs)
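The file-tuple forms described in the docstring can be compared side by side. This is a small sketch using requests.Request(...).prepare() so that no network access is needed; the file name, content, and the Expires header are placeholder values.

```python
import requests

URL = 'https://httpbin.org/post'  # never contacted; the request is only prepared


def multipart_body(file_value):
    """Prepare a POST with one file entry and return the encoded body as text."""
    prepared = requests.Request('POST', URL, files={'file': file_value}).prepare()
    return prepared.body.decode('utf-8')


two = multipart_body(('a.xml', b'<a/>'))
three = multipart_body(('a.xml', b'<a/>', 'application/xml'))
four = multipart_body(('a.xml', b'<a/>', 'application/xml', {'Expires': '0'}))

# The 2-tuple carries no per-part content type; the 3-tuple adds one;
# the 4-tuple additionally adds custom per-part headers.
print('Content-Type: application/xml' in two)
print('Content-Type: application/xml' in three)
print('Expires: 0' in four)
```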
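This method is heavily commented, and the docstring already shows that the files parameter is used to send files, but it still does not say what to do when parameters and a file must be sent together. Continue into session.request, in requests/sessions.py: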
    def request(self, method, url,
            params=None, data=None, headers=None, cookies=None, files=None,
            auth=None, timeout=None, allow_redirects=True, proxies=None,
            hooks=None, stream=None, verify=None, cert=None, json=None):
        """Constructs a :class:`Request <Request>`, prepares it and sends it.
        Returns :class:`Response <Response>` object.

        :param method: method for the new :class:`Request` object.
        :param url: URL for the new :class:`Request` object.
        :param params: (optional) Dictionary or bytes to be sent in the query
            string for the :class:`Request`.
        :param data: (optional) Dictionary, list of tuples, bytes, or file-like
            object to send in the body of the :class:`Request`.
        :param json: (optional) json to send in the body of the
            :class:`Request`.
        :param headers: (optional) Dictionary of HTTP Headers to send with the
            :class:`Request`.
        :param cookies: (optional) Dict or CookieJar object to send with the
            :class:`Request`.
        :param files: (optional) Dictionary of ``'filename': file-like-objects``
            for multipart encoding upload.
        :param auth: (optional) Auth tuple or callable to enable
            Basic/Digest/Custom HTTP Auth.
        :param timeout: (optional) How long to wait for the server to send
            data before giving up, as a float, or a :ref:`(connect timeout,
            read timeout) <timeouts>` tuple.
        :type timeout: float or tuple
        :param allow_redirects: (optional) Set to True by default.
        :type allow_redirects: bool
        :param proxies: (optional) Dictionary mapping protocol or protocol and
            hostname to the URL of the proxy.
        :param stream: (optional) whether to immediately download the response
            content. Defaults to ``False``.
        :param verify: (optional) Either a boolean, in which case it controls whether we verify
            the server's TLS certificate, or a string, in which case it must be a path
            to a CA bundle to use. Defaults to ``True``.
        :param cert: (optional) if String, path to ssl client cert file (.pem).
            If Tuple, ('cert', 'key') pair.
        :rtype: requests.Response
        """
        # Create the Request.
        req = Request(
            method=method.upper(),
            url=url,
            headers=headers,
            files=files,
            data=data or {},
            json=json,
            params=params or {},
            auth=auth,
            cookies=cookies,
            hooks=hooks,
        )
        prep = self.prepare_request(req)

        proxies = proxies or {}

        settings = self.merge_environment_settings(
            prep.url, proxies, stream, verify, cert
        )

        # Send the request.
        send_kwargs = {
            'timeout': timeout,
            'allow_redirects': allow_redirects,
        }
        send_kwargs.update(settings)
        resp = self.send(prep, **send_kwargs)

        return resp
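The same three steps can be carried out by hand: build a Request, let the session prepare it, and (optionally) send it. In the sketch below the send call is left commented out so nothing goes over the network; the URL and form values are placeholders.

```python
import requests

with requests.Session() as session:
    # Step 1: describe the request.
    req = requests.Request(
        method='post',
        url='http://test.com/flow/upload',
        data={'filesize': 6},
        files={'file': ('tmp.xml', b'<xml/>')},
    )
    # Step 2: turn it into a PreparedRequest (body and headers are built here).
    prep = session.prepare_request(req)
    # Step 3 would be: resp = session.send(prep, timeout=10)

print(type(prep).__name__, prep.method)
```

Splitting the steps this way is handy for debugging, because the prepared body and headers can be inspected before anything is transmitted.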
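Skimming this method: it first prepares the request, and its last step is a call to send, which presumably sends the request. So the next place to look is prepare_request, in requests/sessions.py: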
    def prepare_request(self, request):
        """Constructs a :class:`PreparedRequest <PreparedRequest>` for
        transmission and returns it. The :class:`PreparedRequest` has settings
        merged from the :class:`Request <Request>` instance and those of the
        :class:`Session`.

        :param request: :class:`Request` instance to prepare with this
            session's settings.
        :rtype: requests.PreparedRequest
        """
        cookies = request.cookies or {}

        # Bootstrap CookieJar.
        if not isinstance(cookies, cookielib.CookieJar):
            cookies = cookiejar_from_dict(cookies)

        # Merge with session cookies
        merged_cookies = merge_cookies(
            merge_cookies(RequestsCookieJar(), self.cookies), cookies)

        # Set environment's basic authentication if not explicitly set.
        auth = request.auth
        if self.trust_env and not auth and not self.auth:
            auth = get_netrc_auth(request.url)

        p = PreparedRequest()
        p.prepare(
            method=request.method.upper(),
            url=request.url,
            files=request.files,
            data=request.data,
            json=request.json,
            headers=merge_setting(request.headers, self.headers, dict_class=CaseInsensitiveDict),
            params=merge_setting(request.params, self.params),
            auth=merge_setting(auth, self.auth),
            cookies=merged_cookies,
            hooks=merge_hooks(request.hooks, self.hooks),
        )
        return p
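The merge_setting calls mean that session-level headers and params are combined with the per-request ones, while files and data are passed through untouched. A small sketch, assuming made-up header and parameter names (X-Token, X-Trace, source, debug):

```python
import requests

session = requests.Session()
session.headers['X-Token'] = 'session-level'   # hypothetical header
session.params = {'source': 'cli'}             # hypothetical query parameter

req = requests.Request(
    'POST',
    'http://test.com/flow/upload',
    headers={'X-Trace': 'request-level'},
    params={'debug': '1'},
    data={'filesize': 6},
)
prep = session.prepare_request(req)
session.close()

print(prep.headers['X-Token'], prep.headers['X-Trace'])
print(prep.url)
```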
prepare_request creates a PreparedRequest object and calls its prepare method. Following prepare, in requests/models.py:
    def prepare(self,
            method=None, url=None, headers=None, files=None, data=None,
            params=None, auth=None, cookies=None, hooks=None, json=None):
        """Prepares the entire request with the given parameters."""

        self.prepare_method(method)
        self.prepare_url(url, params)
        self.prepare_headers(headers)
        self.prepare_cookies(cookies)
        self.prepare_body(data, files, json)
        self.prepare_auth(auth, url)

        # Note that prepare_auth must be last to enable authentication schemes
        # such as OAuth to work on a fully prepared request.

        # This MUST go after prepare_auth. Authenticators could add a hook
        self.prepare_hooks(hooks)
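The comment about prepare_auth running after prepare_body matters for auth schemes that sign the body. A minimal sketch, assuming a made-up X-Body-MD5 header and a hypothetical BodyDigestAuth class built on requests.auth.AuthBase:

```python
import hashlib

import requests
from requests.auth import AuthBase


class BodyDigestAuth(AuthBase):
    """Hypothetical auth hook that adds an MD5 digest of the prepared body."""

    def __call__(self, r):
        body = r.body or b''
        if isinstance(body, str):  # form-encoded bodies are str, multipart bodies are bytes
            body = body.encode('utf-8')
        r.headers['X-Body-MD5'] = hashlib.md5(body).hexdigest()
        return r


prepared = requests.Request(
    'POST', 'http://test.com/flow/upload',
    data={'filesize': '6'}, auth=BodyDigestAuth(),
).prepare()

print(prepared.body)
print(prepared.headers['X-Body-MD5'])
```

Because the body is already encoded when the auth object is called, the digest covers exactly what would be sent.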
This calls a number of prepare_xx methods. We only care about the one that handles data, files, and json, so follow prepare_body, in requests/models.py:
    def prepare_body(self, data, files, json=None):
        """Prepares the given HTTP body data."""

        # Check if file, fo, generator, iterator.
        # If not, run through normal process.

        # Nottin' on you.
        body = None
        content_type = None

        if not data and json is not None:
            # urllib3 requires a bytes-like body. Python 2's json.dumps
            # provides this natively, but Python 3 gives a Unicode string.
            content_type = 'application/json'
            body = complexjson.dumps(json)
            if not isinstance(body, bytes):
                body = body.encode('utf-8')

        is_stream = all([
            hasattr(data, '__iter__'),
            not isinstance(data, (basestring, list, tuple, Mapping))
        ])

        try:
            length = super_len(data)
        except (TypeError, AttributeError, UnsupportedOperation):
            length = None

        if is_stream:
            body = data

            if getattr(body, 'tell', None) is not None:
                # Record the current file position before reading.
                # This will allow us to rewind a file in the event
                # of a redirect.
                try:
                    self._body_position = body.tell()
                except (IOError, OSError):
                    # This differentiates from None, allowing us to catch
                    # a failed `tell()` later when trying to rewind the body
                    self._body_position = object()

            if files:
                raise NotImplementedError('Streamed bodies and files are mutually exclusive.')

            if length:
                self.headers['Content-Length'] = builtin_str(length)
            else:
                self.headers['Transfer-Encoding'] = 'chunked'
        else:
            # Multi-part file uploads.
            if files:
                (body, content_type) = self._encode_files(files, data)
            else:
                if data:
                    body = self._encode_params(data)
                    if isinstance(data, basestring) or hasattr(data, 'read'):
                        content_type = None
                    else:
                        content_type = 'application/x-www-form-urlencoded'

            self.prepare_content_length(body)

            # Add content-type if it wasn't explicitly provided.
            if content_type and ('content-type' not in self.headers):
                self.headers['Content-Type'] = content_type

        self.body = body
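The branches of prepare_body can be observed from the outside by preparing a few requests and looking at the resulting Content-Type. The sketch below uses placeholder payloads and does not send anything.

```python
import io

import requests

URL = 'http://test.com/flow/upload'  # placeholder; requests are only prepared


def content_type_of(**kwargs):
    """Prepare a POST with the given arguments and return its Content-Type."""
    return requests.Request('POST', URL, **kwargs).prepare().headers.get('Content-Type')


json_ct = content_type_of(json={'a': 1})
form_ct = content_type_of(data={'a': '1'})
multi_ct = content_type_of(data={'a': '1'}, files={'file': ('t.xml', b'<t/>')})

# A file-like object passed as data counts as a stream, and a stream
# cannot be combined with files.
try:
    content_type_of(data=io.BytesIO(b'raw'), files={'file': ('t.xml', b'<t/>')})
    stream_error = None
except NotImplementedError as exc:
    stream_error = str(exc)

print(json_ct, form_ct, multi_ct.split(';')[0], stream_error, sep='\n')
```

In other words, data must be a dict or a list of tuples (not an open file or generator) when it accompanies files.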
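This function is fairly long. The part to focus on is the "if files:" branch under the "Multi-part file uploads" comment, which calls the _encode_files method. Following that method: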
    def _encode_files(files, data):
        """Build the body for a multipart/form-data request.

        Will successfully encode files when passed as a dict or a list of
        tuples. Order is retained if data is a list of tuples but arbitrary
        if parameters are supplied as a dict.
        The tuples may be 2-tuples (filename, fileobj), 3-tuples (filename, fileobj, contentype)
        or 4-tuples (filename, fileobj, contentype, custom_headers).
        """
        if (not files):
            raise ValueError("Files must be provided.")
        elif isinstance(data, basestring):
            raise ValueError("Data must not be a string.")

        new_fields = []
        fields = to_key_val_list(data or {})
        files = to_key_val_list(files or {})

        for field, val in fields:
            if isinstance(val, basestring) or not hasattr(val, '__iter__'):
                val = [val]
            for v in val:
                if v is not None:
                    # Don't call str() on bytestrings: in Py3 it all goes wrong.
                    if not isinstance(v, bytes):
                        v = str(v)

                    new_fields.append(
                        (field.decode('utf-8') if isinstance(field, bytes) else field,
                         v.encode('utf-8') if isinstance(v, str) else v))

        for (k, v) in files:
            # support for explicit filename
            ft = None
            fh = None
            if isinstance(v, (tuple, list)):
                if len(v) == 2:
                    fn, fp = v
                elif len(v) == 3:
                    fn, fp, ft = v
                else:
                    fn, fp, ft, fh = v
            else:
                fn = guess_filename(v) or k
                fp = v

            if isinstance(fp, (str, bytes, bytearray)):
                fdata = fp
            elif hasattr(fp, 'read'):
                fdata = fp.read()
            elif fp is None:
                continue
            else:
                fdata = fp

            rf = RequestField(name=k, data=fdata, filename=fn, headers=fh)
            rf.make_multipart(content_type=ft)
            new_fields.append(rf)

        body, content_type = encode_multipart_formdata(new_fields)

        return body, content_type
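To make the merge concrete, here is a simplified, standard-library-only imitation of what _encode_files and encode_multipart_formdata produce together. It is not the code that requests or urllib3 actually run: the function name encode_multipart, the fixed boundary, and the absence of header escaping are all simplifications made for illustration.

```python
def encode_multipart(data, files, boundary='example-boundary'):
    """Encode form fields and files into one multipart/form-data body.

    data:  dict of field name -> value (or list of values)
    files: dict of field name -> (filename, content_bytes)
    """
    parts = []
    # Plain form fields first, one part per value (lists fan out).
    for name, value in data.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            if v is None:
                continue
            payload = v if isinstance(v, bytes) else str(v).encode('utf-8')
            header = 'Content-Disposition: form-data; name="%s"' % name
            parts.append((header, payload))
    # Then the file parts, which additionally carry a filename.
    for name, (filename, content) in files.items():
        header = 'Content-Disposition: form-data; name="%s"; filename="%s"' % (name, filename)
        parts.append((header, content))

    delimiter = b'--' + boundary.encode('ascii')
    chunks = []
    for header, payload in parts:
        chunks += [delimiter, header.encode('utf-8'), b'', payload]
    chunks += [delimiter + b'--', b'']
    body = b'\r\n'.join(chunks)
    return body, 'multipart/form-data; boundary=%s' % boundary


body, content_type = encode_multipart(
    {'md5': 'abc', 'filesize': 6},
    {'file': ('tmp.xml', b'<xml/>')},
)
print(content_type)
print(body.decode('utf-8'))
```

The data fields and the file end up as sibling parts of the same body, which is why the server sees md5, filesize, and file in a single form submission.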
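That's it. After reading this code carefully, the roles of the data and files arguments to requests.post become clear: requests turns the data entries into ordinary form fields and the files entries into file parts, merges them here, and uses the result as the body of the POST.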
Original link: https://www.cnblogs.com/lit10050528/p/11285600.html