Extracting information from tuples (Python)

Time: 2021-04-18 07:15:58

I'm currently using the httplib library in Python 2.7 to obtain some headers from a website to establish a) the file size of a download and b) the last-modified date of the file. I've checked with some online tools, and the server does send these details.

My Python script appears to work correctly and brings back the required information. However, the response containing the header information is a list of tuples. A sample of the response is below:

[('content-length', '2501479'),
 ('accept-ranges', 'bytes'),
 ('vary', 'Accept-Encoding'),
 ('server', 'off'),
 ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT'),
 ('etag', '"2c8171a-262b67-4afb368edfffc"'),
 ('date', 'Thu, 20 Oct 2011 16:01:11 GMT'),
 ('content-type', 'text/plain')]

What I am looking to do is extract just the file size ("2501479") and the last-modified date ("Thu, 20 Oct 2011 04:30:01 GMT"). Any ideas how I can go about doing this? I originally tried variable[0], but this returns the whole tuple, ('content-length', '2501479'). How can I return only the file size (in theory, the second element of the first tuple in the list)?

5 solutions

#1


7  

First, you can make it a little easier to work with by turning your list of tuples into a dictionary:

>>> headers = [('content-length', '2501479'),
...  ('accept-ranges', 'bytes'),
...  ('vary', 'Accept-Encoding'),
...  ('server', 'off'),
...  ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT'),
...  ('etag', '"2c8171a-262b67-4afb368edfffc"'),
...  ('date', 'Thu, 20 Oct 2011 16:01:11 GMT'),
...  ('content-type', 'text/plain')]
>>> 
>>> headers = dict(headers)
>>> int(headers['content-length'])
2501479

For the date, I would parse it with the email.utils.parsedate function, which returns a 9-item time tuple (not a datetime object). The same call works for headers['last-modified']:

>>> import email.utils
>>> email.utils.parsedate(headers['date'])
(2011, 10, 20, 16, 1, 11, 0, 1, -1)
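
If an actual datetime object is more convenient than the time tuple, the first six fields of the tuple can be passed to the datetime constructor. This is only a minimal sketch: it assumes the header value is a well-formed HTTP date in GMT, and it rebuilds a shortened `headers` dict from the question's sample so that it runs on its own.

```python
import email.utils
from datetime import datetime

# Sample values copied from the question; a real response would supply these.
headers = dict([('content-length', '2501479'),
                ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT')])

# parsedate returns None for a string it cannot parse, so check before unpacking.
time_tuple = email.utils.parsedate(headers['last-modified'])
if time_tuple is None:
    raise ValueError('unparseable Last-Modified header')

# Fields 0-5 are year, month, day, hour, minute, second.
last_modified = datetime(*time_tuple[:6])
print(last_modified)
```

The resulting datetime is naive (it carries no timezone information), so it should only be compared with other values that are also in GMT.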

#2


4  

First, convert the tuples into a dict, and then convert the value to int to get a number:

response_tuples = [('content-length', '2501479'), ('accept-ranges', 'bytes')]
response = dict(response_tuples)
try:
    # Header values are strings, so convert to int to get a number
    content_length = int(response['content-length'])
except KeyError:
    raise  # Handle missing content-length here

#3


2  

You simply have to index it again in order to access the element inside the tuple, like this:

length = variable[0][1]
last_mod = variable[4][1]

for size and the date of last modification.

Note: This only works when the indices of content-length and last-modified are always the same.
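
If the order of the headers cannot be relied on, one option is to search the list by header name instead of by position. The sketch below assumes the same list-of-tuples shape as in the question; `header_value` is a hypothetical helper name, not part of httplib.

```python
# Sample header list copied from the question (shortened).
variable = [('content-length', '2501479'),
            ('accept-ranges', 'bytes'),
            ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT')]


def header_value(pairs, wanted, default=None):
    """Return the value of the first (name, value) pair whose name matches."""
    for name, value in pairs:
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == wanted.lower():
            return value
    return default


length = header_value(variable, 'content-length')
last_mod = header_value(variable, 'last-modified')
```

This keeps working if the server reorders its headers, at the cost of a linear scan per lookup.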

#4


0  

You've got tuples inside a list... Luckily, you can index into (or dereference, depending on your terminology) both the same way...

So v = x[0] will give you, as you state, the tuple ('content-length', '2501479'); v[0] will give you 'content-length' and v[1] will give you '2501479' (although you probably want to call int(v[1]) on that, perhaps with some error checking).
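
As a small illustration of that indexing plus the error checking, the tuple can also be unpacked into two names in one step. This is a sketch with a shortened copy of the question's list; treating a non-numeric value as None is an assumption made here, not something the original code does.

```python
# Shortened copy of the header list from the question.
x = [('content-length', '2501479'), ('accept-ranges', 'bytes')]

# Unpack the first tuple into its two parts (same as v[0] and v[1]).
name, value = x[0]

try:
    size = int(value)
except ValueError:
    # The value was not a valid integer string.
    size = None
```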

You may be better off putting that list into a dict, though, so you can be certain you are getting the content length even if the order should ever change.

Thankfully, the syntax is almost the same: it uses the [] operator. However, I am going to leave it to you to look at the Python documentation to see how to convert a list to a dict (can't do everything for you!!)

#5


0  

mas = [('content-length', '2501479'),
 ('accept-ranges', 'bytes'),
 ('vary', 'Accept-Encoding'),
 ('server', 'off'),
 ('last-modified', 'Thu, 20 Oct 2011 04:30:01 GMT'),
 ('etag', '"2c8171a-262b67-4afb368edfffc"'),
 ('date', 'Thu, 20 Oct 2011 16:01:11 GMT'),
 ('content-type', 'text/plain')]
mas = dict(mas)
mas.get('content-length')
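
Unlike the [] operator, dict.get returns None (or a default you supply) when the key is absent instead of raising KeyError. A short sketch of how that might be used, with a sample dict that deliberately lacks content-length; the fallback content type is an arbitrary choice for illustration.

```python
# Sample headers with no content-length entry, to show the missing-key case.
mas = dict([('accept-ranges', 'bytes'), ('content-type', 'text/plain')])

# .get returns None here rather than raising KeyError.
raw = mas.get('content-length')
size = int(raw) if raw is not None else None

# A second argument to .get supplies a default for missing keys.
content_type = mas.get('content-type', 'application/octet-stream')
```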
