[Python Standard Library] The pickle Module

Date: 2022-07-19 22:22:24

pickle: Python object serialization

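This post covers the following:

  1. Introduction to the pickle module

  2. The cPickle module

  3. Functions provided by the pickle module

  4. Notes

  5. Worked examples

1. Introduction to the pickle module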

The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream is converted back into an object hierarchy.

  [Figure: what the pickle module does]

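  So why do we need serialization and deserialization at all?

  1. Easier storage. Serialization turns an in-memory object into a byte stream, which is easy to write to disk; when the data is needed again, the bytes are read back from disk and deserialized to recover the original object. A running Python program produces strings, lists, dictionaries and other data that you may want to keep for later use rather than lose when the process exits or the machine powers off. This is where the pickle module comes in: it converts objects into a format that can be stored or transmitted.

  2. Easier transmission. Two processes communicating remotely may exchange data of any type, but everything travels over the network as a sequence of bytes. The sender has to convert the object into a byte sequence before it can be sent, and the receiver has to convert the byte sequence back into an object.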
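Both use cases rest on the same round trip: object to bytes, then bytes back to object. The following is a minimal sketch assuming Python 3.x; the record dictionary is made-up sample data, and the bytes in payload are what would be written to disk or sent over a socket.

```python
import pickle

# Made-up sample data, used only for illustration.
record = {"name": "alice", "scores": [90, 85.5], "tags": ("a", "b")}

payload = pickle.dumps(record)    # object -> bytes (store or transmit these)
restored = pickle.loads(payload)  # bytes -> a new, equal object

print(type(payload).__name__)
print(restored == record)
```

The restored object is an equal copy, not the same object as the original.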

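2. cPickle: a faster implementation of pickle

  The cPickle module supports serialization and deserialization just like pickle, but the two differ in a couple of ways.

  First, cPickle can be up to 1000 times faster than pickle (according to the official documentation), because it is implemented in C.

  Second, in cPickle, Pickler() and Unpickler() are functions rather than classes, so they cannot be subclassed to customize pickling and unpickling.

  The byte streams produced by pickle and cPickle are interchangeable, however. cPickle exists only in Python 2.x; in Python 3.x it is no longer a separate module, and pickle uses its C implementation automatically when one is available. The pickle module itself is available in both Python 2.x and Python 3.x.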
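Code that has to run on both Python 2.x and Python 3.x often tries the C module first and falls back to pickle. The snippet below is a sketch of that common idiom, not something taken from the original post; the sample dictionary is illustrative only.

```python
try:
    import cPickle as pickle  # Python 2.x: the C implementation
except ImportError:
    import pickle             # Python 3.x: cPickle does not exist as a module

sample = {"a": 11, "b": 22}
restored = pickle.loads(pickle.dumps(sample))
print(restored)
```

Because both modules expose the same dump/dumps/load/loads functions, the rest of the program does not need to know which one was imported.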

3. Functions provided by the pickle module (Python 2.x and Python 3.x)

 Let's first look at the functions Python 2.x provides, then see how the Python 3.x versions differ.

pickle.dump(obj, file[, protocol]) #pickle obj and save it to a file
  Write a pickled representation of obj to the open file object file. This is equivalent to Pickler(file, protocol).dump(obj).
  If the protocol parameter is omitted, protocol 0 is used. #protocols are not discussed in detail here
  file must have a write() method that accepts a single string argument. #note: the file must be opened for writing

  As this shows, dump() pickles an object and writes the result to a file. A program can use it to persist data between runs, for example a table of user accounts.
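The only requirement on the file argument is a suitable write() method, so it does not have to be a real file. The sketch below assumes Python 3.x, where write() receives bytes; ChunkCollector is a hypothetical class invented for this illustration.

```python
import pickle


class ChunkCollector:
    """Hypothetical sink: not a file, just an object with a write() method."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        # pickle hands over bytes-like chunks; keep a bytes copy of each.
        self.chunks.append(bytes(data))
        return len(data)


sink = ChunkCollector()
pickle.dump({"a": 11, "b": 22}, sink)

# Joining the collected chunks gives one complete pickle.
restored = pickle.loads(b"".join(sink.chunks))
print(restored)
```

In practice the argument is usually a file opened in binary write mode or an io.BytesIO buffer.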

pickle.load(file) #read from a file, unpickling the data as it is read
  Read a string from the open file object file and interpret it as a pickle data stream, reconstructing and returning the original object hierarchy. This is equivalent to Unpickler(file).load().
  file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. #the file must be opened for reading; the integer passed to read() is the number of bytes to read, not a number of lines
  This function automatically determines whether the data stream was written in binary mode or not.

  These two functions serialize to and deserialize from files. Don't confuse them with the two functions below.

pickle.dumps(obj[, protocol]) #pickle obj to a string, without writing to a file
  Return the pickled representation of the object as a string, instead of writing it to a file.
  If the protocol parameter is omitted, protocol 0 is used.

pickle.loads(string)
  Read a pickled object hierarchy from a string.

  Summary: 1. To write pickled data to a file and read it back, use dump() and load().

     2. To serialize to and from memory, use dumps() and loads().
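The signatures above all take an optional protocol argument. As a rough illustration (assuming Python 3.x, where pickle.HIGHEST_PROTOCOL is available), the sketch below pickles the same list with protocol 0 and with the highest protocol and checks that both load back to the same value.

```python
import pickle

info = [1, 2, 3, 4, "asd"]

text_form = pickle.dumps(info, 0)                          # oldest, printable protocol
newest_form = pickle.dumps(info, pickle.HIGHEST_PROTOCOL)  # newest binary protocol
also_newest = pickle.dumps(info, -1)                       # a negative value also selects the highest protocol

# For this particular list, the protocol 0 output is plain ASCII text.
print(text_form.decode("ascii"))
print(len(text_form), len(newest_form))

# Whatever protocol was used to write, loads() detects it on its own.
restored = [pickle.loads(blob) for blob in (text_form, newest_form, also_newest)]
```

The protocol only matters when writing; load() and loads() need no protocol argument.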

  Next, the four functions Python 3.x provides. The focus is on how they differ from their Python 2.x counterparts.

pickle.dump(obj, file, protocol=None, *, fix_imports=True) #save to a file
  Write a pickled representation of obj to the open file object file. This is equivalent to Pickler(file, protocol).dump(obj).
  The file argument must have a write() method that accepts a single bytes argument.

pickle.dumps(obj, protocol=None, *, fix_imports=True)
  Return the pickled representation of the object as a bytes object, instead of writing it to a file.

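pickle.load(file, *, fix_imports=True, encoding="ASCII", errors="strict")
  Read a pickled object representation from the open file object file and return the reconstituted object hierarchy specified therein. This is equivalent to Unpickler(file).load().
  The argument file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return bytes.
  Optional keyword arguments are fix_imports, encoding and errors. #pay particular attention to encoding
  The encoding and errors arguments tell pickle how to decode 8-bit string instances pickled by Python 2.x; they default to 'ASCII' and 'strict'. The encoding can be 'bytes' to read these 8-bit string instances as bytes objects.

pickle.loads(bytes_object, *, fix_imports=True, encoding="ASCII", errors="strict")
  Read a pickled object hierarchy from a bytes object and return the reconstituted object hierarchy specified therein. The keyword arguments have the same meaning as in pickle.load().

  Summary: the main difference is that Python 3.x works with bytes where Python 2.x worked with strings. dumps() returns bytes, loads() takes bytes, and the file objects passed to dump() and load() must write and read bytes. load() and loads() also accept fix_imports, encoding and errors, which control how pickles written by Python 2.x are read.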
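To see what the encoding argument does, the sketch below feeds Python 3.x two hand-written protocol 0 pickles of the kind Python 2.x produces for 8-bit str objects. The byte literals are assumptions written out for this illustration, not output captured from a real Python 2.x run.

```python
import pickle

# Hand-written protocol 0 pickle of the Python 2.x str 'abc'.
py2_ascii = b"S'abc'\np0\n."
as_text = pickle.loads(py2_ascii)                     # default encoding="ASCII" gives a str
as_bytes = pickle.loads(py2_ascii, encoding="bytes")  # keep the raw 8-bit data as bytes

# Hand-written protocol 0 pickle of the Python 2.x str '\xe9' (a non-ASCII byte).
py2_latin1 = b"S'\\xe9'\np0\n."
try:
    pickle.loads(py2_latin1)  # the default ASCII decoding cannot handle byte 0xE9
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
decoded = pickle.loads(py2_latin1, encoding="latin1")

print(repr(as_text), repr(as_bytes), ascii_failed, repr(decoded))
```

Which encoding is right depends on how the Python 2.x program produced its strings; "latin1" maps every byte to a character, while "bytes" avoids decoding altogether.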

4. Notes

  What can be pickled and unpickled?

None, True, and False
integers, long integers, floating point numbers, complex numbers
normal and Unicode strings
tuples, lists, sets, and dictionaries containing only picklable objects (so these container types can be nested)
functions defined at the top level of a module
built-in functions defined at the top level of a module
classes that are defined at the top level of a module
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see the section The pickle protocol in the Python documentation for details)
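The list above can be checked directly. The following sketch assumes Python 3.x and uses only standard-library objects (len, math.sqrt, fractions.Fraction) as stand-ins for "built-in function", "module-level function" and "instance of a module-level class"; the lambda stands in for something that is not reachable by name at the top level of a module.

```python
import math
import pickle
from fractions import Fraction

# Nested containers of picklable values round-trip to an equal object.
nested = {"nums": [1, 2.0, 4 + 6j], "pair": ("x", None), "flags": {True, False}}
nested_copy = pickle.loads(pickle.dumps(nested))

# Functions are pickled by reference (module plus name), so the same object comes back.
same_len = pickle.loads(pickle.dumps(len))
same_sqrt = pickle.loads(pickle.dumps(math.sqrt))

# An instance of a class defined at the top level of a module.
third = pickle.loads(pickle.dumps(Fraction(1, 3)))

# A lambda cannot be looked up by name, so pickling it fails.
try:
    pickle.dumps(lambda x: x + 1)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False

print(nested_copy == nested, same_len is len, third, lambda_picklable)
```

Pickling a function or class stores only its name, so the unpickling side must be able to import the same definition.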

5. Worked examples

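#example 1: serialize to memory and back
import pickle

info = [1, 2, 3, 4, 'asd']
data1 = pickle.dumps(info)   # pickled form (a str on Python 2.x, bytes on Python 3.x)
data2 = pickle.loads(data1)  # reconstructed copy of info

print(data1)
print(data2)

#example 2: serialize to a file and back
import pickle

entry = {'a': 11, 'b': 22}
with open('entry.pickle', 'wb') as f:
    pickle.dump(entry, f)  # serialize to the file

with open('entry.pickle', 'rb') as f:
    entry = pickle.load(f)  # deserialize from the file
print(entry)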
#example 3: serialization
import pickle

data1 = {'a': [1, 2.0, 3, 4+6j],
         'b': ('string', u'Unicode string'),
         'c': None}

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)  # the list now contains itself

with open('data', 'wb') as output:
    # Pickle the dictionary using the default protocol (protocol 0 on Python 2.x).
    pickle.dump(data1, output)

    # Pickle the list using the highest protocol available.
    pickle.dump(selfref_list, output, -1)

#example 3: deserialization
import pprint, pickle

with open('data', 'rb') as pkl_file:
    # Objects come back in the order they were written.
    data1 = pickle.load(pkl_file)
    pprint.pprint(data1)

    data2 = pickle.load(pkl_file)
    pprint.pprint(data2)
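#example 4: a stateful object. TextReader is a user-defined class (its definition is not shown here) that numbers the lines it reads from a file.
>>> reader = TextReader("hello.txt")
>>> reader.readline()
'1: Hello world!'
>>> reader.readline()
'2: I am line number two.'
>>> new_reader = pickle.loads(pickle.dumps(reader))
>>> new_reader.readline()
'3: Goodbye!'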
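The post does not include the TextReader class itself. The sketch below is one possible definition, modeled on the stateful-object example in the Python documentation and assuming Python 3.x: __getstate__() drops the open file, which cannot be pickled, and __setstate__() reopens it and skips ahead to the saved line. The temporary directory and file contents are set up here only to make the example self-contained.

```python
import os
import pickle
import shutil
import tempfile


class TextReader:
    """Read a text file line by line, prefixing each line with its number."""

    def __init__(self, filename):
        self.filename = filename
        self.file = open(filename)
        self.lineno = 0

    def readline(self):
        self.lineno += 1
        line = self.file.readline()
        if not line:
            return None
        return "%i: %s" % (self.lineno, line.rstrip("\n"))

    def __getstate__(self):
        # Copy the instance state and drop the file object, which is not picklable.
        state = self.__dict__.copy()
        del state["file"]
        return state

    def __setstate__(self, state):
        # Restore the attributes, then reopen the file and skip the lines already read.
        self.__dict__.update(state)
        f = open(self.filename)
        for _ in range(self.lineno):
            f.readline()
        self.file = f


# Usage: write a three-line file, read two lines, then pickle the reader.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "hello.txt")
with open(path, "w") as f:
    f.write("Hello world!\nI am line number two.\nGoodbye!\n")

reader = TextReader(path)
first = reader.readline()
second = reader.readline()
new_reader = pickle.loads(pickle.dumps(reader))
third = new_reader.readline()
print(third)

reader.file.close()
new_reader.file.close()
shutil.rmtree(tmp_dir)
```

The unpickled copy continues from line 3 because only the filename and line number travel through the pickle; the file handle is rebuilt on the receiving side.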

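  Serialization and deserialization work essentially the same way in Python 2.x and Python 3.x, so the Python 3.x versions of these examples are not repeated here.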

  

Next: [Python Standard Library] The json Module http://www.cnblogs.com/vipchenwei/p/6951455.html