collections: Container Data Types
The collections module provides container data types beyond the built-in list, dict, and tuple: Counter, defaultdict, deque, namedtuple, and OrderedDict. Each is introduced below.
Counter
Initialization:
Counter supports three forms of initialization. Its constructor can be called with a sequence of items, a dictionary mapping keys to counts, or keyword arguments mapping string names to counts.
import collections

print(collections.Counter(['a', 'b', 'c', 'a', 'b', 'b']))
print(collections.Counter({'a': 2, 'b': 3, 'c': 1}))
print(collections.Counter(a=2, b=3, c=1))
Output:
Counter({'b': 3, 'a': 2, 'c': 1})
Counter({'b': 3, 'a': 2, 'c': 1})
Counter({'b': 3, 'a': 2, 'c': 1})
An empty Counter can be constructed with no arguments and populated with the update() method. Unlike dict.update(), it adds to existing counts instead of replacing them.
import collections

c = collections.Counter()
print('Initial :', c)

c.update('abcdaab')
print('Sequence:', c)

c.update({'a': 1, 'd': 5})
print('Dict :', c)
Output:
Initial : Counter()
Sequence: Counter({'a': 3, 'b': 2, 'c': 1, 'd': 1})
Dict : Counter({'d': 6, 'a': 4, 'b': 2, 'c': 1})
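Accessing counts:
Once a Counter is populated, its counts can be read with the dictionary API. A missing key yields a count of 0 instead of raising KeyError.
import collections

c = collections.Counter('abcdaab')

for letter in 'abcde':
    print('%s : %d' % (letter, c[letter]))
Output:
a : 3
b : 2
c : 1
d : 1
e : 0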
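The elements() method returns an iterator that yields each element as many times as its count. Elements with a count below one are skipped.
import collections

c = collections.Counter('extremely')
c['z'] = 0
print(c)
print(list(c.elements()))
Output (Python 3.7 and later, where Counter keeps insertion order; older interpreters may order the entries differently):
Counter({'e': 3, 'x': 1, 't': 1, 'r': 1, 'm': 1, 'l': 1, 'y': 1, 'z': 0})
['e', 'e', 'e', 'x', 't', 'r', 'm', 'l', 'y']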
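most_common(n) returns the n most frequent elements together with their counts, ordered from most to least common.
import collections

c = collections.Counter('aassdddffff')

for letter, count in c.most_common(2):
    print('%s: %d' % (letter, count))
Output:
f: 4
d: 3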
Arithmetic:
Counter instances support arithmetic and set operations for aggregating results.
import collections

c1 = collections.Counter(['a', 'b', 'c', 'a', 'b', 'b'])
c2 = collections.Counter('alphabet')

print('C1:', c1)
print('C2:', c2)

print('\nCombined counts:')
print(c1 + c2)

print('\nSubtraction:')
print(c1 - c2)

print('\nIntersection (taking positive minimums):')
print(c1 & c2)

print('\nUnion (taking maximums):')
print(c1 | c2)
Output:
C1: Counter({'b': 3, 'a': 2, 'c': 1})
C2: Counter({'a': 2, 'l': 1, 'p': 1, 'h': 1, 'b': 1, 'e': 1, 't': 1})

Combined counts:
Counter({'a': 4, 'b': 4, 'c': 1, 'l': 1, 'p': 1, 'h': 1, 'e': 1, 't': 1})

Subtraction:
Counter({'b': 2, 'c': 1})

Intersection (taking positive minimums):
Counter({'a': 2, 'b': 1})

Union (taking maximums):
Counter({'b': 3, 'a': 2, 'c': 1, 'l': 1, 'p': 1, 'h': 1, 'e': 1, 't': 1})
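The subtraction result above has no entry for 'a' because the binary operators keep only positive counts. As a small sketch of the contrast (the variable names diff and signed are chosen here for illustration), the in-place subtract() method keeps zero and negative counts, and unary plus strips them again:

```python
import collections

c1 = collections.Counter(['a', 'b', 'c', 'a', 'b', 'b'])
c2 = collections.Counter('alphabet')

# Binary minus drops every entry whose result is zero or negative.
diff = c1 - c2
print('c1 - c2    :', diff)

# subtract() works in place and keeps zero and negative counts.
signed = c1.copy()
signed.subtract(c2)
print('subtract() :', signed)

# Unary plus removes the non-positive entries again.
print('+signed    :', +signed)
```

Which form fits depends on whether a negative count ("owed" items) carries meaning in the application.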
defaultdict
The standard dictionary includes the setdefault() method for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default up front, when the container is initialized.
import collections

def default_factory():
    return 'default value'

d = collections.defaultdict(default_factory, foo='bar')
print('d:', d)
print('foo =>', d['foo'])
print('x =>', d['x'])
Output:
d: defaultdict(<function default_factory at 0x000002567E713E18>, {'foo': 'bar'})
foo => bar
x => default value
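To make the comparison with setdefault() concrete, here is a minimal sketch that groups the same data both ways. The sample data and the names pairs, plain, and grouped are made up for illustration.

```python
import collections

pairs = [('fruit', 'apple'), ('veg', 'carrot'), ('fruit', 'pear')]

# Plain dict: the default has to be supplied at every call site.
plain = {}
for kind, name in pairs:
    plain.setdefault(kind, []).append(name)

# defaultdict: the factory (list) is given once, at construction time.
grouped = collections.defaultdict(list)
for kind, name in pairs:
    grouped[kind].append(name)

print(plain)
print(dict(grouped))

# Note: merely reading a missing key inserts it with the default value.
before = 'missing' in grouped
grouped['missing']
after = 'missing' in grouped
print(before, after)
```

The last three lines show a side effect worth keeping in mind: a lookup with [] on a defaultdict creates the key, while the in test does not.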
deque
A double-ended queue, or deque, supports adding and removing elements from either end. The more commonly used stacks and queues are degenerate forms of deques, where input and output are restricted to a single end.
import collections

d = collections.deque('abcdefg')
print('Deque:', d)
print('Length:', len(d))
print('Left end:', d[0])
print('Right end:', d[-1])

d.remove('c')
print('remove(c):', d)
Output:
Deque: deque(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
Length: 7
Left end: a
Right end: g
remove(c): deque(['a', 'b', 'd', 'e', 'f', 'g'])
A deque can be populated from either end. Note that extendleft() inserts the items one at a time on the left, so they end up in reverse order.
import collections

# Add to the right
d = collections.deque()
d.extend('abcdefg')
print('extend :', d)
d.append('h')
print('append :', d)

# Add to the left
d = collections.deque()
d.extendleft('abcdefg')
print('extendleft:', d)
d.appendleft('h')
print('appendleft:', d)
Output:
extend : deque(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
append : deque(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
extendleft: deque(['g', 'f', 'e', 'd', 'c', 'b', 'a'])
appendleft: deque(['h', 'g', 'f', 'e', 'd', 'c', 'b', 'a'])
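Similarly, the elements of a deque can be consumed from either end. pop() and popleft() raise IndexError once the deque is empty, which ends each loop here.
import collections

print('From the right:')
d = collections.deque('abcdefg')
while True:
    try:
        print(d.pop())
    except IndexError:
        break

print('\nFrom the left:')
d = collections.deque('abcdefg')
while True:
    try:
        print(d.popleft())
    except IndexError:
        break
Output: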
From the right:
g
f
e
d
c
b
a
From the left:
a
b
c
d
e
f
g
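The earlier remark that stacks and queues are restricted forms of a deque can be sketched as follows. The names stack, queue, stack_order, and queue_order are illustrative only.

```python
import collections

# Stack (LIFO): push and pop at the same (right) end.
stack = collections.deque()
for item in 'abc':
    stack.append(item)
stack_order = [stack.pop() for _ in range(len(stack))]
print('Stack order:', stack_order)

# Queue (FIFO): push on the right, pop from the left.
queue = collections.deque()
for item in 'abc':
    queue.append(item)
queue_order = [queue.popleft() for _ in range(len(queue))]
print('Queue order:', queue_order)
```

A plain list can serve as a stack too, but removing from the front of a list shifts every remaining element, which is why a deque is the usual choice for the queue case.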
Because appends and pops on a deque are thread-safe, its contents can even be consumed from both ends at the same time by separate threads. The exact interleaving of the two threads may differ from run to run.
import collections
import threading
import time

candle = collections.deque(range(11))

def burn(direction, next_source):
    # Keep taking items from one end until the deque is empty.
    while True:
        try:
            item = next_source()
        except IndexError:
            break
        else:
            print('%8s: %s' % (direction, item))
            time.sleep(0.1)
    print('%8s done' % direction)

left = threading.Thread(target=burn, args=('Left', candle.popleft))
right = threading.Thread(target=burn, args=('Right', candle.pop))

left.start()
right.start()

left.join()
right.join()
Output:
    Left: 0
   Right: 10
   Right: 9
    Left: 1
   Right: 8
    Left: 2
   Right: 7
    Left: 3
   Right: 6
    Left: 4
   Right: 5
    Left done
   Right done
Another useful capability of the deque is rotating it in either direction, in other words shifting its items left or right with wrap-around. A positive argument rotates to the right, a negative one to the left.
import collections

d = collections.deque(range(10))
print('Normal :', d)

d = collections.deque(range(10))
d.rotate(2)
print('Right rotation:', d)

d = collections.deque(range(10))
d.rotate(-2)
print('Left rotation :', d)
Output:
Normal : deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Right rotation: deque([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
Left rotation : deque([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])
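One way to picture rotation is as repeated pops from one end pushed onto the other. The helper rotate_by_hand below is a hypothetical illustration of that idea, not how the standard library implements rotate().

```python
import collections

def rotate_by_hand(d, n):
    """Illustrative equivalent of d.rotate(n); mutates and returns d."""
    if not d:
        return d
    for _ in range(abs(n)):
        if n > 0:
            # Right rotation: the last item wraps around to the front.
            d.appendleft(d.pop())
        else:
            # Left rotation: the first item wraps around to the back.
            d.append(d.popleft())
    return d

right_builtin = collections.deque(range(10))
right_builtin.rotate(2)
right_manual = rotate_by_hand(collections.deque(range(10)), 2)
print(right_builtin == right_manual)

left_builtin = collections.deque(range(10))
left_builtin.rotate(-2)
left_manual = rotate_by_hand(collections.deque(range(10)), -2)
print(left_builtin == left_manual)
```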
namedtuple
The standard tuple uses numerical indexes to access its members.
bob = ('Bob', 30, 'male')
print('Representation:', bob)

jane = ('Jane', 29, 'female')
print('\nField by index:', jane[0])

print('\nFields by index:')
for p in [bob, jane]:
    print('%s is a %d year old %s' % p)
Output:
Representation: ('Bob', 30, 'male')

Field by index: Jane

Fields by index:
Bob is a 30 year old male
Jane is a 29 year old female
Remembering which index holds which value is error-prone, especially when the tuple has many fields. namedtuple assigns a name to each member in addition to its numerical index.
import collections

Person = collections.namedtuple('Person', 'name age gender')

print('Type of Person:', type(Person))

bob = Person(name='Bob', age=30, gender='male')
print('\nRepresentation:', bob)

jane = Person(name='Jane', age=29, gender='female')
print('\nField by name:', jane.name)

print('\nFields by index:')
for p in [bob, jane]:
    print('%s is a %d year old %s' % p)
Output:
Type of Person: <class 'type'>

Representation: Person(name='Bob', age=30, gender='male')

Field by name: Jane

Fields by index:
Bob is a 30 year old male
Jane is a 29 year old female
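As a short sketch of what the generated class offers beyond field names: instances remain ordinary immutable tuples, and the documented helpers _fields and _replace() expose the field list and produce modified copies. The variable names below are illustrative.

```python
import collections

Person = collections.namedtuple('Person', 'name age gender')
bob = Person(name='Bob', age=30, gender='male')

# A namedtuple instance is still a tuple: unpacking and indexing work.
name, age, gender = bob
print(isinstance(bob, tuple), bob[0], name)

# Fields are read-only; assignment raises AttributeError.
try:
    bob.age = 31
    mutated = True
except AttributeError:
    mutated = False
print('Mutated in place:', mutated)

# _replace() returns a new instance and leaves the original untouched.
older = bob._replace(age=31)
print(bob.age, older.age)

# _fields lists the field names in order.
fields = Person._fields
print(fields)
```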
Field names are validated as they are parsed. An invalid name, such as a Python keyword or a duplicate, raises ValueError.
import collections

try:
    collections.namedtuple('Person', 'name class age gender')
except ValueError as err:
    print(err)

try:
    collections.namedtuple('Person', 'name age gender age')
except ValueError as err:
    print(err)
Output:
Type names and field names cannot be a keyword: 'class'
Encountered duplicate field name: 'age'
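When the field names come from a source outside the program's control, such as column headers, the rename=True option of namedtuple() can replace invalid names with positional ones instead of raising ValueError. A minimal sketch, combining both invalid cases from above in one field list:

```python
import collections

# 'class' is a keyword and the second 'age' is a duplicate; with
# rename=True each is replaced by an underscore plus its position.
Person = collections.namedtuple(
    'Person', 'name class age gender age', rename=True)
renamed_fields = Person._fields
print(renamed_fields)
```

The replacement names are derived from the field's index, so 'class' at position 1 becomes '_1' and the repeated 'age' at position 4 becomes '_4'.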
OrderedDict
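OrderedDict is a dict subclass that remembers the order in which its contents were added. The regular-dictionary output shown here comes from an older interpreter, where dict iteration order was arbitrary; since Python 3.7 the built-in dict also preserves insertion order and prints a through e in order.
import collections

print('Regular dictionary:')
d = {}
d['a'] = 'A'
d['b'] = 'B'
d['c'] = 'C'
d['d'] = 'D'
d['e'] = 'E'

for k, v in d.items():
    print(k, v)

print('\nOrderedDict:')
d = collections.OrderedDict()
d['a'] = 'A'
d['b'] = 'B'
d['c'] = 'C'
d['d'] = 'D'
d['e'] = 'E'

for k, v in d.items():
    print(k, v)
Output: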
Regular dictionary:
a A
c C
b B
e E
d D
OrderedDict:
a A
b B
c C
d D
e E
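On current Python versions, where the built-in dict keeps insertion order as well, the remaining differences are mostly about behavior. The sketch below shows two of them: reordering with move_to_end() and popitem(last=False), and order-sensitive equality. The variable names are illustrative.

```python
import collections

od = collections.OrderedDict.fromkeys('abcde')

# Move an existing key to the right end, or to the left end.
od.move_to_end('b')
after_move_right = list(od)
od.move_to_end('e', last=False)
after_move_left = list(od)
print(after_move_right, after_move_left)

# popitem(last=False) removes from the front (FIFO order).
first = od.popitem(last=False)
print(first)

# Two OrderedDicts compare equal only if the order matches too.
d1 = collections.OrderedDict([('a', 1), ('b', 2)])
d2 = collections.OrderedDict([('b', 2), ('a', 1)])
ordered_equal = d1 == d2
plain_equal = dict(d1) == dict(d2)
print(ordered_equal, plain_equal)
```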
Source: https://pymotw.com/2/collections/index.html#module-collections