II. collections
The collections module supplements Python's built-in data types. Before using any of its objects, import the module with import collections.
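1. Counter: a counter
1.1 Counter overview and definition
Counter supplements dict. It is a dict subclass, so it has every dict method plus a number of counting features.
Defining a Counter object
Counter accepts an iterable such as a list, tuple, or string and returns a dict-like object whose keys are the members and whose values are the number of times each member occurs. Its repr lists the entries from the most common to the least common.
>>> c = collections.Counter("adfawreqewradfa")
>>> c
Counter({'a': 4, 'd': 2, 'e': 2, 'r': 2, 'w': 2, 'f': 2, 'q': 1})
>>> c2 = collections.Counter(['zhang', 'tom', 'peter', 'zhang'])
>>> c2
Counter({'zhang': 2, 'peter': 1, 'tom': 1})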
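Because Counter is a dict subclass, ordinary dict access works on it, with one difference worth seeing in code: looking up a missing key returns 0 instead of raising KeyError. A minimal sketch, using a made-up sample sentence:

```python
import collections

# Count words; the sample sentence is invented for illustration.
words = "the quick brown fox jumps over the lazy dog the end".split()
word_counts = collections.Counter(words)

# Normal dict-style access works because Counter subclasses dict.
the_count = word_counts["the"]

# A missing key yields 0 rather than KeyError, and the lookup
# does not insert the key into the counter.
missing_count = word_counts["cat"]
has_cat = "cat" in word_counts

print(the_count, missing_count, has_cat)
```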
1.2 Common Counter methods
1) most_common: return the n most common elements and their counts, ordered from the most common to the least
Code:
def most_common(self, n=None):
    '''List the n most common elements and their counts from the most
    common to the least.  If n is None, then list all element counts.

    >>> Counter('abcdeabcdabcaba').most_common(3)
    [('a', 5), ('b', 4), ('c', 3)]

    '''
    # Emulate Bag.sortedByCount from Smalltalk
    if n is None:
        return sorted(self.items(), key=_itemgetter(1), reverse=True)
    return _heapq.nlargest(n, self.items(), key=_itemgetter(1))
Example:
>>> c
Counter({'a': 4, 'd': 2, 'e': 2, 'r': 2, 'w': 2, 'f': 2, 'q': 1})
>>> c.most_common(3)
[('a', 4), ('d', 2), ('e', 2)]
>>> c.most_common(2)
[('a', 4), ('d', 2)]
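As a small complement, the sketch below calls most_common with and without n. The variable names are illustrative only; the input string is the one used in the method's own docstring.

```python
import collections

letters = collections.Counter("abcdeabcdabcaba")

top_two = letters.most_common(2)      # the two most frequent letters
everything = letters.most_common()    # n omitted: every item, most common first
least_common = everything[-1]         # the last entry is the rarest letter

print(top_two)
print(least_common)
```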
2) elements: return all elements as an iterator
Code:
def elements(self):
    '''Iterator over elements repeating each as many times as its count.

    >>> c = Counter('ABCABC')
    >>> sorted(c.elements())
    ['A', 'A', 'B', 'B', 'C', 'C']

    # Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
    >>> prime_factors = Counter({2: 2, 3: 3, 17: 1})
    >>> product = 1
    >>> for factor in prime_factors.elements():     # loop over factors
    ...     product *= factor                       # and multiply them
    >>> product
    1836

    Note, if an element's count has been set to zero or is a negative
    number, elements() will ignore it.

    '''
    # Emulate Bag.do from Smalltalk and Multiset.begin from C++.
    return _chain.from_iterable(_starmap(_repeat, self.items()))
Example:
>>> c = collections.Counter("adfawreqewradfa")
>>> c
Counter({'a': 4, 'd': 2, 'e': 2, 'r': 2, 'w': 2, 'f': 2, 'q': 1})
>>> c.elements()
<itertools.chain object at 0x7f63c8b0beb8>
>>> list(c.elements())
['d', 'd', 'q', 'e', 'e', 'r', 'r', 'w', 'w', 'f', 'f', 'a', 'a', 'a', 'a']
Note: elements() returns an iterator. Convert it to a list with the built-in list(), or iterate over it directly with a for ... in loop.
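The iterator returned by elements() can be consumed only once, and members whose count is zero are skipped. A short sketch with invented stock data shows both points:

```python
import collections

# Hypothetical stock levels; plum has a count of zero.
stock = collections.Counter(apple=2, pear=1, plum=0)

# Each member is repeated as many times as its count; plum is skipped.
expanded = sorted(stock.elements())

# An iterator is exhausted after one pass.
iterator = stock.elements()
first_pass = list(iterator)
second_pass = list(iterator)   # already consumed, so this is empty

print(expanded, len(first_pass), second_pass)
```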
3) update: add counts from an iterable or a mapping; if a member already exists its count is increased, otherwise the member is created
Code:
def update(*args, **kwds):
    '''
    (Author's note: similar to dict.update(), but adding a member also
    increases its count.)
    Like dict.update() but add counts instead of replacing them.

    Source can be an iterable, a dictionary, or another Counter instance.

    >>> c = Counter('which')
    >>> c.update('witch')           # add elements from another iterable
    >>> d = Counter('watch')
    >>> c.update(d)                 # add elements from another counter
    >>> c['h']                      # four 'h' in which, witch, and watch
    4

    '''
    # The regular dict.update() operation makes no sense here because the
    # replace behavior results in the some of original untouched counts
    # being mixed-in with all of the other counts for a mismash that
    # doesn't have a straight-forward interpretation in most counting
    # contexts.  Instead, we implement straight-addition.  Both the inputs
    # and outputs are allowed to contain zero and negative counts.

    if not args:
        raise TypeError("descriptor 'update' of 'Counter' object "
                        "needs an argument")
    self, *args = args
    if len(args) > 1:
        raise TypeError('expected at most 1 arguments, got %d' % len(args))
    iterable = args[0] if args else None
    if iterable is not None:
        if isinstance(iterable, Mapping):
            if self:
                self_get = self.get
                for elem, count in iterable.items():
                    self[elem] = count + self_get(elem, 0)
            else:
                super(Counter, self).update(iterable) # fast path when counter is empty
        else:
            _count_elements(self, iterable)
    if kwds:
        self.update(kwds)
Example:
>>> c = collections.Counter(['zhang', 'peter', 'tom', 'zhang'])
>>> c
Counter({'zhang': 2, 'peter': 1, 'tom': 1})
>>> c.update('peter')
>>> c
Counter({'zhang': 2, 'e': 2, 'peter': 1, 't': 1, 'r': 1, 'tom': 1, 'p': 1})
# Note: the argument is treated as an iterable, so if a string is passed, each of its characters is counted as a separate element
>>> c = collections.Counter(['zhang', 'peter', 'tom', 'zhang'])
>>> c.update(['zhang'])
>>> c
Counter({'zhang': 3, 'peter': 1, 'tom': 1})
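Besides an iterable, update also accepts a mapping or keyword arguments, in which case the given numbers are added to the existing counts. A minimal sketch reusing the names from the session above (the added amounts are arbitrary):

```python
import collections

names = collections.Counter(["zhang", "peter", "tom", "zhang"])

names.update({"peter": 2})   # mapping: add 2 to peter's count
names.update(tom=3)          # keyword argument: add 3 to tom's count
names.update(["peter"])      # iterable of one string: add 1 to peter's count

print(names)
```

Wrapping a single string in a list, as in the last call, avoids the character-by-character counting shown earlier.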
4) subtract: subtract counts; each occurrence of a member in the argument lowers its count by 1
Code:
def subtract(*args, **kwds):
    '''Like dict.update() but subtracts counts instead of replacing them.
    Counts can be reduced below zero.  Both the inputs and outputs are
    allowed to contain zero and negative counts.

    Source can be an iterable, a dictionary, or another Counter instance.

    >>> c = Counter('which')
    >>> c.subtract('witch')             # subtract elements from another iterable
    >>> c.subtract(Counter('watch'))    # subtract elements from another counter
    >>> c['h']                          # 2 in which, minus 1 in witch, minus 1 in watch
    0
    >>> c['w']                          # 1 in which, minus 1 in witch, minus 1 in watch
    -1

    '''
    if not args:
        raise TypeError("descriptor 'subtract' of 'Counter' object "
                        "needs an argument")
    self, *args = args
    if len(args) > 1:
        raise TypeError('expected at most 1 arguments, got %d' % len(args))
    iterable = args[0] if args else None
    if iterable is not None:
        self_get = self.get
        if isinstance(iterable, Mapping):
            for elem, count in iterable.items():
                self[elem] = self_get(elem, 0) - count
        else:
            for elem in iterable:
                self[elem] = self_get(elem, 0) - 1
    if kwds:
        self.subtract(kwds)
Example:
>>> c = collections.Counter(['zhang', 'peter', 'tom', 'zhang'])
>>> c
Counter({'zhang': 2, 'peter': 1, 'tom': 1})
>>> c.subtract(['zhang'])
>>> c
Counter({'peter': 1, 'tom': 1, 'zhang': 1})
>>> c.subtract(['zhang'])
>>> c.subtract(['zhang'])
>>> c
Counter({'peter': 1, 'tom': 1, 'zhang': -1})
Note: when a member's count has already reached 0, subtract keeps decrementing it, so a Counter can hold zero and negative counts. elements() skips such members; a zero or negative count only shows that the member was seen at some point. Likewise, adding to a member whose count is negative does not make it appear in elements() until its count rises above 0.
>>> list(c.elements())
['peter', 'tom']
>>> c.update(['zhang'])
>>> list(c.elements())
['peter', 'tom']
>>> c
Counter({'peter': 1, 'tom': 1, 'zhang': 0})
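To drop the zero and negative entries altogether, one option is the unary plus operator, which returns a new Counter containing only positive counts. The sketch below uses arbitrary letters and counts:

```python
import collections

counts = collections.Counter(a=2, b=1, c=3)
counts.subtract("aaab")          # a: 2 - 3 = -1, b: 1 - 1 = 0, c unchanged

# The keys are all still present, even with zero or negative counts.
all_keys = sorted(counts)

# elements() only yields members with a positive count.
visible = list(counts.elements())

# Unary plus builds a new Counter that keeps only positive counts.
positive_only = +counts

print(all_keys, visible, positive_only)
```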
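2. OrderedDict: an ordered dictionary
2.1 Overview and definition
On the Python version used in this article, a regular dict has no defined order (since Python 3.7 a regular dict also preserves insertion order). OrderedDict extends dict so that items are kept in the order in which they were added.
>>> oc = collections.OrderedDict()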
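An OrderedDict can also be initialized from an existing dict:
>>> old_dic = {'a':1, 'b':2, 'c':3}
>>> new_dic = collections.OrderedDict(old_dic)
>>> new_dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
Note: because old_dic is unordered on this Python version, the initial order of the OrderedDict is not the order in which old_dic was written. Only members added afterwards are guaranteed to keep their insertion order. (On Python 3.7 and later, old_dic itself preserves insertion order, so the result would be a, b, c.)
>>> new_dic['d'] = 4
>>> new_dic['e'] = 5
>>> new_dic
OrderedDict([('b', 2), ('c', 3), ('a', 1), ('d', 4), ('e', 5)])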
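To get a predictable initial order on any Python version, one approach is to build the OrderedDict from a list of (key, value) pairs rather than from a dict. The sketch below also shows that comparing two OrderedDicts is order-sensitive, while comparing with a plain dict is not:

```python
import collections

# A list of pairs has a definite order on every Python version.
pairs = [("a", 1), ("b", 2), ("c", 3)]
od = collections.OrderedDict(pairs)

od["d"] = 4     # a new key goes to the end
od["a"] = 10    # updating an existing key keeps its position
keys_in_order = list(od)

# Equality between two OrderedDicts takes order into account...
same_items_other_order = collections.OrderedDict([("b", 2), ("a", 1)])
ordered_equal = collections.OrderedDict([("a", 1), ("b", 2)]) == same_items_other_order

# ...but comparison with a regular dict ignores order.
plain_equal = same_items_other_order == {"a": 1, "b": 2}

print(keys_in_order, ordered_equal, plain_equal)
```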
2.2 Common methods
1) clear: remove all items from the dictionary
Code:
def clear(self): # real signature unknown; restored from __doc__
    """
    Remove all items from the dictionary
    od.clear() -> None.  Remove all items from od. """
    pass
Example:
>>> dic = collections.OrderedDict({'a':1, 'b':2, 'c':3})
>>> dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
>>> dic.clear()
>>> dic
OrderedDict()
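2) keys: return an iterable view of all keys
Code:
def keys(self, *args, **kwargs): # real signature unknown
    pass
Example:
>>> dic = collections.OrderedDict({'a':1, 'b':2, 'c':3})
>>> dic.keys()
KeysView(OrderedDict([('b', 2), ('c', 3), ('a', 1)]))
Note: keys() returns an iterable view that can be traversed with a for ... in loop. Unlike a regular dict on this Python version, the keys of an OrderedDict come back in insertion order.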
3) values: return an iterable view of all values
Code:
def values(self, *args, **kwargs): # real signature unknown
    pass
Example:
>>> dic = collections.OrderedDict({'a':1, 'b':2, 'c':3})
>>> dic.values()
ValuesView(OrderedDict([('b', 2), ('c', 3), ('a', 1)]))
Note: the values are ordered as well.
4) items: return an iterable view of (key, value) pairs
Code:
def items(self, *args, **kwargs): # real signature unknown
    pass
Example:
>>> dic.items()
ItemsView(OrderedDict([('b', 2), ('c', 3), ('a', 1)]))
5) pop: remove the item with the given key
Code:
def pop(self, k, d=None): # real signature unknown; restored from __doc__
    """
    Remove the item with the given key and return its value
    k: key of the item to remove
    d: default value returned if the key does not exist
    od.pop(k[,d]) -> v, remove specified key and return the corresponding
    value.  If key is not found, d is returned if given, otherwise KeyError
    is raised.
    """
    pass
Example:
>>> dic = collections.OrderedDict({'a':1, 'b':2, 'c':3})
>>> dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
>>> dic.pop('b')
2
>>> dic
OrderedDict([('c', 3), ('a', 1)])
>>> dic.pop('d', 10)
10
6) popitem: remove the last item
Code:
def popitem(self): # real signature unknown; restored from __doc__
    """
    Remove the last item and return its key and value
    od.popitem() -> (k, v), return and remove a (key, value) pair.
    Pairs are returned in LIFO order if last is true or FIFO order if false.
    """
    pass
Example:
>>> dic = collections.OrderedDict({'a':1, 'b':2, 'c':3})
>>> dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
>>> dic.popitem()
('a', 1)
Note: because the dictionary is ordered, popitem() does not remove an arbitrary item, as a regular dict does on this Python version; it removes the item in the last position.
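The docstring mentions a last flag: OrderedDict.popitem(last=True) pops from the end (LIFO), and last=False pops from the front (FIFO). A minimal sketch, built from a list of pairs so that the order is the same on every Python version:

```python
import collections

od = collections.OrderedDict([("a", 1), ("b", 2), ("c", 3)])

newest = od.popitem()             # default last=True: remove the last item
oldest = od.popitem(last=False)   # last=False: remove the first item
remaining = list(od.items())

print(newest, oldest, remaining)
```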
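7) setdefault: get a key's value, setting a default if the key is missing
Code:
def setdefault(self, k, d=None): # real signature unknown; restored from __doc__
    """
    Return the value of key k if it exists; otherwise set od[k] = d and return d
    od.setdefault(k[,d]) -> od.get(k,d), also set od[k]=d if k not in od """
    pass
Example: same as for a regular dict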
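Since no session is shown for setdefault, here is a minimal sketch with arbitrary keys and values. It shows that an existing key is left untouched, while a missing key is inserted at the end with the supplied default:

```python
import collections

od = collections.OrderedDict([("a", 1)])

existing = od.setdefault("a", 100)   # key exists: value is returned unchanged
created = od.setdefault("b", 2)      # key missing: inserted with the default
items_after = list(od.items())

print(existing, created, items_after)
```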
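8) update: merge another dictionary into the current one
Code:
def update(self, *args, **kwargs): # real signature unknown
    pass
Example: same as for a regular dict, except that the OrderedDict keeps its order: existing keys stay in place and new keys are appended at the end.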
9) move_to_end: move an existing item to the end of the dictionary
Code:
def move_to_end(self, *args, **kwargs): # real signature unknown
    """
    Move an item to the end of the dictionary; raises KeyError if the item does not exist
    Move an existing element to the end (or beginning if last==False).

    Raises KeyError if the element does not exist.
    When last=True, acts like a fast version of self[key]=self.pop(key).
    """
    pass
Example:
>>> dic = collections.OrderedDict({'a':1, 'b':2, 'c':3})
>>> dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
>>> dic.move_to_end('b')
>>> dic
OrderedDict([('c', 3), ('a', 1), ('b', 2)])
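As the docstring says, passing last=False moves the item to the beginning instead. A short sketch with arbitrary single-letter keys:

```python
import collections

od = collections.OrderedDict.fromkeys("abcde")

od.move_to_end("b")                 # default last=True: move to the end
after_end = "".join(od)

od.move_to_end("b", last=False)     # last=False: move to the beginning
after_front = "".join(od)

print(after_end, after_front)
```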
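3. defaultdict: a dictionary with default values
defaultdict extends dict. It takes a default factory that supplies the value whenever a missing key is accessed; in every other respect it behaves like a regular dict.
>>> ddic = collections.defaultdict(list)  # the default type must be given at definition time; here it is list
>>> ddic['k1'].append('a')  # 'k1' has no value yet, but it defaults to an empty list, so list.append can be called directly
>>> ddic
defaultdict(<class 'list'>, {'k1': ['a']})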
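Two typical uses are grouping with a list factory and counting with an int factory. The sketch below uses invented sample data, and also shows a point that is easy to miss: reading a missing key with [] creates it, while .get() does not.

```python
import collections

# Group names by category (sample data made up for illustration).
pairs = [("fruit", "apple"), ("veg", "carrot"), ("fruit", "pear")]
groups = collections.defaultdict(list)
for kind, name in pairs:
    groups[kind].append(name)       # a missing key starts as an empty list

# Count characters; int() returns 0, so += works on the first occurrence.
char_counts = collections.defaultdict(int)
for ch in "hello":
    char_counts[ch] += 1

# Indexing a missing key calls the factory and stores the result...
probe = collections.defaultdict(list)
probe["missing"]
created_by_read = "missing" in probe

# ...but .get() behaves as on a regular dict and creates nothing.
absent = probe.get("other")
created_by_get = "other" in probe

print(dict(groups), char_counts["l"], created_by_read, created_by_get)
```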
4. namedtuple: a tuple with named fields
A named tuple extends tuple. It has every tuple method, and each element can also be given a name, so elements can be accessed by name instead of by index.
>>> MytupleClass = collections.namedtuple('MytupleClass',['x', 'y', 'z'])
>>> mytup = MytupleClass(11,22,33)
>>> mytup.x
11
>>> mytup.y
22
>>> mytup.z
33
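A named tuple is still an immutable tuple, so indexing and unpacking keep working, and changes are made by building a new tuple with _replace. A minimal sketch, using a hypothetical Point type:

```python
import collections

# Point is an illustrative type name, not part of the collections module.
Point = collections.namedtuple("Point", ["x", "y", "z"])
p = Point(11, 22, z=33)              # positional and keyword arguments both work

first = p[0]                         # index access still works
x, y, z = p                          # so does unpacking

moved = p._replace(y=0)              # returns a new tuple; p is unchanged
as_mapping = dict(p._asdict())       # field names mapped to values
field_names = Point._fields

# Fields cannot be reassigned in place.
try:
    p.x = 1
    assignment_failed = False
except AttributeError:
    assignment_failed = True

print(moved, as_mapping, field_names, assignment_failed)
```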
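5. deque: a double-ended queue
deque is a double-ended queue. It resembles a list, with two differences: its appends and pops are thread-safe, and elements can be added or removed at both ends.
5.1 Definition
d = collections.deque()
5.2 Common methods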
1) append: add an element to the right end of the deque
Code:
def append(self, *args, **kwargs): # real signature unknown
    """
    Add an element to the right end of the deque
    Add an element to the right side of the deque. """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.append(4)
>>> d
deque([1, 2, 3, 4])
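append behaves differently when the deque was created with the optional maxlen argument: once the deque is full, each append on the right discards one element from the left. A minimal sketch with made-up readings:

```python
import collections

# Keep only the three most recent readings (sample numbers are invented).
recent = collections.deque(maxlen=3)
for reading in [1, 2, 3, 4, 5]:
    recent.append(reading)    # once full, each append drops the leftmost item

print(recent)
```

This makes a bounded deque a convenient way to keep a sliding window of the latest items.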
2) appendleft: add an element to the left end of the deque
Code:
def appendleft(self, *args, **kwargs): # real signature unknown
    """
    Add an element to the left end of the deque
    Add an element to the left side of the deque. """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.appendleft(4)
>>> d
deque([4, 1, 2, 3])
3) clear: remove all elements from the deque
Code:
def clear(self, *args, **kwargs): # real signature unknown
    """
    Empty the deque
    Remove all elements from the deque. """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.clear()
>>> d
deque([])
4) count: return the number of times an element occurs
Code:
def count(self, value): # real signature unknown; restored from __doc__
    """
    Return the number of occurrences of an element
    D.count(value) -> integer -- return number of occurrences of value
    """
    return 0
Example:
>>> d = collections.deque([1, 2, 3, 2])
>>> d.count(2)
2
5) extend: extend the right end of the deque with the elements of an iterable
Code:
def extend(self, *args, **kwargs): # real signature unknown
    """
    Extend the right end of the deque with an iterable
    Extend the right side of the deque with elements from the iterable """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.extend([4, 5])
>>> d
deque([1, 2, 3, 4, 5])
6) extendleft: extend the left end of the deque with the elements of an iterable
Code:
def extendleft(self, *args, **kwargs): # real signature unknown
    """
    Extend the left end of the deque with an iterable
    Extend the left side of the deque with elements from the iterable """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.extendleft([4, 5])
>>> d
deque([5, 4, 1, 2, 3])
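The result is deque([5, 4, 1, 2, 3]) rather than deque([4, 5, 1, 2, 3]) because extendleft behaves like a series of appendleft calls, so the iterable ends up reversed. The sketch below compares the two and shows one way to keep the original order, by reversing the input first:

```python
import collections

extended = collections.deque([1, 2, 3])
extended.extendleft([4, 5])

# Equivalent to calling appendleft once per element, in order.
one_by_one = collections.deque([1, 2, 3])
for item in [4, 5]:
    one_by_one.appendleft(item)

# Reverse the input to keep 4, 5 in their original order.
order_kept = collections.deque([1, 2, 3])
order_kept.extendleft(reversed([4, 5]))

print(extended, one_by_one, order_kept)
```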
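7) index: find an element and return its index
Code:
def index(self, value, start=None, stop=None): # real signature unknown; restored from __doc__
    """
    Look up an element; raises ValueError if it is not present, otherwise returns the index of the first match
    value: the element to look for
    start: index at which the search starts
    stop: index at which the search ends
    D.index(value, [start, [stop]]) -> integer -- return first index of value.
    Raises ValueError if the value is not present.
    """
    return 0
Note: it is used in the same way as list.index (deque.index is available from Python 3.5). Although a deque is double-ended, its indexes are still numbered from left to right.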
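No session is shown for index, so here is a minimal sketch, assuming Python 3.5 or later. It covers the plain lookup, the optional start argument, and the ValueError raised for a missing value:

```python
import collections

d = collections.deque([1, 2, 3, 2])

first_match = d.index(2)        # search from the left end
later_match = d.index(2, 2)     # start searching at index 2

try:
    d.index(9)
    missing_raises = False
except ValueError:
    missing_raises = True

print(first_match, later_match, missing_raises)
```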
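8) insert: insert an element at a given index
deque.insert(i, x) was added in Python 3.5. On earlier versions, such as the interpreter used for the session below, the method does not exist:
>>> d = collections.deque([1, 2, 3])
>>> d.insert(0, 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'collections.deque' object has no attribute 'insert'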
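On Python 3.5 and later the same call succeeds. The sketch below, which assumes such a version, also shows that inserting into a bounded deque that is already full raises IndexError:

```python
import collections

d = collections.deque([1, 2, 3])
d.insert(0, 4)                      # insert 4 at index 0 (Python 3.5+)

# A deque already at its maxlen cannot grow through insert.
bounded = collections.deque([1, 2, 3], maxlen=3)
try:
    bounded.insert(0, 9)
    full_insert_raises = False
except IndexError:
    full_insert_raises = True

print(d, full_insert_raises)
```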
9) pop: remove the rightmost element of the deque and return it
Code:
def pop(self, *args, **kwargs): # real signature unknown
    """
    Remove an element from the right end of the deque and return it
    Remove and return the rightmost element. """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d.pop()
3
10) popleft: remove the leftmost element of the deque and return it
Code:
def popleft(self, *args, **kwargs): # real signature unknown
    """
    Remove an element from the left end of the deque and return it
    Remove and return the leftmost element. """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d.popleft()
1
11) remove: delete an element
Code:
def remove(self, value): # real signature unknown; restored from __doc__
    """
    Search from the left end of the deque and delete the first matching element
    D.remove(value) -- remove first occurrence of value. """
    pass
Example:
>>> d = collections.deque([1, 2, 3, 2])
>>> d
deque([1, 2, 3, 2])
>>> d.remove(2)
>>> d
deque([1, 3, 2])
12) reverse: reverse the deque in place
Code:
def reverse(self): # real signature unknown; restored from __doc__
    """
    Reverse the deque
    D.reverse() -- reverse *IN PLACE* """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d.reverse()
>>> d
deque([3, 2, 1])
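13) rotate: rotate the deque
Rotation is easiest to picture by treating the deque as a ring whose head and tail are joined: rotating shifts every element by the given number of positions, as the accompanying figure illustrates. Put another way, rotate(n) with a positive n takes an element from the right end and appends it to the left end, n times; a negative n moves elements from the left end to the right end instead.
Code:
def rotate(self, *args, **kwargs): # real signature unknown
    """
    Rotate the deque; moves by 1 position by default
    Rotate the deque n steps to the right (default n=1).  If n is negative, rotates left. """
    pass
Example:
>>> d = collections.deque([1, 2, 3, 4, 5])
>>> d
deque([1, 2, 3, 4, 5])
>>> d.rotate(2)
>>> d
deque([4, 5, 1, 2, 3])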
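The sketch below adds the negative case and checks that a right rotation by 2 matches two rounds of popping from the right and appending on the left. The variable names are illustrative only:

```python
import collections

rotated_left = collections.deque([1, 2, 3, 4, 5])
rotated_left.rotate(-2)                 # negative n rotates to the left

rotated_right = collections.deque([1, 2, 3, 4, 5])
rotated_right.rotate(2)                 # positive n rotates to the right

# One step to the right is the same as appendleft(pop()).
manual = collections.deque([1, 2, 3, 4, 5])
for _ in range(2):
    manual.appendleft(manual.pop())

print(rotated_left, rotated_right, manual)
```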