Four common data structures in the standard library's collections module

collections is a module in Python's standard library. It provides several container types that extend or specialize the built-in list, dict and tuple.

Let's talk about the most useful ones today.


deque implements a double-ended queue: elements can be added or removed at either end. You can optionally cap the number of elements it holds with maxlen. In everyday use it behaves much like Python's built-in list.

from collections import deque
a = deque(maxlen=3)

The above code defines a double-ended queue with a maximum length of 3. When you append a fourth element, the element at the opposite end (the oldest one) is discarded automatically.

a = deque(maxlen=3)
a.append(1)  # a = [1]
a.append(2)  # a = [1, 2]
a.append(3)  # a = [1, 2, 3], the deque is now full
a.append(4)  # a = [2, 3, 4], 1 was discarded
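
A common use of a bounded deque is keeping only the most recent items of a stream. The sketch below assumes a hypothetical helper named tail and made-up sample readings; neither comes from the article. It also shows that appendleft on a full deque discards from the right-hand end instead.

```python
from collections import deque

def tail(items, n=3):
    """Return the last n items of an iterable as a list (hypothetical helper)."""
    # A bounded deque silently drops the oldest entry on each overflow.
    return list(deque(items, maxlen=n))

recent = deque(maxlen=3)
for reading in [10, 20, 30, 40, 50]:   # made-up sample data
    recent.append(reading)

snapshot = list(recent)    # [30, 40, 50]: only the newest three remain
recent.appendleft(0)       # full deque: the rightmost item (50) is dropped

print(snapshot)            # [30, 40, 50]
print(list(recent))        # [0, 30, 40]
print(tail(range(10)))     # [7, 8, 9]
```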

Because this is a double-ended queue, you can insert and delete elements at both the head and the tail. Each of these operations runs in O(1) time:

  • append(x) inserts x at the end of the queue
  • appendleft(x) inserts x at the head of the queue
  • pop() removes the element at the end of the queue and returns it
  • popleft() removes the element at the head of the queue and returns it

a = deque(maxlen=10)
a.append(1)       # a = [1]
a.append(2)       # a = [1, 2], insert 2 at the end of the queue
a.appendleft(3)   # a = [3, 1, 2], insert 3 at the head of the queue
x = a.pop()       # a = [3, 1], x = 2, remove the tail element 2
y = a.popleft()   # a = [1], y = 3, remove the head element 3
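
As an illustration of working from both ends at once, here is a minimal sketch of a palindrome check. The function name is_palindrome and the test strings are assumptions made for this example, not part of the article.

```python
from collections import deque

def is_palindrome(text):
    """Check whether text reads the same forwards and backwards.

    Hypothetical helper: ignores case and non-alphanumeric characters.
    """
    chars = deque(ch.lower() for ch in text if ch.isalnum())
    while len(chars) > 1:
        # Both removals are O(1) on a deque.
        if chars.popleft() != chars.pop():
            return False
    return True

print(is_palindrome("Never odd or even"))  # True
print(is_palindrome("collections"))        # False
```

The same loop on a list would use pop(0), which shifts every remaining element and costs O(n) per call.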


namedtuple creates tuple types whose fields can be accessed by name as well as by position, for example:

from collections import namedtuple
Point = namedtuple("Point", ['x','y','z'])
p = Point(3,4,5)
print(p.x, p.y, p.z)  # Output: 3 4 5

The first argument to namedtuple is the name of the new tuple type. The second argument gives the field names, either as a list of strings or as a single string with the names separated by spaces or commas.

Point = namedtuple("Point", "x y z")
Point = namedtuple("Point", "x,y,z")

Instances can be created from positional arguments, from keyword arguments, or from an existing iterable with _make:

p1 = Point(3,4,5)
p2 = Point(x=3, y=4, z=5)
p3 = Point._make([3,4,5])
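
A named tuple is still a regular tuple, so unpacking and indexing keep working, and it stays immutable. The following sketch reuses the Point type from above to show this, along with the _asdict and _replace helper methods; the variable names are chosen only for illustration.

```python
from collections import namedtuple

Point = namedtuple("Point", "x y z")
p = Point(3, 4, 5)

x, y, z = p                 # unpacks like a plain tuple
first = p[0]                # index access still works
as_dict = p._asdict()       # mapping of field name -> value
moved = p._replace(z=0)     # returns a new instance; p is unchanged

try:
    p.x = 10                # fields are read-only
    immutable = False
except AttributeError:
    immutable = True

print(first, as_dict, moved, immutable)
```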

namedtuple also accepts default field values through the defaults parameter (available since Python 3.7):

PointDef = namedtuple("PointDef", "x, y, z", defaults = [0,0,0])
p = PointDef(x=1) # p is (1,0,0)

Defaults are applied to the rightmost fields. If you define three names but provide only two default values, only the last two fields get defaults:

Point = namedtuple("Point", "x y z", defaults=[0, 0])
print(Point._field_defaults)  # Output: {'y': 0, 'z': 0}
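
To make the consequence concrete, here is a small sketch of what happens when such a type is instantiated: fields with defaults may be omitted, while a field without one is still required. The variable names p, q and error are illustrative only.

```python
from collections import namedtuple

Point = namedtuple("Point", "x y z", defaults=[0, 0])

p = Point(1)        # only x is required; y and z use their defaults
q = Point(1, 2)     # y is given, z falls back to its default

try:
    Point()         # x has no default, so this raises TypeError
    error = None
except TypeError as exc:
    error = str(exc)

print(p)            # Point(x=1, y=0, z=0)
print(q)            # Point(x=1, y=2, z=0)
```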


Counter is very useful when you need to count how often each element occurs in a list or any other iterable:

from collections import Counter
c = Counter("aaabbccdaaa")
print(c)  # Output: Counter({'a': 6, 'b': 2, 'c': 2, 'd': 1})

It is also easy to get the most frequent elements with most_common, for example the two elements with the highest counts:

print(c.most_common(2))  # Output: [('a', 6), ('b', 2)]
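
The same approach works for any hashable items, not only characters. As a sketch, the snippet below counts words in a made-up sentence (the text is an assumption for this example) and shows that looking up an unseen key returns 0 instead of raising KeyError.

```python
from collections import Counter

# Made-up sample text, split on whitespace into words.
text = "the quick brown fox jumps over the lazy dog the fox"
words = Counter(text.split())

print(words.most_common(2))   # [('the', 3), ('fox', 2)]
print(words["fox"])           # 2
print(words["cat"])           # 0, missing keys count as zero
```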

You can also add or subtract the counts of further iterables after the Counter has been created:

c = Counter("abbc") # {"a":1, "b":2, "c":1}
c.update("bccd") # {"a":1, "b":3, "c":3, "d":1}
c.subtract("bbc") # {"a":1, "b":1, "c":2, "d":1}
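
One detail worth knowing is that subtract can leave zero or negative counts behind, whereas the - operator discards them. The sketch below uses made-up stock and sold counters to show the difference.

```python
from collections import Counter

# Hypothetical inventory data for illustration.
stock = Counter(apple=2, pear=1)
sold = Counter(apple=3)

diff = stock - sold      # operator form keeps only positive counts
stock.subtract(sold)     # in-place form keeps negative counts too
positive = +stock        # unary plus strips zero and negative counts

print(dict(diff))        # {'pear': 1}
print(stock["apple"])    # -1
print(dict(positive))    # {'pear': 1}
```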


defaultdict behaves like dict, except that you pass it a factory (such as list) that supplies the default value for any missing key, such as:

from collections import defaultdict
toAdd = [("key1", 3), ("key2", 5), ("key3", 6), ("key2", 7)]
d = defaultdict(list)
for key, val in toAdd:
  d[key].append(val)  # a missing key starts out as an empty list
print(d)  # defaultdict(<class 'list'>, {'key1': [3], 'key2': [5, 7], 'key3': [6]})

If you use dict, you might write:

d = dict()
for key, val in toAdd:
  if key in d:
    d[key].append(val)
  else:
    d[key] = [val]

Or something like this:

d = dict()
for key, val in toAdd:
  d.setdefault(key, []).append(val)

In short, defaultdict is the simplest of the three: it removes the explicit check for a missing key.
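
Other factories work the same way as list. The sketch below uses int, which returns 0, to count characters. It also illustrates a caveat: merely reading a missing key with square brackets inserts it, while .get() does not. The variable names are illustrative only.

```python
from collections import defaultdict

counts = defaultdict(int)      # int() returns 0 for missing keys
for ch in "abbc":
    counts[ch] += 1

print(dict(counts))            # {'a': 1, 'b': 2, 'c': 1}

# Caveat: indexing a missing key calls the factory and stores the result.
before = len(counts)
_ = counts["z"]
after = len(counts)
print(before, after)           # 3 4

# .get() does not call the factory and does not insert anything.
print(counts.get("q"))         # None
```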


This article covered four commonly used data structures from the collections module: deque, namedtuple, Counter and defaultdict.

Keywords: Python data structure

Added by steply on Mon, 10 Jan 2022 08:31:16 +0200