A Detailed Guide to Python's collections Module

When I first learned Python, the built-in dict, list, and tuple seemed like enough. But once the logic gets more complex (efficient insertion at both ends, graceful handling of missing keys, detailed frequency statistics), the native containers start to feel stretched: either performance lags, or the code fills up with emptiness and existence checks.

This is where the collections module in the standard library comes to the rescue. It provides more than seven container classes optimized for specific scenarios, works seamlessly alongside the native containers, and offers many advanced features out of the box. Let's break down the most essential and most frequently used tools.


1. namedtuple: an immutable tuple with named fields

Ordinary tuples have a flaw: every element is accessed by index. Look at the code two weeks later and you probably won't remember whether index 0 was the x coordinate or some other number.

namedtuple exists to restore readability: it creates a subclass of tuple in which every field has an attribute name, while keeping the advantages of tuples such as immutability and memory efficiency.

Basic usage

from collections import namedtuple

# Definition style 1: fields as a list or tuple
Point = namedtuple("Point", ["x", "y"])

# Definition style 2: fields as a space-separated string (more concise)
Student = namedtuple("Student", "name age grade")

# Instantiate like an ordinary tuple, but read values by attribute name
p = Point(1.2, 3.4)
s = Student("Xiao Ming", 10, 4)

print(p.x)      # 1.2
print(s.grade)  # 4

Core Features

  • Fully compatible with tuples: supports indexing, slicing, len(), in, for loops, and other tuple operations
  • Memory efficient: instances take about as much memory as ordinary tuples, and less than instances of a typical lightweight custom class
  • Immutable: field values cannot be reassigned after creation
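
The three features above can be checked directly. The following is a minimal sketch that reuses the Point type from the basic usage example; _replace() and _asdict() are standard namedtuple methods, shown here as the usual way to work with the immutability rather than against it.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1.2, 3.4)

# Tuple behaviour is preserved: unpacking, indexing, len(), isinstance checks.
x, y = p
assert p[0] == x == 1.2
assert len(p) == 2 and isinstance(p, tuple)

# Fields cannot be reassigned; the attempt raises AttributeError.
try:
    p.x = 9.9
except AttributeError:
    immutable = True
else:
    immutable = False

# _replace() returns a new instance with some fields changed;
# the original object is left untouched.
moved = p._replace(x=0.0)

# _asdict() converts the instance to a dictionary (a plain dict on Python 3.8+).
as_dict = p._asdict()

print(immutable)  # True
print(moved)      # Point(x=0.0, y=3.4)
print(p)          # Point(x=1.2, y=3.4)
```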

Advanced Tips

# 1. Set default values for fields (Python 3.7+: use the defaults parameter)
# defaults are applied to the rightmost fields, here y and z
Point = namedtuple("Point", ["x", "y", "z"], defaults=[0, 0])
p1 = Point(5)          # Point(x=5, y=0, z=0)

# 2. Create instances from a sequence or a dictionary
coords = (7, 8, 9)
p2 = Point._make(coords)       # _make() requires an iterable whose length matches the fields

info = {"name": "Xiao Hong", "age": 11, "grade": 5}
s2 = Student(**info)

# 3. Combine with type annotations (Python 3.6+: typing.NamedTuple is recommended)
from typing import NamedTuple

class TypedPoint(NamedTuple):
    x: float
    y: float = 0.0

tp = TypedPoint(2.5)
print(tp)   # TypedPoint(x=2.5, y=0.0)

2. deque: a double-ended queue (the best fit for queues and stacks)

A list can also simulate a queue (append() with pop(0), or insert(0, x) with pop()) or a stack (append() with pop()), but inserting or deleting at the head is O(n), so it becomes noticeably slower as the number of elements grows.

deque (double-ended queue) is designed for fast operations at both ends: append and pop on either the left or the right are O(1), and those individual operations are thread-safe.

Basic usage

from collections import deque

d = deque(["a", "b", "c"])

# Right-side operations (same as list)
d.append("d")
d.pop()            # removes and returns 'd'

# Left-side operations (efficient methods that list does not have)
d.appendleft("z")
d.popleft()        # removes and returns 'z'

print(d)           # deque(['a', 'b', 'c'])

Core Features

  • O(1) at both ends: the first choice for a queue or stack instead of list
  • Thread safety: individual append and pop calls on either end are thread-safe, though compound sequences of operations still need a lock
  • Configurable maximum length: once maxlen is reached, adding an element at one end automatically discards an element from the opposite end
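
To make the O(1) versus O(n) difference concrete, here is a rough timing sketch. The size N and the helper names drain_list and drain_deque are arbitrary choices for illustration, and the absolute numbers will vary by machine; only the relative gap matters.

```python
from collections import deque
from timeit import timeit

N = 20_000  # hypothetical size, chosen only so the difference is visible


def drain_list(n):
    """Consume a list from the head; each pop(0) shifts the remaining items."""
    items = list(range(n))
    out = []
    while items:
        out.append(items.pop(0))
    return out


def drain_deque(n):
    """Consume a deque from the head; each popleft() is O(1)."""
    items = deque(range(n))
    out = []
    while items:
        out.append(items.popleft())
    return out


t_list = timeit(lambda: drain_list(N), number=3)
t_deque = timeit(lambda: drain_deque(N), number=3)
print(f"list.pop(0):     {t_list:.4f}s")
print(f"deque.popleft(): {t_deque:.4f}s")
```

Both helpers return the same sequence; the gap between the two timings should widen as N grows.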

Advanced Tips

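# 1. Limit the length: a natural fit for sliding windows
window = deque(maxlen=3)
window.extend([1, 2, 3])
window.append(4)      # automatically pushes out the leftmost element, 1
print(window)         # deque([2, 3, 4], maxlen=3)

# 2. Rotation
nums = deque([1, 2, 3, 4, 5])
nums.rotate(1)        # rotate right by 1: the last element moves to the front
print(nums)           # deque([5, 1, 2, 3, 4])
nums.rotate(-2)       # rotate left by 2: the first two elements move to the end
print(nums)           # deque([2, 3, 4, 5, 1])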

3. defaultdict: a dictionary with a built-in fallback

With a native dict, the most annoying thing about accessing a nonexistent key is that it raises KeyError immediately. To count word frequencies, for example, you have to write:

if key in d:
    d[key] += 1
else:
    d[key] = 1

defaultdict exists to solve this pain point. It is a subclass of dict that takes a default-value factory function at initialization: when a nonexistent key is accessed with d[key], the factory is called automatically to generate a default value, which is then stored in the dictionary.

Basic usage

from collections import defaultdict

# Scenario 1: frequency counting (int as the factory returns 0 by default)
freq = defaultdict(int)
for char in "hello world":
    freq[char] += 1
print(freq)   # defaultdict(<class 'int'>, {'h': 1, 'e': 1, ..., 'd': 1})

# Scenario 2: group by first letter (list as the factory returns an empty list by default)
groups = defaultdict(list)
words = ["apple", "banana", "apricot", "blueberry"]
for w in words:
    groups[w[0]].append(w)
print(groups)  # defaultdict(<class 'list'>, {'a': ['apple', 'apricot'], ...})

Core Features

  • Eliminates KeyError on d[key] lookups: saves a pile of if-else checks
  • Fully compatible with dict: keys(), values(), and items() work as usual
  • Flexible factory functions: any callable that takes no arguments will do, such as int, list, set, or a function you write yourself
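
One consequence of the behavior described above is worth seeing in code: because the default is stored on access, merely reading a missing key with d[key] inserts it, while get() and in leave the dictionary alone. A small sketch, with made-up key names:

```python
from collections import defaultdict

scores = defaultdict(list)

# Indexing a missing key calls the factory and stores the result.
scores["alice"].append(90)
_ = scores["bob"]            # merely reading creates an empty list for "bob"
print(sorted(scores))        # ['alice', 'bob']

# get() and `in` do not call the factory, so they never insert keys.
print(scores.get("carol"))   # None
print("carol" in scores)     # False

# Any zero-argument callable works as a factory, e.g. a lambda.
status = defaultdict(lambda: "unknown")
print(status["server-1"])    # unknown
```

If a lookup should not create entries, get() or an explicit membership test is the safer choice.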

Advanced Tips

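# Custom factory for easily building nested dictionaries
def nested_factory():
    return defaultdict(int)

# Two-dimensional counting: number of boys and girls in each class
class_gender = defaultdict(nested_factory)
class_gender["Class 1"]["male"] += 1
class_gender["Class 1"]["female"] += 2
class_gender["Class 2"]["male"] += 3

print(class_gender["Class 1"]["male"])    # 1
print(class_gender["Class 3"]["female"])  # 0 (created automatically, no error)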

4. Counter: a dictionary designed for counting

Frequency counting can be done with defaultdict(int), but Counter is purpose-built for the job: it comes with most_common(), multiset arithmetic, and other dedicated features that make the code shorter and more expressive.

Basic usage

from collections import Counter

# Initialize from various iterables
c1 = Counter("gallahad")                     # string
c2 = Counter([1, 2, 2, 3, 3, 3])            # list
c3 = Counter({"a": 3, "b": 1})               # or even a dictionary directly

print(c1)        # Counter({'a': 3, 'l': 2, 'g': 1, ...})
print(c3["c"])   # a missing key returns 0 instead of raising an error

Core Counter-only features

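# 1. Get the N most common elements
print(c2.most_common(2))   # [(3, 3), (2, 2)]

# 2. Multiset operations (+, -, &, |)
c4 = Counter(a=3, b=2, c=1)
c5 = Counter(a=1, b=3, d=1)

print(c4 + c5)   # addition: sum the counts of all keys (only positive counts are kept)
print(c4 - c5)   # subtraction: keep only keys whose count is still greater than 0
print(c4 & c5)   # intersection: minimum count for each key
print(c4 | c5)   # union: maximum count for each key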
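
Working through the four operators by hand on the same c4 and c5 may help. The sketch below prints sorted items so the output does not depend on display order, and adds the standard subtract() method for contrast, since unlike the - operator it keeps zero and negative counts.

```python
from collections import Counter

c4 = Counter(a=3, b=2, c=1)
c5 = Counter(a=1, b=3, d=1)

added = c4 + c5          # a: 3+1, b: 2+3, c: 1+0, d: 0+1
subtracted = c4 - c5     # a: 3-1; b: 2-3 = -1 (dropped); c: 1; d: 0-1 = -1 (dropped)
intersection = c4 & c5   # a: min(3, 1), b: min(2, 3); c and d have a min of 0 (dropped)
union = c4 | c5          # a: max(3, 1), b: max(2, 3), c: 1, d: 1

print(sorted(added.items()))         # [('a', 4), ('b', 5), ('c', 1), ('d', 1)]
print(sorted(subtracted.items()))    # [('a', 2), ('c', 1)]
print(sorted(intersection.items()))  # [('a', 1), ('b', 2)]
print(sorted(union.items()))         # [('a', 3), ('b', 3), ('c', 1), ('d', 1)]

# subtract() works in place and keeps zero and negative counts.
kept = Counter(c4)
kept.subtract(c5)
print(sorted(kept.items()))          # [('a', 2), ('b', -1), ('c', 1), ('d', -1)]
```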

5. OrderedDict: a dictionary with finer control over order

Since Python 3.7, a plain dict is officially guaranteed to preserve the insertion order of its keys. So what is OrderedDict still good for?

It has two main abilities that a plain dict lacks:

  1. Flexible reordering, such as move_to_end(), which moves a given key to either the beginning or the end
  2. Order-sensitive equality: two OrderedDict objects are equal only when their keys, values, and order all match

Commonly used OrderedDict-only features

from collections import OrderedDict

od = OrderedDict.fromkeys("abcde")   # keys: a, b, c, d, e; all values are None

# Move elements
od.move_to_end("b")                  # move b to the end
print(list(od.keys()))               # ['a', 'c', 'd', 'e', 'b']
od.move_to_end("d", last=False)      # move d to the front
print(list(od.keys()))               # ['d', 'a', 'c', 'e', 'b']

# Order-sensitive equality
d1 = {"a":1, "b":2}
d2 = {"b":2, "a":1}
print(d1 == d2)                      # plain dict: True

od1 = OrderedDict([("a",1), ("b",2)])
od2 = OrderedDict([("b",2), ("a",1)])
print(od1 == od2)                    # OrderedDict: False
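
The reordering methods are what make OrderedDict a common building block for an LRU cache. The following is a minimal sketch under stated assumptions: the LRUCache class and its method names are hypothetical, it is not thread-safe, and it relies on move_to_end() plus the standard popitem(last=False), which removes the oldest entry.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical minimal LRU cache; illustration only, not thread-safe."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)          # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)          # new or updated keys are the freshest
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict the least recently used entry

    def keys(self):
        return list(self._data)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" becomes the most recently used key
cache.put("c", 3)     # over capacity, so "b" is evicted
print(cache.keys())   # ['a', 'c']
```

For caching function results, the standard library's functools.lru_cache is usually the simpler option; a hand-rolled class like this is mainly useful when the cache has to be inspected or manipulated directly.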

6. Other utility classes

UserDict / UserList / UserString

These are wrapper classes around the built-in containers. They do not inherit directly from dict, list, or str, but they are used in almost exactly the same way. Why are they recommended? Because inheriting directly from a built-in container can lead to pitfalls (for example, some built-in methods do not call a __setitem__ that we override), whereas inheriting from the User* classes avoids this class of problem and makes custom behavior more reliable.
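
The __setitem__ pitfall is easy to reproduce. In this sketch, UpperDict and UpperUserDict are hypothetical classes that apply the same override (upper-casing keys) to dict and to UserDict; the behavior shown is that of CPython.

```python
from collections import UserDict


class UpperDict(dict):
    """Subclasses the built-in dict directly."""

    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


class UpperUserDict(UserDict):
    """Same override, but built on UserDict."""

    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


plain = UpperDict(a=1)   # dict's constructor bypasses the override
plain.update(b=2)        # so does dict.update()
plain["c"] = 3           # only direct assignment goes through __setitem__
print(plain)             # {'a': 1, 'b': 2, 'C': 3}

wrapped = UpperUserDict(a=1)
wrapped.update(b=2)
wrapped["c"] = 3
print(wrapped)           # {'A': 1, 'B': 2, 'C': 3}
```

UserDict routes its constructor and update() through __setitem__, so a single override covers every write path.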

A custom dictionary with a "missing key hint" shows how simple this is:

from collections import UserDict

class HintDict(UserDict):
    # Called by UserDict when a key is not found
    def __missing__(self, key):
        return f"💡 Hint: key '{key}' not found"

hd = HintDict({"name": "Xiao Ming"})
print(hd["name"])   # Xiao Ming
print(hd["age"])    # 💡 Hint: key 'age' not found

Summary

Here are the core members of collections in one table, for quick selection by scenario:

| Class name | Core function | Typical scenarios |
| --- | --- | --- |
| namedtuple | Immutable tuple with attribute names | Lightweight immutable objects (coordinates, configuration items) |
| deque | Double-ended queue with O(1) operations at both ends | Queues, stacks, sliding windows |
| defaultdict | Dictionary with default values | Frequency counting, grouping, nested dictionaries |
| Counter | Dictionary specialized for counting | Word frequency statistics, multiset operations |
| OrderedDict | Dictionary supporting fine-grained ordering operations | LRU caches, order-sensitive configuration |
| ChainMap | Combined view over multiple dictionaries (saves memory) | Configuration priority management (environment variables > user configuration > defaults) |
| UserDict / UserList / UserString | Wrappers for safely customizing containers | Developing custom dictionary, list, or string classes |
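
ChainMap appears in the table but has no example of its own, so here is a brief sketch of the configuration-priority scenario. The three dictionaries and their keys are made up for illustration; lookups search the mappings from left to right and return the first match.

```python
from collections import ChainMap

# Hypothetical configuration layers, listed from highest to lowest priority.
env_vars = {"debug": "true"}
user_config = {"debug": "false", "theme": "dark"}
defaults = {"debug": "false", "theme": "light", "port": "8080"}

config = ChainMap(env_vars, user_config, defaults)

print(config["debug"])  # true  (found in env_vars)
print(config["theme"])  # dark  (found in user_config)
print(config["port"])   # 8080  (falls through to defaults)

# ChainMap is a view, not a copy: changes to an underlying dict show up at once.
user_config["port"] = "9000"
print(config["port"])   # 9000

# Writes through the ChainMap go to the first mapping only.
config["theme"] = "solarized"
print(env_vars)         # {'debug': 'true', 'theme': 'solarized'}
```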

These tools can noticeably improve the readability and performance of your code. Next time you run into a similar need, don't rush to reinvent the wheel: dig around in collections first, and chances are a ready-made solution is waiting for you there.