Python itertools module tutorial

Introduction

When a task involves splicing sequences, generating permutations and combinations, or grouping and filtering data, many Python developers reach first for nested loops, temporary lists, and tangled conditionals. The result is often complicated code, high memory usage, and mediocre readability.

The Python standard library has long shipped a tool built for exactly this: itertools. It provides a set of utility functions that return iterators and are designed to process iterables efficiently. The tools can be used on their own or combined like building blocks, which helps you write concise code without holding large intermediate results in memory.

This article covers the three most commonly used categories of tools so that you can master the core usage of itertools step by step.


Infinite iterators

itertools has three built-in iterators that can generate data indefinitely. They have no default termination point, so you must add your own stopping condition when using them; otherwise the loop never ends.

1. count() - infinite arithmetic sequence generator

count(start=0, step=1) starts at start and yields values spaced step apart, without end. The step may be negative or a floating-point number. It is suitable for generating natural numbers, odd numbers, timestamps, and other regular sequences.

import itertools

# Infinite sequence of odd numbers starting at 3 with step 2 (take the first 10)
odds = itertools.count(3, 2)
for idx, num in enumerate(odds):
    print(num, end=" ")
    if idx == 9:
        break
# Output: 3 5 7 9 11 13 15 17 19 21
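
As a small complementary sketch, the snippet below pairs count() with zip() to number items and uses a negative step for a countdown. The names tasks, numbered, and countdown are made up for illustration, and islice() (covered later in this article) is used only to cap the infinite sequence.

```python
import itertools

# Hypothetical task names, used only for illustration.
tasks = ["parse", "validate", "save"]

# zip() stops at the shortest input, so the infinite count() is safe here.
numbered = list(zip(itertools.count(1), tasks))
print(numbered)   # [(1, 'parse'), (2, 'validate'), (3, 'save')]

# A negative step counts down; islice() caps the infinite sequence at 5 items.
countdown = list(itertools.islice(itertools.count(10, -2), 5))
print(countdown)  # [10, 8, 6, 4, 2]
```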

2. cycle() - infinite sequence repeater

cycle(iterable) repeats the elements of an iterable (string, list, tuple, etc.) indefinitely. It is convenient for carousels and other round-robin patterns. Because it keeps a copy of the elements internally, the input should be finite and reasonably small.

import itertools

# Simulate a traffic-light cycle: red 3 s, green 2 s, yellow 1 s; stop once 20 s have elapsed
lights = itertools.cycle([("red", 3), ("green", 2), ("yellow", 1)])
total = 0
for color, duration in lights:
    print(f"{color} for {duration} s")
    total += duration
    if total >= 20:
        break
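
Another common pattern is round-robin assignment. The sketch below is only illustrative: the worker and job names are hypothetical, and zip() provides the stopping condition because it ends when the finite list is exhausted.

```python
import itertools

# Hypothetical worker and job names, used only for illustration.
workers = ["w1", "w2", "w3"]
jobs = ["job-a", "job-b", "job-c", "job-d", "job-e"]

# zip() ends when the finite jobs list is exhausted, so cycle() never runs away.
assignment = list(zip(jobs, itertools.cycle(workers)))
print(assignment)
# [('job-a', 'w1'), ('job-b', 'w2'), ('job-c', 'w3'), ('job-d', 'w1'), ('job-e', 'w2')]
```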

3. repeat() - single element repeater

repeat(object[, times]) yields the same object over and over. By default it repeats forever; when times is given, it becomes a finite iterator. It is commonly used to build placeholder lists and to feed test input to batch-processing functions.

import itertools

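# Initialise a list of 100 placeholders.
# Note: every entry refers to the same dict object; use [{} for _ in range(100)]
# if each slot needs its own dict.
placeholders = list(itertools.repeat({}, 100))

# Print test (without times, repeat() is infinite, so break manually)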
for i, msg in enumerate(itertools.repeat("hello")):
    print(msg)
    if i == 2:
        break
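
Two points about repeat() may be easier to see in code: it can supply a constant argument to map(), and it yields the very same object every time. The following is a minimal sketch; the variable names are chosen for illustration.

```python
import itertools

# repeat() supplies the constant exponent 2 for every call to pow().
squares = list(map(pow, range(5), itertools.repeat(2)))
print(squares)      # [0, 1, 4, 9, 16]

# repeat() yields the same object each time, which matters for mutable values.
shared = list(itertools.repeat([], 3))
shared[0].append("x")
print(shared)       # [['x'], ['x'], ['x']]

# A list comprehension creates independent objects instead.
independent = [[] for _ in range(3)]
independent[0].append("x")
print(independent)  # [['x'], [], []]
```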

Finite iterators

The next group of tools processes existing iterables: slicing, splicing, filtering, and grouping. They all stop automatically when a condition is met or the input is exhausted.

1. islice() - safe iterator slicing

islice(iterable, stop) or islice(iterable, start, stop[, step]) works like list slicing, but it operates directly on an iterator and does not copy the data. Unlike list slicing, it does not accept negative values for start, stop, or step. It is well suited to taking the first N items of a large data stream, and acts as a safety guardrail on an infinite sequence.

import itertools

# Take the elements at indices 5 through 9 (zero-based) from the infinite natural numbers
nums = itertools.count(1)
slice_nums = itertools.islice(nums, 5, 10)
print(list(slice_nums))   # Output: [6, 7, 8, 9, 10]

# Take the first 5 elements
first_five = list(itertools.islice(itertools.count(100, 5), 5))
print(first_five)        # Output: [100, 105, 110, 115, 120]
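
The step argument and the fact that islice() consumes the underlying iterator are worth a quick illustration. This is a small sketch with made-up variable names.

```python
import itertools

# start=1, stop=8, step=3 selects the elements at indices 1, 4 and 7.
picked = list(itertools.islice("ABCDEFGHIJ", 1, 8, 3))
print(picked)  # ['B', 'E', 'H']

# islice() advances the iterator it reads from; no copy is made.
stream = iter(range(10))
head = list(itertools.islice(stream, 3))
tail = list(stream)
print(head)    # [0, 1, 2]
print(tail)    # [3, 4, 5, 6, 7, 8, 9]
```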

2. takewhile() - conditional prefix filter

takewhile(predicate, iterable) yields elements from the start of the iterable until the predicate returns False for the first time. This differs from the built-in filter(): filter() scans the whole input and keeps every element that satisfies the condition, whereas takewhile() stops for good at the first element that does not.

import itertools

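# Take all even numbers ≤ 8 from the natural numbers.
# The predicate must hold from the first element on, so start from the even
# numbers; testing evenness inside takewhile() on 1, 2, 3, ... would stop at 1.
evens = itertools.count(2, 2)
even_le8 = itertools.takewhile(lambda x: x <= 8, evens)
print(list(even_le8))   # Output: [2, 4, 6, 8]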
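
To make the contrast with filter() concrete, here is a minimal side-by-side sketch on the same finite list. The data and the helper is_small are invented for the example.

```python
import itertools

data = [1, 3, 5, 8, 2, 7, 1]

def is_small(x):
    """Hypothetical predicate used only for this comparison."""
    return x < 6

# filter() scans the whole input and keeps every match.
kept = list(filter(is_small, data))
print(kept)    # [1, 3, 5, 2, 1]

# takewhile() stops for good at the first failure (8 here).
prefix = list(itertools.takewhile(is_small, data))
print(prefix)  # [1, 3, 5]
```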

3. chain() - seamless concatenation of multiple iterables

chain(*iterables) joins several iterables, whatever their types, into a single iterator that yields their elements in order. No intermediate list is created, which saves memory.

import itertools

# Concatenate a string, a list, and a range object
combined = itertools.chain("AB", ["C", "D"], range(5, 8))
print(list(combined))   # Output: ['A', 'B', 'C', 'D', 5, 6, 7]
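
When the iterables are already collected in one container, the alternative constructor chain.from_iterable() avoids unpacking them with *. A short sketch with illustrative data:

```python
import itertools

# A nested list to flatten by one level (illustrative data).
nested = [[1, 2], [3], [], [4, 5, 6]]

# from_iterable() takes a single iterable of iterables and chains them lazily.
flat = list(itertools.chain.from_iterable(nested))
print(flat)  # [1, 2, 3, 4, 5, 6]
```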

4. groupby() - adjacent element grouper

groupby(iterable, key=None) collects adjacent elements with the same key value into one group and returns an iterator of (key, group iterator) pairs. Pay special attention to the word "adjacent": elements with the same key that are not consecutive end up in different groups. You usually need to sort the data by the same key first and then group it.

import itertools

# Basic usage: group adjacent identical elements
text = "AAABBBCCAAA"
for key, group in itertools.groupby(text):
    print(key, list(group))
# A ['A', 'A', 'A']
# B ['B', 'B', 'B']
# C ['C', 'C']
# A ['A', 'A', 'A']

# Advanced usage: case-insensitive grouping (sort first)
unsorted = "AaaBBbcCAAa"
sorted_chars = sorted(unsorted, key=lambda c: c.upper())
for key, group in itertools.groupby(sorted_chars, key=lambda c: c.upper()):
    print(key, list(group))
# A ['A', 'a', 'a', 'A', 'A', 'a']
# B ['B', 'B', 'b']
# C ['c', 'C']
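
The same sort-then-group pattern applies to records. The sketch below sums amounts per customer; the orders data and field layout are hypothetical and exist only to illustrate the technique.

```python
import itertools
from operator import itemgetter

# Hypothetical (customer, amount) records, deliberately unsorted.
orders = [("alice", 30), ("bob", 15), ("alice", 20), ("carol", 5), ("bob", 25)]

# Sort by the same key that groupby() will use, so equal keys become adjacent.
by_customer = sorted(orders, key=itemgetter(0))

# Each group is a one-shot iterator, so it is consumed right away inside sum().
totals = {
    name: sum(amount for _, amount in group)
    for name, group in itertools.groupby(by_customer, key=itemgetter(0))
}
print(totals)  # {'alice': 50, 'bob': 40, 'carol': 5}
```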

Combinatoric iterators

The combinatoric iterators are among the most practical parts of itertools. They generate permutations, combinations, Cartesian products, and similar results without hand-written recursion or nested loops.

1. product() - Cartesian product generator

product(*iterables, repeat=1) computes the Cartesian product of the input iterables. If you have a single iterable and want its product with itself several times, pass the repeat parameter as a shorthand.

import itertools

# Cartesian product of colors and sizes
colors = ["red", "blue"]
sizes = ["S", "M", "L"]
variants = list(itertools.product(colors, sizes))
print(variants)
# [('red', 'S'), ('red', 'M'), ('red', 'L'), ('blue', 'S'), ('blue', 'M'), ('blue', 'L')]

# All possible outcomes of rolling a die 3 times (1..6 multiplied with itself 3 times)
dice = list(itertools.product(range(1, 7), repeat=3))
print(dice[:5])   # Print the first 5 outcomes
# [(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4), (1, 1, 5)]
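
One way to read product() is as a flattened nested loop. The following sketch checks that equivalence on small, made-up inputs and shows repeat as a shorthand for passing the same iterable several times.

```python
import itertools

colors = ["red", "blue"]
sizes = ["S", "M", "L"]

# The rightmost iterable varies fastest, exactly like the innermost loop.
nested = [(c, s) for c in colors for s in sizes]
same = nested == list(itertools.product(colors, sizes))
print(same)  # True

# repeat=2 is shorthand for product([0, 1], [0, 1]).
bits = list(itertools.product([0, 1], repeat=2))
print(bits)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```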

2. permutations() - permutation generator without repeated elements

permutations(iterable, r=None) generates all permutations of length r from the iterable; each element is used at most once, and different orderings count as different results. If r is not specified, it defaults to the length of the iterable, giving full-length permutations.

import itertools

# All permutations of 2 elements taken from "ABC"
per = list(itertools.permutations("ABC", 2))
print(per)
# [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]

3. combinations() - combination generator without repeated elements

combinations(iterable, r) generates all combinations of length r. Different orderings count as the same combination, and each element is used at most once. The r parameter is required.

import itertools

# All combinations of 2 elements taken from "ABC"
comb = list(itertools.combinations("ABC", 2))
print(comb)
# [('A', 'B'), ('A', 'C'), ('B', 'C')]
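
As a sanity check, the number of results can be compared against the closed-form counts. This sketch assumes Python 3.8 or later for math.perm() and math.comb(). It also shows that elements are treated as distinct by position, not by value.

```python
import itertools
import math

items = "ABCDE"
r = 3

perms = list(itertools.permutations(items, r))
combs = list(itertools.combinations(items, r))

# n! / (n - r)! ordered selections, n! / (r! * (n - r)!) unordered ones.
print(len(perms), math.perm(len(items), r))  # 60 60
print(len(combs), math.comb(len(items), r))  # 10 10

# Equal values at different positions are still treated as different elements.
dupes = list(itertools.combinations("AAB", 2))
print(dupes)  # [('A', 'A'), ('A', 'B'), ('A', 'B')]
```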

Practical case: approximating pi with an infinite series

The Leibniz series is a classic way to approximate π: the sum 1 - 1/3 + 1/5 - 1/7 + 1/9 - … gradually approaches π/4.

Next we use the tools in itertools to compute this sum in a memory-friendly way.

import itertools

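def approximate_pi(N):
    """
    Approximate π using the Leibniz series.
    :param N: number of leading terms of the series to sum
    :return: approximate value of π
    """
    # 1. Generate the infinite sequence of odd numbers with count()
    odds = itertools.count(1, 2)

    # 2. Take the first N of them with islice()
    first_N_odds = itertools.islice(odds, N)

    # 3. Build each term: the sign alternates with the index parity, the magnitude is 4 / odd
    terms = (4 / odd * (-1)**idx for idx, odd in enumerate(first_N_odds))

    # 4. Sum the terms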
    return sum(terms)

# Accuracy for different numbers of terms
print(f"First 10 terms: {approximate_pi(10):.10f}")    # about 3.0418396189
print(f"First 100 terms: {approximate_pi(100):.10f}")  # about 3.1315929036
print(f"First 10000 terms: {approximate_pi(10000):.10f}") # about 3.1414926536
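
The same sum can also be assembled from more of the building blocks introduced earlier. The variant below is only one possible sketch, not a better implementation: the function name approximate_pi_cycle is made up, and cycle() replaces the (-1)**idx sign computation.

```python
import itertools

def approximate_pi_cycle(n):
    """Sum the first n Leibniz terms, taking the signs from cycle()."""
    signs = itertools.cycle([1, -1])   # +, -, +, -, ...
    odds = itertools.count(1, 2)       # 1, 3, 5, 7, ...
    # zip() of two infinite iterators is itself infinite, but lazy.
    terms = (sign * 4 / odd for sign, odd in zip(signs, odds))
    # islice() is the stopping condition: only n terms are ever computed.
    return sum(itertools.islice(terms, n))

print(f"{approximate_pi_cycle(10):.10f}")  # 3.0418396189
```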

This code never builds a large list: the odd numbers are generated on demand, the whole calculation is memory-friendly, and the logic stays clear and readable. That is the appeal of itertools.


Performance considerations

The efficiency of itertools comes mainly from two points:

  1. The functions return iterators instead of computing all results at once, so memory usage stays very low even when processing millions of items or more.
  2. In CPython the functions are implemented in C, so they are usually faster than an equivalent pure-Python loop.

In practice, you can combine these iterators with generator expressions to avoid intermediate temporary lists and keep the whole pipeline lazy.
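
A rough way to see the memory difference is to compare the size of a materialised list with that of a lazy generator. The sketch below uses sys.getsizeof(), which reports only the container object itself, so treat the numbers as an indication rather than a benchmark. The variable names are illustrative.

```python
import itertools
import sys

# A fully materialised list of 100,000 squares.
as_list = [x * x for x in range(100_000)]
list_size = sys.getsizeof(as_list)

# A lazy pipeline: even numbers -> squares; nothing is computed yet.
evens = (x for x in itertools.count() if x % 2 == 0)
squares = (x * x for x in evens)
gen_size = sys.getsizeof(squares)

print(list_size > gen_size)  # True

# Values are produced only when requested, here the first five.
first_squares = list(itertools.islice(squares, 5))
print(first_squares)         # [0, 4, 16, 36, 64]
```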


Summary

itertools is Python's "Swiss Army knife" for processing iterables. It covers three main areas:

  • Generating infinite regular sequences: count, cycle, repeat
  • Slicing, splicing, filtering, and grouping existing sequences: islice, takewhile, chain, groupby
  • Generating combinatorial results such as permutations, combinations, and Cartesian products: product, permutations, combinations

Once you are familiar with these tools and can combine them flexibly to fit your needs, your code becomes simpler and more efficient, and closer to a truly Pythonic style.