Python binary data processing tutorial: Detailed explanation of struct module
When dealing with scenarios such as network protocol parsing, embedded device communication, and binary file reading, Python’s built-inbytes / bytearrayAlthough it can store raw bytes, operating directly on hexadecimal characters is boring and error-prone. At this time,structThe module is your Swiss Army Knife - it uses "format strings" that humans can understand at a glance to easily achieve bidirectional conversion between Python native types and binary data.
1. First understand: binary storage in Python
There are two main types in Python that specifically store binary data:
bytes: immutable byte sequence, passedb''Literal creationbytearray: Variable byte sequence, one of which can be modified
After understanding these two types, let’s look atstructHow to convert them to and from Python normal types.
2. struct module core
2.1 The two most commonly used functions
structThe usage is very concentrated. Remembering the following two core functions is enough to handle 90% of the scenarios:
Don’t forget to import it before using:
2.2 Core Rules: Format String
The format string isstructThe baton that determines how the data is interpreted. A complete format string consists of optional byte order + optional alignment and a required type specifier.
Byte order characters - strongly recommended never to omit!
Omitting byte order makesstructUse the native rules of the current running platform (@), which may cause your program to have inconsistent results across Windows, macOS, and Linux. It is recommended to write down the fixed order directly.
Common data type specifiers
The following types cover most application scenarios, and remembering them will help you cope with daily tasks:
3. Try it out: basic practical operation
3.1 Packing and unpacking of single integers
Let’s first demonstrate using big endian order, which is common in network protocols and many binary file formats:
It's that simple: a format string turns a human-readable number into a compact binary, and can also be changed back.
3.2 Mixing multiple data types
packandunpackSupports combining multiple type specifiers in order in a format string, corresponding to parameters one-to-one:
4. Practical practice: parsing BMP file headers
Just talk without practicing the tricks. Below we usestructWrite a utility function to extract width, height, bit depth and other information from the binary header of BMP images.
The BMP file header is in little endian order (<) storage, which is a common byte order under Windows. The first 30 bytes contain key information such as image size.
After running, you can see the width, height and bit depth of the output, and the entire parsing process is clear and intuitive.
5. Pitfall avoidance guide and best practices
- Be sure to specify the byte order
Never use the default
@, otherwise data interpretation will be messed up across platforms. - Processing
struct.error
When unpacking, if the supplied byte length does not match the expected format string, an exception will be thrown. For cultivationtry/exceptWrapping Habit. - Big data cooperation
memoryview
When dealing with binary buffers larger than tens of MB, usememoryviewSlicing can avoid memory copying and improve performance. - Check official documentation
For less common formats such as Pascal string
p、?Boolean, etc.), refer directly to Python struct 官方文档.
6. What else can be done?
Got itstruct, the following tasks will become easier:
- Parse TCP/UDP network headers (IPv4, DNS, etc.)
- Read the binary format of older versions of Excel
- Interact with dynamic libraries written in C/C++ and pass binary structures
- Write communication protocols with embedded devices such as Arduino and Raspberry Pi
If you encounter a more complex nested binary structure, you can also usestruct.calcsizeCalculate the number of bytes occupied by the format string and perform step-by-step analysis; or use a third-party library such asconstructto describe higher-level protocols.
Through this tutorial, you can already use it skillfullystructHandles common binary data. Next time we meet densely packed people\xBytes, it is better to write a line of format string directly, so thatstructHelp you get everything done!

