Pydantic: A Powerful Tool for Data Validation

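1. Introduction to Pydantic

Pydantic is a data validation library for Python that uses Python type hints to validate and serialize data. It is a core dependency of FastAPI.

Core features

Feature             Description
Data validation     Automatically validates input data types and constraints
Serialization       Converts models to and from JSON and dict
Schema generation   Generates JSON Schema, which FastAPI uses to build its OpenAPI docs
Editor support      Full IDE autocompletion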

2. Defining Basic Models

A simple model

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str
    age: int | None = None  # optional field (the `X | None` syntax needs Python 3.10+)

# Create an instance
user = User(id=1, name="张三", email="zhangsan@example.com")
print(user)
# id=1 name='张三' email='zhangsan@example.com' age=None

# Convert to a dict
print(user.model_dump())
# {'id': 1, 'name': '张三', 'email': 'zhangsan@example.com', 'age': None}

# Convert to JSON
print(user.model_dump_json())
# {"id":1,"name":"张三","email":"zhangsan@example.com","age":null}

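Automatic type coercion

By default (lax mode), Pydantic tries to coerce input values to the declared type:

user = User(id="1", name="张三", email="test@example.com")
# id is coerced from the string "1" to the integer 1 ✅

user = User(id="abc", name="张三", email="test@example.com")
# Raises ValidationError: "abc" cannot be parsed as an integer ❌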

3. Field Validation

Adding constraints with Field

from pydantic import BaseModel, Field

class Product(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    price: float = Field(gt=0, description="Price must be greater than 0")
    quantity: int = Field(ge=0, default=0)
    
    model_config = {
        "json_schema_extra": {
            "examples": [{
                "name": "iPhone 15",
                "price": 5999.0,
                "quantity": 100
            }]
        }
    }

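Common Field parameters

Parameter                 Description
default                   Default value
default_factory           Factory function that produces the default value
gt / ge                   Greater than / greater than or equal to
lt / le                   Less than / less than or equal to
min_length / max_length   String length limits
pattern                   Regular expression the string must match
description               Field description
title                     Field title
examples                  Example values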
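
To see several of these parameters at work, here is a small sketch based on the Product model above. The tags field is an assumption added only to illustrate default_factory; it is not part of the original model.

```python
from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    price: float = Field(gt=0, description="Price must be greater than 0")
    quantity: int = Field(default=0, ge=0)
    # Hypothetical field: default_factory is called for every new instance,
    # so two products never share the same list object
    tags: list[str] = Field(default_factory=list)


ok = Product(name="iPhone 15", price=5999.0)
other = Product(name="iPad", price=3999.0)

# Break three constraints at once; Pydantic reports all of them together
try:
    Product(name="", price=-1, quantity=-5)
except ValidationError as exc:
    failed = {err["loc"][0]: err["type"] for err in exc.errors()}

# The description ends up in the generated JSON Schema
price_schema = Product.model_json_schema()["properties"]["price"]
```

Pydantic collects every failing field in one ValidationError rather than stopping at the first, which is convenient when reporting form errors back to a client.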

String validation

from pydantic import BaseModel, Field

class User(BaseModel):
    username: str = Field(
        min_length=3,
        max_length=20,
        pattern=r"^[a-zA-Z0-9_]+$",  # letters, digits, and underscores only
        description="Username, 3-20 characters"
    )
    email: str = Field(
        pattern=r"^[\w.-]+@[\w.-]+\.\w+$",  # simplified check, not full RFC validation
        description="A valid email address"
    )
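
A quick way to confirm what these patterns accept is to feed the model a few inputs and look at the error types. This is a sketch only; error_types is a hypothetical helper introduced for the example. For stricter email checking, Pydantic also provides the EmailStr type, which requires the optional email-validator package and is not used here.

```python
from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    username: str = Field(
        min_length=3,
        max_length=20,
        pattern=r"^[a-zA-Z0-9_]+$",
    )
    email: str = Field(pattern=r"^[\w.-]+@[\w.-]+\.\w+$")


def error_types(**data) -> list[str]:
    """Hypothetical helper: list the error types raised for the given input."""
    try:
        User(**data)
    except ValidationError as exc:
        return [err["type"] for err in exc.errors()]
    return []


good = error_types(username="zhang_san", email="zhangsan@example.com")
too_short = error_types(username="ab", email="zhangsan@example.com")
bad_chars = error_types(username="bad name!", email="zhangsan@example.com")
bad_email = error_types(username="zhang_san", email="not-an-email")
```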

Numeric validation

class Order(BaseModel):
    amount: int = Field(gt=0, le=1000)      # 1 to 1000
    price: float = Field(ge=0.01, le=99999.99)  # allowed price range
    discount: float = Field(ge=0, le=1, default=0)  # discount rate between 0 and 1

4. Nested Models

Nesting a model

from pydantic import BaseModel

class Address(BaseModel):
    province: str
    city: str
    street: str
    zip_code: str

class User(BaseModel):
    name: str
    age: int
    address: Address  # nested model

# Create an instance; the nested dict is validated into an Address
user = User(
    name="张三",
    age=25,
    address={
        "province": "广东",
        "city": "深圳",
        "street": "科技园路",
        "zip_code": "518000"
    }
)
print(user.address.city)  # 深圳
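
Validation applies to the nested model as well, and the error location records the path to the failing field. The following sketch, which repeats the models above so that it runs on its own, leaves out two address fields on purpose.

```python
from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    province: str
    city: str
    street: str
    zip_code: str


class User(BaseModel):
    name: str
    age: int
    address: Address


user = User(
    name="张三",
    age=25,
    address={"province": "广东", "city": "深圳", "street": "科技园路", "zip_code": "518000"},
)
dumped = user.model_dump()  # nested models are exported as nested dicts

# street and zip_code are missing from the nested dict
try:
    User(name="张三", age=25, address={"province": "广东", "city": "深圳"})
except ValidationError as exc:
    missing = [err["loc"] for err in exc.errors()]
```

Here each location is a tuple such as ("address", "street"), so a caller can tell exactly which nested field needs attention.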

Nesting in a list

class Item(BaseModel):
    name: str
    price: float

class Order(BaseModel):
    order_id: str
    items: list[Item]  # list of Item models
    total: float

order = Order(
    order_id="ORD-001",
    items=[
        {"name": "iPhone", "price": 5999},
        {"name": "iPad", "price": 3999}
    ],
    total=9998
)
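
Each dict in the list is converted into an Item instance, and when one element is invalid the error location includes its index. A minimal sketch, repeating the models so that it is self-contained:

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    name: str
    price: float


class Order(BaseModel):
    order_id: str
    items: list[Item]
    total: float


order = Order(
    order_id="ORD-001",
    items=[{"name": "iPhone", "price": 5999}, {"name": "iPad", "price": 3999}],
    total=9998,
)
# The dicts are now Item objects, and the int prices were coerced to float
computed_total = sum(item.price for item in order.items)

# The second item has a price that cannot be parsed as a number
try:
    Order(
        order_id="ORD-002",
        items=[{"name": "iPhone", "price": 5999}, {"name": "iPad", "price": "free"}],
        total=5999,
    )
except ValidationError as exc:
    bad_loc = exc.errors()[0]["loc"]
```

Note that Pydantic does not check total against the item prices by itself; a cross-field rule like that would need a model-level validator.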

Nesting in a dict

class Config(BaseModel):
    settings: dict[str, str | int | bool]
    
config = Config(settings={"debug": True, "port": 8000, "host": "localhost"})

5. Custom Validators

field_validator (field-level validator)

from pydantic import BaseModel, field_validator

class User(BaseModel):
    name: str
    password: str
    
    @field_validator("name")
    @classmethod
    def name_must_not_contain_spaces(cls, v: str) -> str:
        if " " in v:
            raise ValueError("Username must not contain spaces")
        return v.title()  # capitalize the first letter
    
    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isdigit() for c in v):
            raise ValueError("Password must contain a digit")
        return v
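
A field validator can both reject a value and transform it. The sketch below exercises the two validators from this section; the sample inputs are made up for illustration.

```python
from pydantic import BaseModel, ValidationError, field_validator


class User(BaseModel):
    name: str
    password: str

    @field_validator("name")
    @classmethod
    def name_must_not_contain_spaces(cls, v: str) -> str:
        if " " in v:
            raise ValueError("Username must not contain spaces")
        return v.title()

    @field_validator("password")
    @classmethod
    def password_strength(cls, v: str) -> str:
        if len(v) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(c.isdigit() for c in v):
            raise ValueError("Password must contain a digit")
        return v


# The returned value replaces the input, so the stored name is title-cased
user = User(name="zhangsan", password="password123")

# Both fields are invalid; each validator contributes its own error
try:
    User(name="zhang san", password="password")
except ValidationError as exc:
    messages = {err["loc"][0]: err["msg"] for err in exc.errors()}
```

The text passed to ValueError is included in the reported message, so it is worth writing it for the end user rather than for the developer.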

model_validator (model-level validator)

Use it to validate relationships between multiple fields:

from datetime import datetime

from pydantic import BaseModel, model_validator

class Event(BaseModel):
    start_time: datetime
    end_time: datetime
    
    @model_validator(mode="after")
    def check_time_order(self) -> "Event":
        if self.end_time <= self.start_time:
            raise ValueError("end_time must be later than start_time")
        return self
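
Because the validator runs in "after" mode, the individual fields have already been parsed into datetime objects by the time the comparison happens. A short sketch with made-up timestamps:

```python
from datetime import datetime

from pydantic import BaseModel, ValidationError, model_validator


class Event(BaseModel):
    start_time: datetime
    end_time: datetime

    @model_validator(mode="after")
    def check_time_order(self) -> "Event":
        if self.end_time <= self.start_time:
            raise ValueError("end_time must be later than start_time")
        return self


# ISO 8601 strings are parsed into datetime before the model validator runs
event = Event(start_time="2024-01-01T09:00:00", end_time="2024-01-01T10:30:00")
duration = event.end_time - event.start_time

# The end is earlier than the start, so the model-level check fails
try:
    Event(start_time=datetime(2024, 1, 1, 10), end_time=datetime(2024, 1, 1, 9))
except ValidationError as exc:
    error = exc.errors()[0]
```

An error raised by a model-level validator is not tied to a single field, so its "loc" is an empty tuple.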

6. Using Pydantic with FastAPI

Defining a request model

from fastapi import FastAPI
from pydantic import BaseModel, Field

app = FastAPI()

class UserCreate(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    email: str = Field(pattern=r"^[\w.-]+@[\w.-]+\.\w+$")
    password: str = Field(min_length=8)
    
    model_config = {
        "json_schema_extra": {
            "examples": [{
                "username": "zhangsan",
                "email": "zhangsan@example.com",
                "password": "password123"
            }]
        }
    }

@app.post("/users/", response_model=UserCreate)
async def create_user(user: UserCreate):
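    # `user` has already been validated at this point
    # Note: returning UserCreate as-is also echoes the password back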
    return user

Defining a response model

from datetime import datetime

class UserResponse(BaseModel):
    id: int
    username: str
    email: str
    created_at: datetime
    # the password field is deliberately left out

@app.post("/users/", response_model=UserResponse)
async def create_user(user: UserCreate):
    # FastAPI filters the return value down to the fields of UserResponse
    return {
        "id": 1,
        "username": user.username,
        "email": user.email,
        "created_at": datetime.now()
    }
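
The filtering effect of a response model can be approximated with Pydantic alone. The sketch below is only an illustration of the idea, not a description of FastAPI's internals: it validates a dict that still contains a password against UserResponse and then serializes the result.

```python
from datetime import datetime

from pydantic import BaseModel


class UserResponse(BaseModel):
    id: int
    username: str
    email: str
    created_at: datetime


# Imagine this dict came from the endpoint and still carries the password
raw = {
    "id": 1,
    "username": "zhangsan",
    "email": "zhangsan@example.com",
    "password": "password123",
    "created_at": datetime(2024, 1, 1, 12, 0, 0),
}

# Undeclared keys are ignored by default, so the password is dropped here
response = UserResponse.model_validate(raw)

# mode="json" converts values such as datetime into JSON-friendly types
payload = response.model_dump(mode="json")
```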

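7. Model Configuration

model_config

from pydantic import BaseModel, ConfigDict

class User(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,     # strip leading/trailing whitespace from strings
        str_min_length=1,              # minimum length for every str field
        validate_assignment=True,      # validate on attribute assignment too
        extra="forbid",                # reject undeclared fields
        populate_by_name=True,         # allow populating by field name as well as alias
    )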
    
    name: str
    age: int

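Common configuration options

Option                 Description
str_strip_whitespace   Strips leading and trailing whitespace from strings
validate_assignment    Also validates when an attribute is assigned
extra="forbid"         Rejects fields that are not defined on the model
extra="ignore"         Ignores fields that are not defined on the model (the default)
populate_by_name       Allows populating a field by its name as well as its alias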
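
The effect of these options is easiest to see side by side. In the sketch below, the alias userName and the extra field nickname are assumptions introduced only to demonstrate populate_by_name and extra="forbid".

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class User(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,
        validate_assignment=True,
        extra="forbid",
        populate_by_name=True,
    )

    name: str = Field(alias="userName")  # hypothetical alias for illustration
    age: int


# Populate through the alias; surrounding whitespace is stripped
by_alias = User(userName="  张三  ", age=25)

# populate_by_name also allows the plain field name
by_name = User(name="李四", age=30)

# validate_assignment: a bad assignment raises and the old value is kept
try:
    by_name.age = "not a number"
except ValidationError as exc:
    assignment_error = exc.errors()[0]["type"]

# extra="forbid": an undeclared field is rejected
try:
    User(name="王五", age=40, nickname="小王")
except ValidationError as exc:
    extra_error = exc.errors()[0]["type"]
```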

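8. Model Methods

model_validate with from_attributes (creating from an ORM object)

In Pydantic v2, from_orm is deprecated; set from_attributes=True and call model_validate instead.

from pydantic import BaseModel, ConfigDict

class UserResponse(BaseModel):
    id: int
    name: str
    email: str
    
    model_config = ConfigDict(from_attributes=True)

# Create from a SQLAlchemy model instance
# (`User` here stands for a SQLAlchemy ORM class defined elsewhere)
db_user = User(id=1, name="张三", email="test@example.com")
user = UserResponse.model_validate(db_user)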
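
The same mechanism works for any object that exposes the fields as attributes, which makes it easy to try without a database. In this sketch, UserRow is a hypothetical plain class standing in for an ORM row; it is not a SQLAlchemy model.

```python
from pydantic import BaseModel, ConfigDict


class UserRow:
    """Hypothetical stand-in for an ORM object; only its attributes matter."""

    def __init__(self, id: int, name: str, email: str, password_hash: str):
        self.id = id
        self.name = name
        self.email = email
        self.password_hash = password_hash


class UserResponse(BaseModel):
    model_config = ConfigDict(from_attributes=True)

    id: int
    name: str
    email: str


row = UserRow(1, "张三", "test@example.com", "not-a-real-hash")

# Fields are read from attributes; password_hash is not declared, so it is skipped
user = UserResponse.model_validate(row)
exported = user.model_dump()
```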

model_copy (copying a model)

user2 = user.model_copy(update={"name": "李四"})

model_dump (exporting to a dict)

user.model_dump()                        # all fields
user.model_dump(exclude={"password"})    # exclude fields
user.model_dump(include={"id", "name"})  # include only these fields

9. Complete Example

from datetime import datetime
from typing import Annotated
from fastapi import FastAPI
from pydantic import BaseModel, Field, field_validator, ConfigDict

app = FastAPI()

# Request model
class ItemCreate(BaseModel):
    model_config = ConfigDict(str_strip_whitespace=True)
    
    name: Annotated[str, Field(min_length=1, max_length=100)]
    description: str | None = None
    price: Annotated[float, Field(gt=0)]
    quantity: Annotated[int, Field(ge=0)] = 0
    
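    @field_validator("name")
    @classmethod
    def name_not_empty(cls, v: str) -> str:
        # Runs after the built-in checks (whitespace stripping, min_length=1),
        # so this is an explicit safeguard rather than the only line of defense
        if not v.strip():
            raise ValueError("Name must not be empty")
        return v

# Response model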
class ItemResponse(BaseModel):
    id: int
    name: str
    description: str | None
    price: float
    quantity: int
    created_at: datetime

@app.post("/items/", response_model=ItemResponse)
async def create_item(item: ItemCreate):
    return {
        "id": 1,
        **item.model_dump(),
        "created_at": datetime.now()
    }

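10. Summary

Topic                 Key points
Basic models          Inherit from BaseModel and declare fields with type hints
Field validation      Use Field() to add constraints
Nested models         Use a model as a field type; list[Model] is supported
Custom validation     @field_validator for a single field, @model_validator for the whole model
FastAPI integration   Request models are validated automatically; response models filter the output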