pydantic-guide - Complete manual of FastAPI data verification and model definition | Daoman PythonAI

#pydantic-guide: Building type-safe Python applications

Pydantic Overview

Pydantic is FastAPI's data validation layer. It uses Python type hints to validate, clean, and convert data, so you no longer have to write validation logic by hand. Whether the data is an API request body, a configuration file, or a database record, defining a model is enough to get type-checked validation automatically.

Why choose Pydantic?

  1. Type safety: validation is declared through Python type hints, so errors surface as soon as the data is loaded.
  2. Automatic conversion: the input string "25" is converted to the integer 25 where an int is expected, which removes most manual conversion code.
  3. Flexible constraints: Field makes it easy to add length, range, regular-expression, and other constraints.
  4. Performance: the core validation logic is written in Rust (pydantic-core) and is fast.
  5. Clear error messages: when validation fails, the error names the specific fields and reasons, which simplifies debugging.

Installation and basic usage

# Install Pydantic (v2 is recommended)
pip install pydantic

# Basic usage example
from pydantic import BaseModel
from typing import Optional

class User(BaseModel):
    name: str
    age: int
    email: Optional[str] = None

# Validation succeeds
user = User(name="John", age=30, email="john@example.com")
print(user.model_dump())  # {'name': 'John', 'age': 30, 'email': 'john@example.com'}

# Automatic type conversion: the string "25" becomes the integer 25
user_int_age = User(name="Jane", age="25", email="jane@example.com")
print(user_int_age.age, type(user_int_age.age))  # 25 <class 'int'>

After defining a model, pass in data just as you would when instantiating a normal class, and Pydantic validates it against the declared types and constraints. Data that does not meet the requirements raises a ValidationError immediately, so problems surface as early as possible.
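To make the failure path concrete, here is a minimal sketch that reuses the User model from above and feeds it an age that cannot be converted. It assumes Pydantic v2; the printed message wording may differ between versions.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int
    email: Optional[str] = None


try:
    # "abc" cannot be parsed as an integer, so validation fails
    User(name="John", age="abc")
except ValidationError as exc:
    errors = exc.errors()
    for err in errors:
        # Each entry names the failing field (loc), an error type, and a message
        print(err["loc"], err["type"], err["msg"])
```

exc.errors() returns a list of dictionaries, which is usually easier to work with in code than the formatted exception text.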

Basic model definition

Simple model

from pydantic import BaseModel, Field
from typing import Optional, List
from datetime import datetime, timezone

class Person(BaseModel):
    """Basic person model."""
    name: str
    age: int
    email: Optional[str] = None

class Document(BaseModel):
    """Document model demonstrating default values and factory functions."""
    title: str
    content: str
    tags: List[str] = Field(default_factory=list)  # a factory gives each instance its own list
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    views: int = 0
    metadata: dict = Field(default_factory=dict)

# Create instances
person = Person(name="Alice", age=28, email="alice@example.com")
doc = Document(title="Sample Document", content="This is a sample document.")

print(person.name)       # Alice
print(doc.created_at)    # current UTC time (e.g. 2025-12-03 12:34:56.789012+00:00)

A few things to note:

  • Optional[str] means the field accepts either str or None.
  • When a default value is a mutable type (such as a list or dict) or must be generated dynamically, use Field(default_factory=...) so that every instance gets its own fresh object.
  • For time fields, datetime.now(timezone.utc) produces a timezone-aware value and is preferred over the deprecated datetime.utcnow().
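The default_factory point can be checked with a short sketch. Note is a hypothetical model introduced only for this illustration.

```python
from typing import List

from pydantic import BaseModel, Field


class Note(BaseModel):  # hypothetical model for illustration
    tags: List[str] = Field(default_factory=list)


first = Note()
second = Note()
first.tags.append("draft")

print(first.tags)   # ['draft']
print(second.tags)  # []
```

Each instance calls the factory on creation, so mutating one list leaves the other untouched.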

Field validation and constraints

Using Field to constrain fields

from pydantic import BaseModel, Field

class Product(BaseModel):
    """Product model demonstrating various field constraints."""
    name: str = Field(..., min_length=3, max_length=100, description="Product name")
    price: float = Field(..., gt=0, le=10000, description="Product price")
    category: str = Field(..., pattern=r"^[a-zA-Z_][a-zA-Z0-9_]*$", description="Product category")
    stock: int = Field(default=0, ge=0, description="Units in stock")
    rating: float = Field(default=0.0, ge=0, le=5.0, description="Rating")

# Create a valid product instance
product = Product(
    name="Laptop",
    price=1299.99,
    category="electronics",
    stock=10,
    rating=4.5
)
print(product.model_dump())

Parameter description:

  • ... marks the field as required, with no default value.
  • gt, ge, lt, and le mean greater than, greater than or equal to, less than, and less than or equal to.
  • min_length / max_length limit the length of a string.
  • pattern takes a regular expression that the string must match.
  • description documents the field and is often used when generating API documentation automatically.

Data that violates a constraint raises a ValidationError immediately, and the error states which field caused it.
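As a sketch of what a constraint violation looks like, the example below uses Item, a hypothetical trimmed-down version of the Product model, and breaks three constraints at once. It assumes Pydantic v2, which collects all field errors in a single exception.

```python
from pydantic import BaseModel, Field, ValidationError


class Item(BaseModel):  # hypothetical, trimmed-down Product
    name: str = Field(..., min_length=3, max_length=100)
    price: float = Field(..., gt=0, le=10000)
    stock: int = Field(default=0, ge=0)


try:
    # name is too short, price is not > 0, stock is not >= 0
    Item(name="TV", price=-5, stock=-1)
except ValidationError as exc:
    # Map each failing field name to its error type
    failed = {err["loc"][0]: err["type"] for err in exc.errors()}
    print(failed)
```

All three fields are reported together, so a client can correct every problem in one round trip instead of fixing them one at a time.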

Custom validator

When the built-in constraints are not enough, you can write your own validation logic. Pydantic provides two decorators for this: @field_validator and @model_validator.

Single field validator

from pydantic import BaseModel, field_validator
import re

class UserValidation(BaseModel):
    """User model demonstrating custom single-field validation."""
    username: str
    email: str
    password: str
    confirm_password: str

    @field_validator('username')
    @classmethod
    def validate_username(cls, v):
        # Length and character restrictions
        if len(v) < 3 or len(v) > 20:
            raise ValueError('Username must be between 3 and 20 characters long')
        if not re.match(r'^[a-zA-Z0-9_]+$', v):
            raise ValueError('Username may only contain letters, digits, and underscores')
        return v.lower()  # a validator may also transform the value, e.g. normalize to lowercase

    @field_validator('email')
    @classmethod
    def validate_email(cls, v):
        email_regex = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
        if not re.match(email_regex, v):
            raise ValueError('Invalid email format')
        return v.lower()

    @field_validator('confirm_password')
    @classmethod
    def passwords_match(cls, v, info):
        # info.data holds the fields that have already been validated
        if 'password' in info.data and v != info.data['password']:
            raise ValueError('Password confirmation does not match')
        return v

A field validator method receives the class (cls), the value (v), and an optional info parameter. info.data is a dictionary of the sibling fields that have already been validated, which makes it well suited to comparing two fields. Fields are validated in definition order, so only fields declared earlier are available.
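The following sketch exercises both behaviors on SignUp, a hypothetical reduced version of the model above: a validator that transforms its value, and one that compares against a sibling field through info.data. It assumes Pydantic v2.

```python
from pydantic import BaseModel, ValidationError, field_validator


class SignUp(BaseModel):  # hypothetical model for illustration
    username: str
    password: str
    confirm_password: str

    @field_validator("username")
    @classmethod
    def normalize_username(cls, v: str) -> str:
        # The returned value replaces the input
        return v.lower()

    @field_validator("confirm_password")
    @classmethod
    def passwords_match(cls, v: str, info):
        # password is declared earlier, so it is already in info.data
        if "password" in info.data and v != info.data["password"]:
            raise ValueError("passwords do not match")
        return v


ok = SignUp(username="Alice_01", password="s3cret", confirm_password="s3cret")
print(ok.username)  # alice_01

try:
    SignUp(username="bob", password="s3cret", confirm_password="other")
except ValidationError as exc:
    mismatch = exc.errors()[0]
    print(mismatch["loc"], mismatch["msg"])
```

A ValueError raised inside a validator is wrapped into the ValidationError and attributed to the field the validator is attached to.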

Cross-field validator

If you need to validate several fields together (for example a date range or a capacity limit), use @model_validator:

from pydantic import BaseModel, model_validator
from datetime import date
from typing import Optional

class Booking(BaseModel):
    """Booking model demonstrating cross-field validation."""
    customer_name: str
    check_in: date
    check_out: date
    room_type: str
    adults: int
    children: int = 0

    @model_validator(mode='after')
    def validate_booking_logic(self):
        # Date logic checks
        if self.check_in >= self.check_out:
            raise ValueError('Check-out date must be later than check-in date')
        if self.check_in < date.today():
            raise ValueError('Cannot book a date in the past')

        # Guest capacity check; unknown room types fall back to a capacity of 2
        total_guests = self.adults + self.children
        room_capacity = {'single': 2, 'double': 4, 'suite': 6, 'deluxe': 8}
        capacity = room_capacity.get(self.room_type, 2)
        if total_guests > capacity:
            raise ValueError(f'Room type {self.room_type} holds at most {capacity} guests')

        return self

mode='after' means the validator runs after every field has passed its own validation. At that point self is a fully populated instance, so you can read any field and apply more complex rules.
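A small sketch of an after-mode validator in action, using Stay, a hypothetical two-field model. The fixed dates in 2030 are arbitrary sample values, and the example assumes Pydantic v2.

```python
from datetime import date

from pydantic import BaseModel, ValidationError, model_validator


class Stay(BaseModel):  # hypothetical model for illustration
    check_in: date
    check_out: date

    @model_validator(mode="after")
    def check_dates(self):
        # Both fields are already parsed into date objects here
        if self.check_in >= self.check_out:
            raise ValueError("check_out must be later than check_in")
        return self


stay = Stay(check_in="2030-01-01", check_out="2030-01-05")
nights = (stay.check_out - stay.check_in).days
print(nights)  # 4

try:
    Stay(check_in="2030-01-05", check_out="2030-01-01")
except ValidationError as exc:
    model_error = exc.errors()[0]
    print(model_error["loc"], model_error["msg"])
```

Because the error belongs to the model as a whole and not to one field, its loc is an empty tuple.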

Nested models and complex structures

Real-world business data is rarely flat. Pydantic supports model nesting and inheritance to help you build clear data structures.

Nested model

from pydantic import BaseModel, Field
from typing import List, Optional
from datetime import datetime, timezone

class Address(BaseModel):
    """Address sub-model."""
    street: str = Field(..., min_length=5, max_length=100)
    city: str = Field(..., min_length=2, max_length=50)
    country: str = Field(..., min_length=2, max_length=50)
    postal_code: str = Field(..., pattern=r"^[0-9]{5}(-[0-9]{4})?$")

class Employee(BaseModel):
    """Employee model with an embedded address and a list of skills."""
    employee_id: int = Field(..., gt=0)
    first_name: str = Field(..., min_length=2, max_length=50)
    last_name: str = Field(..., min_length=2, max_length=50)
    position: str
    hire_date: datetime
    address: Address
    skills: List[str] = Field(default_factory=list, max_length=20)  # at most 20 entries (max_items is deprecated in v2)

# Create a nested model instance
address = Address(
    street="123 Main St",
    city="Anytown",
    country="USA",
    postal_code="12345"
)

employee = Employee(
    employee_id=1001,
    first_name="John",
    last_name="Doe",
    position="Software Engineer",
    hire_date=datetime.now(timezone.utc),
    address=address,
    skills=["Python", "FastAPI", "Docker"]
)

print("Employee info:", employee.model_dump())

Pydantic validates nested models recursively, and an error at any level reports the exact path to the offending field.
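The error path can be seen in a reduced sketch. The Address and Employee models here are shortened versions of the ones above, and the nested data is passed as a plain dict to show that Pydantic builds the sub-model itself.

```python
from pydantic import BaseModel, Field, ValidationError


class Address(BaseModel):
    city: str
    postal_code: str = Field(..., pattern=r"^[0-9]{5}(-[0-9]{4})?$")


class Employee(BaseModel):
    name: str
    address: Address


try:
    # A plain dict is accepted for the nested model and validated recursively
    Employee(name="John", address={"city": "Anytown", "postal_code": "ABC"})
except ValidationError as exc:
    nested_error = exc.errors()[0]
    print(nested_error["loc"])  # ('address', 'postal_code')
```

The loc tuple lists each step from the outer model down to the failing field.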

Model inheritance

from pydantic import BaseModel, Field
from typing import List
from datetime import datetime, timezone

class BaseModelExtended(BaseModel):
    """Reusable base model with common fields."""
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    updated_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    is_active: bool = True
    
    model_config = {
        "validate_assignment": True,  # also validate on attribute assignment
        "extra": "forbid"             # reject unknown fields
    }

class User(BaseModelExtended):
    """User model inheriting from the base model."""
    user_id: int = Field(..., gt=0)
    username: str = Field(..., min_length=3, max_length=50)
    email: str = Field(..., pattern=r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
    roles: List[str] = Field(default_factory=list)

class AdminUser(User):
    """Admin user inheriting from the regular user."""
    permissions: List[str] = Field(default=["read", "write"])
    is_super_admin: bool = False

Inheritance allows models to reuse common fields and configurations, avoiding duplication of definitions.
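Inherited configuration applies to subclasses as well. The sketch below uses Base and Account, hypothetical names, to show both settings from the base model taking effect on a child model. It assumes Pydantic v2.

```python
from pydantic import BaseModel, ValidationError


class Base(BaseModel):  # hypothetical base model
    is_active: bool = True
    model_config = {"validate_assignment": True, "extra": "forbid"}


class Account(Base):  # inherits fields and model_config
    user_id: int


account = Account(user_id=1)

try:
    account.user_id = "not-a-number"  # rejected by validate_assignment
except ValidationError as exc:
    assignment_error = exc.errors()[0]["type"]

try:
    Account(user_id=2, nickname="x")  # rejected by extra="forbid"
except ValidationError as exc:
    extra_error = exc.errors()[0]["type"]

print(assignment_error, extra_error)
print(account.user_id)  # 1, the failed assignment left the value unchanged
```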

Data conversion and serialization

Besides validating data, Pydantic can export models to dictionaries, JSON, and other formats, and rebuild models from those formats.

from pydantic import BaseModel, Field, field_serializer
from typing import List, Dict
from datetime import datetime, timezone

class SerializationModel(BaseModel):
    """Model demonstrating serialization."""
    id: int
    name: str
    metadata: Dict[str, str] = Field(default_factory=dict)
    tags: List[str] = Field(default_factory=list)
    created_at: datetime = Field(default_factory=lambda: datetime.now(timezone.utc))
    
    @field_serializer('created_at')
    def serialize_datetime(self, dt: datetime, _info):
        # Custom serialized form for datetime
        return dt.isoformat()

# Create a model instance
model_instance = SerializationModel(
    id=1,
    name="Test Item",
    metadata={"category": "test", "priority": "high"},
    tags=["important", "urgent"]
)

# Export to a dict (created_at is already an ISO string because of the custom serializer)
dict_data = model_instance.model_dump()
print("As dict:", dict_data)

# Export to a JSON string
json_data = model_instance.model_dump_json()
print("As JSON:", json_data)

# Rebuild the model from the dict
reconstructed = SerializationModel.model_validate(dict_data)
print("Rebuilt from dict:", reconstructed.model_dump())

@field_serializer customizes how a field is written out during serialization, for example formatting a datetime object as an ISO 8601 string. Pydantic already does this for datetime when dumping to JSON, but a custom serializer lets you implement more specialized logic.
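To see the default behavior without a custom serializer, the sketch below dumps the same instance in Python mode and in JSON mode. Event is a hypothetical model and the timestamp is an arbitrary sample value; Pydantic v2 is assumed.

```python
from datetime import datetime, timezone

from pydantic import BaseModel


class Event(BaseModel):  # hypothetical model for illustration
    name: str
    created_at: datetime


event = Event(
    name="deploy",
    created_at=datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)

python_dump = event.model_dump()           # keeps the datetime object
json_dump = event.model_dump(mode="json")  # JSON-compatible values only

print(type(python_dump["created_at"]).__name__)  # datetime
print(json_dump["created_at"])                   # an ISO 8601 string

# The JSON-mode dict can be validated back into an equivalent model
restored = Event.model_validate(json_dump)
```

mode="json" is useful when the dict is handed to something that only understands JSON types, while plain model_dump() keeps rich Python objects.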

Practical application cases

API request model

from pydantic import BaseModel, Field, field_validator
from typing import Optional, Dict, Any
from datetime import datetime, timezone
import re

class APIRequestModel(BaseModel):
    """Example API request model."""
    api_key: str = Field(..., min_length=32, max_length=64, description="API key")
    user_id: Optional[int] = Field(None, gt=0, description="User ID")
    action: str = Field(..., description="API action")
    parameters: Dict[str, Any] = Field(default_factory=dict, description="Request parameters")
    timestamp: datetime = Field(default_factory=lambda: datetime.now(timezone.utc),
                                description="Request timestamp")
    priority: int = Field(default=1, ge=1, le=5, description="Request priority")

    @field_validator('api_key')
    @classmethod
    def validate_api_key(cls, v):
        if not re.match(r'^[A-Za-z0-9]{32,64}$', v):
            raise ValueError('Invalid API key format')
        return v

    @field_validator('action')
    @classmethod
    def validate_action(cls, v):
        allowed_actions = {
            'create_user', 'update_user', 'delete_user',
            'get_user', 'list_users'
        }
        if v not in allowed_actions:
            raise ValueError(f'Action not allowed: {v}')
        return v

This model defines the structure of a standard API request. It validates the field formats and restricts the set of allowed actions, so malformed or unexpected requests are rejected before they reach your business logic.
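When the set of allowed values is fixed, typing.Literal is an alternative to a hand-written validator. This is a different technique from the validate_action method above, shown here only as a sketch; ActionRequest and the Action alias are hypothetical names, and Pydantic v2 is assumed.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError

# Hypothetical alias listing the same actions as the validator above
Action = Literal["create_user", "update_user", "delete_user", "get_user", "list_users"]


class ActionRequest(BaseModel):  # hypothetical model for illustration
    action: Action
    priority: int = 1


request = ActionRequest(action="get_user")
print(request.action)  # get_user

try:
    ActionRequest(action="drop_table")
except ValidationError as exc:
    literal_error = exc.errors()[0]["type"]
    print(literal_error)
```

A Literal type also shows up as an enumeration in the generated JSON schema, whereas a custom validator leaves the field documented as a plain string.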

Configuration management model

from pydantic import BaseModel, Field
from typing import Optional
import os

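class DatabaseConfig(BaseModel):
    """Database configuration."""
    url: str = Field(..., description="Database connection URL")
    pool_size: int = Field(default=5, ge=1, le=100, description="Connection pool size")
    echo: bool = Field(default=False, description="Whether to log SQL statements")

class AppConfig(BaseModel):
    """Application configuration that can be loaded from environment variables."""
    app_name: str = Field(default='MyApp', description="Application name")
    debug: bool = Field(default=False, description="Debug mode")
    host: str = Field(default='127.0.0.1', description="Host to listen on")
    port: int = Field(default=8000, ge=1, le=65535, description="Port to listen on")
    # url is required, so the default factory must supply one (same fallback as from_env)
    database: DatabaseConfig = Field(
        default_factory=lambda: DatabaseConfig(url='sqlite:///./test.db'),
        description="Database configuration",
    )
    secret_key: str = Field(..., min_length=32, description="Secret key")

    @classmethod
    def from_env(cls):
        """Build the configuration from environment variables."""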
        return cls(
            app_name=os.getenv('APP_NAME', 'MyApp'),
            debug=os.getenv('DEBUG', 'false').lower() == 'true',
            host=os.getenv('HOST', '127.0.0.1'),
            port=int(os.getenv('PORT', '8000')),
            secret_key=os.getenv('SECRET_KEY', ''),
            database=DatabaseConfig(
                url=os.getenv('DATABASE_URL', 'sqlite:///./test.db'),
                pool_size=int(os.getenv('DB_POOL_SIZE', '5')),
                echo=os.getenv('DB_ECHO', 'false').lower() == 'true'
            )
        )

Defining the configuration as a Pydantic model lets you check its completeness and format in one place when the application starts, instead of hitting hidden configuration errors at runtime. The from_env factory method keeps loading from environment variables simple and explicit.
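The fail-fast behavior can be sketched with a reduced model. Settings is a hypothetical two-field stand-in for AppConfig, and its from_env takes the environment mapping as a parameter (an assumption made here so the example can run against a plain dict instead of the real process environment).

```python
import os

from pydantic import BaseModel, Field, ValidationError


class Settings(BaseModel):  # hypothetical, reduced AppConfig
    port: int = Field(default=8000, ge=1, le=65535)
    secret_key: str = Field(..., min_length=32)

    @classmethod
    def from_env(cls, env=os.environ):
        # Pass raw strings through; Pydantic converts and range-checks them
        return cls(port=env.get("PORT", "8000"), secret_key=env.get("SECRET_KEY", ""))


good = Settings.from_env({"PORT": "9000", "SECRET_KEY": "x" * 32})
print(good.port)  # 9000

try:
    # Port out of range and no secret key configured
    Settings.from_env({"PORT": "70000"})
except ValidationError as exc:
    bad_fields = sorted(err["loc"][0] for err in exc.errors())
    print(bad_fields)  # ['port', 'secret_key']
```

Letting Pydantic parse the raw string, instead of calling int() first, means a non-numeric PORT is reported as a validation error alongside any other problems.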

When using Pydantic for data validation, define field constraints explicitly, rely on type hints, use validators where they fit, and handle validation errors properly in production. For high-frequency validation, consider caching validation results, precompiling regular expressions, and validating in batches to improve performance.
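One way to validate in batches is Pydantic v2's TypeAdapter, built once and reused. The sketch below uses Point, a hypothetical model, and makes no performance claim; measure in your own workload before relying on it.

```python
from typing import List

from pydantic import BaseModel, TypeAdapter


class Point(BaseModel):  # hypothetical model for illustration
    x: int
    y: int


# Build the adapter once (for example at import time) and reuse it
POINTS = TypeAdapter(List[Point])

# Validate a whole list of dicts in a single call
points = POINTS.validate_python([{"x": 1, "y": 2}, {"x": "3", "y": 4}])
print(points[1].x)  # 3

# The same adapter can parse a JSON payload directly
from_json = POINTS.validate_json('[{"x": 5, "y": 6}]')
print(from_json[0].y)  # 6
```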

Summary

Pydantic transforms Python type hints into real data validation capabilities. Through clearly defined models, rich field constraints, and flexible custom validators, you can build a robust and maintainable data layer with minimal code. In FastAPI applications, Pydantic is the core cornerstone of request/response and configuration management. Mastering it will greatly improve your development efficiency and code quality.