Web Vision Applications: FastAPI, Image Processing, and AI Service Deployment Explained



Introduction

Web vision applications are the bridge between deep learning models and ordinary users. With them, users do not need to configure a complex Python/CUDA environment: they can try AI features such as style transfer and object detection simply by opening a browser, while developers can iterate quickly and deploy commercially through a unified API.

This article walks you step by step through building a usable web vision application along five practical dimensions: FastAPI framework basics, PyTorch model serving, RESTful API design, a minimalist front end talking to the back end, and Docker container deployment.


1. Architecture of a web vision application at a glance

A typical web vision application can usually be broken down into five layers:

  1. Front-end layer: graphical interface, image compression and upload, real-time result display
  2. API layer: request routing, parameter validation, CORS handling, background tasks
  3. Model service layer: image pre-processing/post-processing, GPU/CPU inference, result caching
  4. Storage layer: temporary images, logs, caching of frequently requested results
  5. Deployment layer: containerization, load balancing, health monitoring

Once you understand these layers, you will be able to tailor and expand your services more freely.
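
To make the "result caching" item in the model service and storage layers concrete, here is a minimal sketch of an in-memory cache keyed by the hash of the uploaded bytes. The names ResultCache and cached_infer are hypothetical and do not appear elsewhere in this article; a production service would more likely use Redis or a similar shared store.

```python
import hashlib
from collections import OrderedDict
from typing import Callable, Optional


class ResultCache:
    """Tiny in-memory LRU cache keyed by the SHA-256 of the uploaded bytes (illustrative only)."""

    def __init__(self, max_items: int = 128):
        self.max_items = max_items
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    @staticmethod
    def key_for(image_bytes: bytes) -> str:
        # Identical uploads hash to the same key, so their result can be reused
        return hashlib.sha256(image_bytes).hexdigest()

    def get(self, image_bytes: bytes) -> Optional[bytes]:
        key = self.key_for(image_bytes)
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        return None

    def put(self, image_bytes: bytes, result: bytes) -> None:
        key = self.key_for(image_bytes)
        self._items[key] = result
        self._items.move_to_end(key)
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # evict the least recently used entry


def cached_infer(cache: ResultCache, image_bytes: bytes,
                 infer_fn: Callable[[bytes], bytes]) -> bytes:
    """Return a cached result when available, otherwise run infer_fn and store its output."""
    hit = cache.get(image_bytes)
    if hit is not None:
        return hit
    result = infer_fn(image_bytes)
    cache.put(image_bytes, result)
    return result
```

Hashing the raw bytes means two uploads of the same file skip inference entirely, at the cost of keeping results in memory.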


2. FastAPI: the preferred framework for high-performance APIs

FastAPI is an asynchronous web framework for Python 3.7+. Its core strengths suit AI serving scenarios particularly well:

  • 🚀 High performance: built on Starlette and Pydantic; its documentation describes performance on par with NodeJS and Go
  • 📝 Automatic documentation: generates interactive Swagger UI / ReDoc documentation automatically
  • 🔍 Type safety: uses native Python type hints to validate requests and responses automatically
  • Async support: native async/await keeps I/O from blocking the event loop; CPU- or GPU-bound inference should still run in a worker thread or process
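
The async point deserves a small illustration: awaiting I/O frees the event loop, but a blocking call such as model inference does not, unless it is moved to a worker thread. The sketch below uses only the standard library (asyncio.to_thread needs Python 3.9+); blocking_infer is a hypothetical stand-in for real inference, and the heartbeat task exists only to show that the loop keeps running.

```python
import asyncio
import time


def blocking_infer(seconds: float) -> str:
    """Stand-in for model inference: blocks the calling thread."""
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks: list, interval: float, count: int) -> None:
    """Appends a timestamp per interval; it only advances while the event loop is free."""
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.monotonic())


async def main() -> tuple:
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks, 0.02, 5))
    # Run the blocking call in a worker thread so the loop can keep serving other tasks
    result = await asyncio.to_thread(blocking_infer, 0.2)
    ticks_during_inference = len(ticks)
    await beat
    return result, ticks_during_inference
```

Running asyncio.run(main()) returns the inference result together with the number of heartbeat ticks recorded while inference was in progress. In FastAPI, a similar effect comes from declaring the endpoint with plain def (it then runs in a thread pool) or from calling run_in_threadpool from fastapi.concurrency.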

2.1 The simplest visual API skeleton

First build a minimal API skeleton to receive image uploads, verify files, and return responses in a unified format.

from fastapi import FastAPI, File, UploadFile, HTTPException
from pydantic import BaseModel, Field
from typing import Optional
from datetime import datetime
import uuid

# Initialize the FastAPI application
app = FastAPI(
    title="AI Style Transfer API",
    description="Minimalist style transfer web service",
    version="1.0.0",
    docs_url="/docs",
    redoc_url="/redoc"
)

# Unified response format (reused by all later endpoints)
class StandardResponse(BaseModel):
    success: bool
    message: str
    data: Optional[dict] = None
    # default_factory runs once per response; a plain default would be evaluated
    # a single time at import and then shared by every response
    timestamp: datetime = Field(default_factory=datetime.now)
    request_id: str = Field(default_factory=lambda: str(uuid.uuid4()))

# Root path
@app.get("/", response_model=StandardResponse)
async def root():
    return StandardResponse(
        success=True,
        message="Welcome to the AI vision service; see /docs for the API documentation",
        data={"api_version": "1.0.0"}
    )

# Temporary image upload validation endpoint
@app.post("/api/v1/upload", response_model=StandardResponse)
async def upload_temp_image(file: UploadFile = File(...)):
    # 1. Check the declared file type first (cheap, no read required)
    allowed_mime = ["image/jpeg", "image/png", "image/jpg"]
    if file.content_type not in allowed_mime:
        raise HTTPException(status_code=400, detail="Only JPEG/PNG images are supported")
    # 2. Check the file size (10 MB limit)
    file_size = len(await file.read())
    if file_size > 10 * 1024 * 1024:
        raise HTTPException(status_code=413, detail="File size must not exceed 10MB")
    # 3. Reset the file pointer so later code can read the file again
    await file.seek(0)

    return StandardResponse(
        success=True,
        message="Temporary image uploaded successfully",
        data={"filename": file.filename, "size": file_size}
    )
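
One caveat about the type check: content_type is supplied by the client and can be wrong or forged. A common complement is to inspect the first bytes of the file. The following is a minimal sketch with a hypothetical helper name, sniff_image_type, covering only the JPEG and PNG signatures:

```python
from typing import Optional

# Leading "magic" bytes that identify each format
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def sniff_image_type(header: bytes) -> Optional[str]:
    """Return the MIME type implied by the leading bytes, or None if unrecognized."""
    if header.startswith(JPEG_MAGIC):
        return "image/jpeg"
    if header.startswith(PNG_MAGIC):
        return "image/png"
    return None
```

Inside the upload endpoint, the first few bytes returned by file.read() could be passed to this helper, and the request rejected when the result is None. Fully decoding the image (for example with Pillow) is a stricter check still.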

The upload skeleton is now a runnable, testable API. Next, we plug in a real AI model.


3. Serving a PyTorch model in practice

To turn a locally trained .pth model into an API-callable service, three core issues need to be solved: a lock around model loading and inference, standardized image preprocessing, and post-processing of inference results.

3.1 Encapsulating a vision model base class

Let's first encapsulate a general VisionModel base class that all style transfer, classification, and detection models can inherit from.

import torch
import torchvision.transforms as transforms
from PIL import Image
import os
import threading
from typing import Optional

class VisionModel:
    """Base class for vision model services"""
    def __init__(
        self,
        model_path: str,
        device: Optional[str] = None,
        input_size: tuple = (224, 224),
        mean: tuple = (0.485, 0.456, 0.406),   # tuples avoid shared mutable defaults
        std: tuple = (0.229, 0.224, 0.225)
    ):
        # 1. Pick the device automatically (GPU preferred)
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
        # 2. Model lock (prevents concurrent loading/inference from multiple threads)
        self._model_lock = threading.Lock()
        # 3. Preprocessing and post-processing parameters
        self.input_size = input_size
        self.mean = mean
        self.std = std
        # 4. Load the model and switch it to inference mode
        self.model = self._load_model(model_path)
        self.model.eval()
        # 5. Build the preprocessing and post-processing transforms
        self.preprocess_transform = self._build_preprocess()
        self.postprocess_transform = self._build_postprocess()

    def _load_model(self, model_path: str):
        """Load a PyTorch model (for local testing; consider converting to ONNX/TensorRT for production)"""
        if not os.path.exists(model_path):
            raise FileNotFoundError(f"Model file not found: {model_path}")
        with self._model_lock:
            # Assumes the .pth file holds a fully pickled nn.Module, not a state_dict.
            # weights_only=False is needed for that on recent PyTorch releases;
            # unpickling can execute code, so only load files you trust.
            model = torch.load(model_path, map_location=self.device, weights_only=False)
        return model.to(self.device)

    def _build_preprocess(self):
        """Build the image preprocessing transform"""
        return transforms.Compose([
            transforms.Resize(self.input_size),
            transforms.ToTensor(),
            transforms.Normalize(mean=self.mean, std=self.std)
        ])

    def _build_postprocess(self):
        """Build the image post-processing transform (for style transfer; override for classification/detection)"""
        # Normalize with these values computes x * std + mean, undoing the preprocessing
        mean_inv = [-m/s for m, s in zip(self.mean, self.std)]
        std_inv = [1/s for s in self.std]
        return transforms.Compose([
            transforms.Normalize(mean=mean_inv, std=std_inv),
            transforms.Lambda(lambda x: torch.clamp(x, 0, 1)),
            transforms.ToPILImage()
        ])

    def preprocess(self, image: Image.Image) -> torch.Tensor:
        """Preprocess a single image"""
        return self.preprocess_transform(image).unsqueeze(0).to(self.device)

    def postprocess(self, output: torch.Tensor) -> Image.Image:
        """Post-process a single style transfer result"""
        output = output.squeeze(0).cpu()
        return self.postprocess_transform(output)

    def infer(self, image: Image.Image) -> Image.Image:
        """Inference under a lock (thread-safe)"""
        input_tensor = self.preprocess(image)
        with torch.no_grad():          # disable gradient tracking to save memory
            with self._model_lock:
                output_tensor = self.model(input_tensor)
        return self.postprocess(output_tensor)
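
The inverse normalization in _build_postprocess is easy to get wrong, so it is worth checking the arithmetic. Normalize computes (x - mean) / std per channel; applying it again with mean = -m/s and std = 1/s gives (y + m/s) * s = y * s + m, which recovers the original value. The pure-Python sketch below verifies this on one pixel; the normalize helper is a hypothetical stand-in that mirrors the formula, not the torchvision implementation.

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]


def normalize(values, mean, std):
    """Per-channel (x - mean) / std, the same formula transforms.Normalize applies."""
    return [(v - m) / s for v, m, s in zip(values, mean, std)]


# Parameters of the inverse transform, as built in _build_postprocess
mean_inv = [-m / s for m, s in zip(mean, std)]
std_inv = [1 / s for s in std]

pixel = [0.2, 0.5, 0.9]                                # one RGB pixel in [0, 1]
normalized = normalize(pixel, mean, std)               # what the model sees
restored = normalize(normalized, mean_inv, std_inv)    # back to [0, 1]

round_trip_ok = all(abs(a - b) < 1e-9 for a, b in zip(pixel, restored))
```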

3.2 Instantiate a style transfer model

Assuming we already have a trained model file at models/starry_night.pth, the base class can be instantiated directly:

# Instantiate the style transfer model (with a larger input size)
try:
    style_model = VisionModel(
        model_path="models/starry_night.pth",
        input_size=(512, 512)  # a higher resolution is advisable for style transfer
    )
    print(f"✅ Style transfer model loaded, running on: {style_model.device}")
except Exception as e:
    print(f"❌ Failed to load the model: {str(e)}")
    style_model = None

In this way, the model service layer is ready.


4. Complete implementation of the RESTful style transfer API

Combining the earlier API skeleton with the model, and adding temporary file management and exception handling, gives a complete style transfer endpoint.

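from fastapi.responses import FileResponse
from starlette.background import BackgroundTask
import shutil
import tempfile

# ... (FastAPI initialization, StandardResponse, and VisionModel code from earlier)

# The endpoint returns the image file itself, so FileResponse is declared
# instead of StandardResponse. A plain `def` endpoint runs in FastAPI's thread
# pool, so blocking inference does not stall the event loop.
@app.post("/api/v1/style-transfer", response_class=FileResponse)
def style_transfer(file: UploadFile = File(...)):
    if style_model is None:
        raise HTTPException(status_code=503, detail="Model is not loaded; please contact the administrator")

    # Create a temporary directory
    temp_dir = tempfile.mkdtemp()
    # Keep only the base name so a crafted filename cannot escape temp_dir
    original_name = os.path.basename(file.filename or "upload.jpg")
    temp_input = os.path.join(temp_dir, original_name)
    temp_output = os.path.join(temp_dir, f"result_{uuid.uuid4().hex[:8]}.jpg")

    try:
        # 1. Save the original image uploaded by the user
        with open(temp_input, "wb") as f:
            shutil.copyfileobj(file.file, f)

        # 2. Read the image and run inference
        image = Image.open(temp_input).convert("RGB")
        result_img = style_model.infer(image)

        # 3. Save the result
        result_img.save(temp_output, quality=95)

        # 4. Return the result file; the temporary directory is removed after
        #    the response has been sent. Passing shutil.rmtree(temp_dir) directly
        #    would delete the directory before the file could be served.
        return FileResponse(
            path=temp_output,
            filename=f"starry_night_{original_name}",
            media_type="image/jpeg",
            background=BackgroundTask(shutil.rmtree, temp_dir, ignore_errors=True)
        )

    except Exception as e:
        shutil.rmtree(temp_dir, ignore_errors=True)   # clean up on failure too
        raise HTTPException(status_code=500, detail=f"Inference failed: {str(e)}")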
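
Client-supplied filenames should not be trusted when building server-side paths. A stricter alternative to reusing the uploaded name is to generate the stored name on the server and keep only a whitelisted extension. This is a minimal sketch; the helper safe_upload_name and its allowed-extension set are assumptions for illustration.

```python
import os
import uuid
from typing import Optional

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def safe_upload_name(filename: Optional[str], default_ext: str = ".jpg") -> str:
    """Return a random, server-generated file name that keeps only a whitelisted extension."""
    # Normalize Windows separators, then drop any directory components
    base = os.path.basename((filename or "").replace("\\", "/"))
    ext = os.path.splitext(base)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        ext = default_ext
    return f"{uuid.uuid4().hex}{ext}"
```

The generated name can then be joined with the temporary directory, while the original name is kept only for display purposes.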

Production tip: in a real online system, storing images on the server's local disk is not recommended. Use object storage (OSS/S3) and return a signed download link instead.
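
To show the idea behind a signed download link, here is a simplified sketch using an HMAC over the path and an expiry time. It is not the signing scheme used by S3 or OSS, which provide presigned URLs through their own SDKs; SECRET_KEY, the query parameter names, and all function names are assumptions for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"change-me"  # hypothetical; load from configuration in practice


def sign_download(path: str, expires_at: int, key: bytes = SECRET_KEY) -> str:
    """HMAC-SHA256 over the path and expiry, hex encoded."""
    message = f"{path}:{expires_at}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def make_signed_url(path: str, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Build a link that stays valid for ttl_seconds from now."""
    issued = time.time() if now is None else now
    expires_at = int(issued) + ttl_seconds
    return f"{path}?expires={expires_at}&sig={sign_download(path, expires_at)}"


def verify_download(path: str, expires_at: int, sig: str, now: Optional[float] = None) -> bool:
    """Accept the link only if it has not expired and the signature matches."""
    current = time.time() if now is None else now
    if current > expires_at:
        return False
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(sig, sign_download(path, expires_at))
```

Because the expiry is part of the signed message, a client cannot extend the link's lifetime by editing the expires parameter.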


5. Minimalist front-end interactive interface

The HTML page below supports drag-and-drop upload and real-time preview, does not depend on any front-end framework, and can be served directly as a FastAPI static file.

<!-- static/index.html -->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>AI Van Gogh Style Transfer</title>
    <style>
        :root {
            --primary: #4f46e5;
            --primary-hover: #4338ca;
            --bg: #f8fafc;
            --card: #ffffff;
            --text: #1e293b;
        }
        * { margin: 0; padding: 0; box-sizing: border-box; font-family: 'Segoe UI', sans-serif; }
        body { background: var(--bg); color: var(--text); padding: 2rem; max-width: 1200px; margin: 0 auto; }
        h1 { text-align: center; margin-bottom: 2rem; color: var(--primary); }
        .card { background: var(--card); border-radius: 1rem; padding: 2rem; box-shadow: 0 4px 6px -1px rgba(0,0,0,0.1); }
        .upload-area { border: 2px dashed #94a3b8; border-radius: 0.75rem; padding: 3rem; text-align: center; cursor: pointer; transition: all 0.3s; margin-bottom: 2rem; }
        .upload-area:hover, .upload-area.drag-over { border-color: var(--primary); background: #f1f5f9; }
        .controls { display: flex; gap: 1rem; justify-content: center; margin-bottom: 2rem; }
        button { background: var(--primary); color: white; border: none; padding: 0.75rem 2rem; border-radius: 0.5rem; font-size: 1rem; cursor: pointer; transition: background 0.3s; }
        button:hover:not(:disabled) { background: var(--primary-hover); }
        button:disabled { background: #94a3b8; cursor: not-allowed; }
        .preview-grid { display: grid; grid-template-columns: repeat(auto-fit, minmax(300px, 1fr)); gap: 2rem; }
        .preview-box h3 { text-align: center; margin-bottom: 1rem; font-weight: 500; }
        .preview-box img { width: 100%; height: auto; border-radius: 0.5rem; box-shadow: 0 2px 4px rgba(0,0,0,0.05); display: none; }
        .loading { text-align: center; padding: 2rem; color: #64748b; display: none; }
        .error { color: #dc2626; text-align: center; padding: 1rem; background: #fee2e2; border-radius: 0.5rem; margin-bottom: 1rem; display: none; }
    </style>
</head>
<body>
    <div class="card">
        <h1>🎨 AI Van Gogh "Starry Night" Style Transfer</h1>

        <div class="error" id="error"></div>
        <div class="loading" id="loading">Turning your photo into Van Gogh's style...</div>

        <div class="upload-area" id="uploadArea">
            <p>Click or drag a photo here (JPEG/PNG, ≤10MB)</p>
            <input type="file" id="fileInput" accept="image/jpeg,image/png" style="display: none;">
        </div>

        <div class="controls">
            <button id="processBtn" onclick="processImage()" disabled>Start transfer</button>
        </div>

        <div class="preview-grid">
            <div class="preview-box">
                <h3>📷 Original photo</h3>
                <img id="originalImg">
            </div>
            <div class="preview-box">
                <h3>🖼️ Stylized result</h3>
                <img id="resultImg">
            </div>
        </div>
    </div>

    <script>
        // Key DOM elements
        const fileInput = document.getElementById('fileInput');
        const uploadArea = document.getElementById('uploadArea');
        const processBtn = document.getElementById('processBtn');
        const originalImg = document.getElementById('originalImg');
        const resultImg = document.getElementById('resultImg');
        const loading = document.getElementById('loading');
        const error = document.getElementById('error');

        // Clicking the upload area opens the file picker
        uploadArea.addEventListener('click', () => fileInput.click());
        fileInput.addEventListener('change', handleFile);

        // Drag-and-drop upload support
        ['dragenter', 'dragover'].forEach(e => uploadArea.addEventListener(e, (ev) => {
            ev.preventDefault(); uploadArea.classList.add('drag-over');
        }));
        ['dragleave', 'drop'].forEach(e => uploadArea.addEventListener(e, (ev) => {
            ev.preventDefault(); uploadArea.classList.remove('drag-over');
        }));
        uploadArea.addEventListener('drop', (ev) => {
            const files = ev.dataTransfer.files;
            if (files.length) {
                fileInput.files = files;
                handleFile({target: {files}});
            }
        });

        function handleFile(e) {
            const file = e.target.files[0];
            if (!file) return;
            // Preview the original image
            const reader = new FileReader();
            reader.onload = (ev) => {
                originalImg.src = ev.target.result;
                originalImg.style.display = 'block';
                resultImg.style.display = 'none';
            };
            reader.readAsDataURL(file);
            processBtn.disabled = false;
            error.style.display = 'none';
        }

        async function processImage() {
            const file = fileInput.files[0];
            if (!file) return;

            loading.style.display = 'block';
            processBtn.disabled = true;
            error.style.display = 'none';

            try {
                const formData = new FormData();
                formData.append('file', file);
                const response = await fetch('/api/v1/style-transfer', {
                    method: 'POST',
                    body: formData
                });
                if (!response.ok) throw new Error(await response.text());
                // Show the result image
                const blob = await response.blob();
                const url = URL.createObjectURL(blob);
                resultImg.src = url;
                resultImg.style.display = 'block';
            } catch (err) {
                error.textContent = `Error: ${err.message}`;
                error.style.display = 'block';
            } finally {
                loading.style.display = 'none';
                processBtn.disabled = false;
            }
        }
    </script>
</body>
</html>

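Mount this static page in FastAPI by adding the following after app = FastAPI(...) and after all API routes are registered. Routes are matched in registration order, so the JSON route at / from section 2.1 should be removed or moved to another path; otherwise it answers requests to / before the static page does:

from fastapi.staticfiles import StaticFiles

# Mount the static file directory (visit http://localhost:8000/ to see the front-end page)
app.mount("/", StaticFiles(directory="static", html=True), name="static")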

6. Docker container deployment (the basis of a production environment)

6.1 Writing the Dockerfile

# Base image: python:3.9-slim for the CPU version
# If you need a GPU, consider switching to nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04 and installing the matching PyTorch build
FROM python:3.9-slim

WORKDIR /app

# Install system-level dependencies (needed only for CPU inference)
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc \
    g++ \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev \
    libgomp1 \
    && rm -rf /var/lib/apt/lists/*

# Copy the dependency file and install the Python packages
COPY requirements.txt .
RUN pip install --no-cache-dir --upgrade pip && pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

# Create the model directory (in practice, mount models as a volume or download them in CI/CD)
RUN mkdir -p models

EXPOSE 8000

# Start command (Gunicorn + UvicornWorker is recommended in production; a single worker is used here for demonstration)
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]

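6.2 Writing requirements.txt

--extra-index-url https://download.pytorch.org/whl/cpu
fastapi==0.104.1
uvicorn==0.24.0.post1
python-multipart==0.0.6
torch==2.1.0+cpu
torchvision==0.16.0+cpu
pillow==10.1.0

Note: the list above pins the CPU builds of PyTorch. The +cpu wheels are not published on PyPI, so the first line of requirements.txt points pip at the official PyTorch CPU index. For a one-off manual install, the equivalent command is:

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu

If production uses a GPU, remove the +cpu suffixes and the extra index line so that pip installs the default builds from PyPI (CUDA-enabled on Linux), and make sure the base image and host provide a compatible CUDA runtime and NVIDIA driver.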

Build and run the container:

# Build the image
docker build -t style-transfer-api .

# Run the container (mount the local models directory if you have one)
docker run -d -p 8000:8000 -v "$(pwd)/models:/app/models" style-transfer-api

Open http://localhost:8000 in a browser to see the complete visual interface.
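
The deployment layer also lists health monitoring. One lightweight option is a small probe that an orchestrator or a Docker HEALTHCHECK instruction can run. The sketch below uses only the standard library; the function name is_healthy is hypothetical, and it assumes the service exposes some URL that returns a 2xx status, such as the /docs page configured earlier.

```python
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True when the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status
        return False
```

Saved as a script (say healthcheck.py, a hypothetical name) that exits with status 0 when is_healthy("http://localhost:8000/docs") is true and 1 otherwise, it could back a HEALTHCHECK line in the Dockerfile.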


Web vision application development is an entry-level skill in AI engineering. It is advisable to master the FastAPI basics first, then gradually pick up the capabilities a production environment needs, such as model optimization (ONNX/TensorRT), object storage (OSS/S3), Redis caching, and rate limiting. That is how you build an AI service that is stable and commercially viable.
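
Of the production features just mentioned, rate limiting is the easiest to sketch in a few lines. The token bucket below is a single-process, in-memory illustration; the class name and parameters are assumptions, and a multi-worker deployment would typically keep the counters in a shared store such as Redis.

```python
import time
from typing import Optional


class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; return False when the caller should be throttled."""
        current = time.monotonic() if now is None else now
        elapsed = max(0.0, current - self.updated)
        # Refill according to the elapsed time, never beyond the bucket capacity
        self.tokens = min(float(self.capacity), self.tokens + elapsed * self.rate)
        self.updated = current
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Keeping one bucket per client (for example per API key or IP address) and responding with HTTP 429 when allow() returns False is a common way to apply it.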

Summary

By building a minimalist web service for Van Gogh "Starry Night" style transfer from scratch, this article walked through the whole process of web vision application development:

  1. Use FastAPI to build a type-safe, high-performance API with automatically generated documentation
  2. Encapsulate a general PyTorch vision model base class for thread-safe inference
  3. Implement a front end with drag-and-drop upload and real-time preview, served directly as FastAPI static files
  4. Write a Dockerfile to containerize the whole service and lay the foundation for production deployment

On this basis, you can continue to add:

  • 🔒 API authentication and authorization (JWT/API key)
  • 📊 Monitoring and alerting (Prometheus + Grafana)
  • 🗄️ Object storage (Alibaba Cloud OSS/Amazon S3)
  • Model acceleration (ONNX Runtime/TensorRT)

I hope this article helps you take the first step toward deploying AI as a service!