FastAPI production deployment with Nginx and Gunicorn: a complete guide



Overview of production deployment architecture

Why do we need Nginx + Gunicorn architecture?

Running FastAPI directly with uvicorn is simple, but it exposes several problems in a production environment:

  • Single-process bottleneck: by default, uvicorn runs only one worker process. If a request blocks that process (for example, a long synchronous database query), all other requests wait behind it.
  • No load balancing: a single instance cannot make full use of a multi-core CPU and cannot distribute traffic across multiple servers.
  • Inefficient static file serving: FastAPI is not designed to serve large volumes of static assets, and handing them over to Nginx improves performance considerably.
  • Security and SSL termination: production environments must enable HTTPS, and handling SSL directly in the application layer adds unnecessary complexity.

With the introduction of Nginx + Gunicorn, these pain points can be solved: Nginx, as a reverse proxy and SSL terminator, can efficiently handle static files, load balancing and security hardening; Gunicorn manages multiple Uvicorn worker processes, fully utilizes multi-core CPUs, and provides process monitoring and automatic restart mechanisms.

Core component responsibilities

Component | Responsibilities
Nginx | Reverse proxy, SSL termination, static file serving, load balancing, security header injection
Gunicorn | WSGI server, responsible for process management, worker scheduling, and graceful restart
Uvicorn | ASGI server, running as Gunicorn's worker class, providing asynchronous support

Typical production architecture

User request → DNS → CDN → Nginx (reverse proxy + SSL)

                      ├─ Gunicorn Worker 1 ─┐
                      ├─ Gunicorn Worker 2 ─┤
                      ├─ Gunicorn Worker 3 ─┤ → FastAPI application
                      ├─ ...                ┘

                      └─ Static assets /static/
                      
        Monitoring system ← Log collection

In this architecture, Nginx is the first line of defense and receives all external requests. It forwards dynamic requests to the back-end Gunicorn worker processes through the reverse proxy and serves static files itself, which improves overall performance.


Nginx reverse proxy configuration

Detailed explanation of core configuration

The Nginx configuration file below contains the options most commonly used in a production environment. You can save it as /etc/nginx/sites-available/daoman-api

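# 1. Define the upstream server group (the FastAPI application)
upstream daoman_api {
    # Each backend has weight 3; after 2 failed attempts within 30 seconds
    # it is marked unavailable for 30 seconds, then retried
    server 127.0.0.1:8000 weight=3 max_fails=2 fail_timeout=30s;
    # This entry requires a second Gunicorn instance listening on port 8001
    server 127.0.0.1:8001 weight=3 max_fails=2 fail_timeout=30s;
    # Keep idle connections to the backends open to reduce handshake overhead
    keepalive 32;
}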

# 2. Force HTTP → HTTPS redirect
server {
    listen 80;
    server_name your-domain.com;
    return 301 https://$server_name$request_uri;
}

# 3. HTTPS server block (main configuration)
server {
    listen 443 ssl http2;
    server_name your-domain.com;

    # SSL certificate paths
    ssl_certificate /etc/ssl/certs/your-domain.crt;
    ssl_certificate_key /etc/ssl/private/your-domain.key;
    ssl_protocols TLSv1.2 TLSv1.3;   # Disable older, insecure protocol versions

    # Security-related response headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

    # Access log and error log (the "main" log format must be defined in the
    # http block; if it is not, use the built-in "combined" format instead)
    access_log /var/log/nginx/daoman-access.log main;
    error_log /var/log/nginx/daoman-error.log warn;

    # Gzip compression to reduce the amount of data transferred
    gzip on;
    gzip_min_length 1024;
    gzip_types text/plain text/css application/json application/javascript;
    # Proxy dynamic requests to Gunicorn
    location / {
        proxy_pass http://daoman_api;
        proxy_http_version 1.1;
        # An empty Connection header lets the upstream keepalive pool reuse
        # connections; WebSocket upgrades are handled by the /ws/ location
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts, so that requests do not hang indefinitely
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }

    # Static files are served directly by Nginx
    location /static/ {
        alias /var/www/daoman/static/;
        expires 30d;      # Browser cache lifetime: 30 days
        access_log off;   # Do not log static file requests
    }

    # WebSocket support
    location /ws/ {
        proxy_pass http://daoman_api;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 86400;  # Keep WebSocket connections open for up to 24 hours
    }
}
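
The X-Real-IP and X-Forwarded-For headers set above only help if the application reads them carefully. The following is a minimal sketch of that logic, assuming Nginx runs on 127.0.0.1 and that header names arrive lower-cased; `client_ip` and `trusted_proxies` are hypothetical names used for illustration. Uvicorn also ships its own proxy-header handling, which is likely the better choice in a real deployment.

```python
from typing import FrozenSet, Mapping


def client_ip(
    headers: Mapping[str, str],
    peer_ip: str,
    trusted_proxies: FrozenSet[str] = frozenset({"127.0.0.1"}),
) -> str:
    """Return the original client IP for a request that arrived through a proxy."""
    # Forwarding headers are only trustworthy when the direct peer is our own
    # proxy; otherwise any client could spoof them.
    if peer_ip not in trusted_proxies:
        return peer_ip

    # X-Real-IP is set by the proxy to the address it saw on the socket.
    real_ip = headers.get("x-real-ip", "").strip()
    if real_ip:
        return real_ip

    # X-Forwarded-For is a comma-separated chain. Walk it from the right and
    # return the first hop that is not one of our own proxies.
    hops = [hop.strip() for hop in headers.get("x-forwarded-for", "").split(",")]
    for hop in reversed([hop for hop in hops if hop]):
        if hop not in trusted_proxies:
            return hop
    return peer_ip
```

Walking the chain from the right matters because only the entries appended by your own proxies are reliable; anything further left was supplied by the client.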

Enabling the configuration and going live

After completing the configuration, enable and reload Nginx with the following commands:

# Create a symlink in the sites-enabled directory
sudo ln -s /etc/nginx/sites-available/daoman-api /etc/nginx/sites-enabled/

# Test the configuration for syntax errors
sudo nginx -t

# Reload the configuration (graceful, no downtime)
sudo systemctl reload nginx

Tip: restart also applies the new configuration, but reload does not interrupt existing connections and is therefore better suited to production environments.


Gunicorn High Performance Configuration

Write Gunicorn configuration file

Create gunicorn.conf.py in the project root directory and adjust the parameters to suit your server.

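import multiprocessing

# Listen address and port (loopback only, because Nginx proxies from the same host;
# bind to 0.0.0.0 only when the proxy runs on a different machine)
bind = "127.0.0.1:8000"

# Recommended worker count: CPU cores × 2 + 1
workers = multiprocessing.cpu_count() * 2 + 1

# Use Uvicorn's worker class to handle asynchronous ASGI requests
worker_class = "uvicorn.workers.UvicornWorker"

# Maximum number of simultaneous connections per worker
# (Gunicorn documents this setting for its eventlet and gevent workers,
# so it may have no effect with UvicornWorker)
worker_connections = 1000

# Restart each worker after it has handled 1000 requests, to contain memory leaks
max_requests = 1000
max_requests_jitter = 100   # Random per-worker offset so that workers do not all restart at once

# Worker timeout in seconds: a worker silent for longer than this is killed and restarted
timeout = 300

# Load the application before forking; can speed up worker startup and save memory
preload_app = True

# Logging
accesslog = "/var/log/gunicorn/access.log"
errorlog = "/var/log/gunicorn/error.log"
loglevel = "info"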

Parameter Description:

  • workers: usually set to CPU cores × 2 + 1. If the application spends a lot of time waiting on I/O, the value can be raised moderately, but too many workers cause context-switching overhead that reduces efficiency.
  • preload_app = True: the application is loaded once in the master process, and the workers are then forked from it. Workers can share read-only data in memory, which can noticeably reduce memory usage.
  • max_requests + max_requests_jitter: recycle workers periodically so that memory fragmentation or slow leaks do not build up and degrade the service.
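
The worker formula and the jitter mechanism can be sketched in a few lines. This is an illustration only, not Gunicorn's source: `recommended_workers`, its `cap` parameter, and `restart_threshold` are hypothetical helpers, and the jitter model (a random offset between 0 and the jitter value added per worker) is an assumption about how the setting behaves.

```python
import multiprocessing
import random


def recommended_workers(cpu_cores: int, cap: int = 0) -> int:
    """Apply the CPU cores × 2 + 1 rule, optionally limited by a cap (0 = no cap)."""
    workers = cpu_cores * 2 + 1
    return min(workers, cap) if cap > 0 else workers


def restart_threshold(max_requests: int, jitter: int, rng: random.Random) -> int:
    """Number of requests after which one worker is recycled."""
    # Each worker draws its own offset, so restarts are spread out over time.
    return max_requests + rng.randint(0, jitter)


if __name__ == "__main__":
    print("workers:", recommended_workers(multiprocessing.cpu_count()))
    rng = random.Random()
    print("thresholds:", [restart_threshold(1000, 100, rng) for _ in range(4)])
```

A cap is worth considering on machines with many cores, where the plain formula can start more workers than the available memory comfortably supports.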

Startup script integrated with Systemd

In order to run Gunicorn reliably in a production environment, it is recommended to use systemd for management.

Startup script start.sh (optional, for manual testing):

#!/bin/bash
export ENV="production"
mkdir -p /var/log/gunicorn
exec gunicorn main:app --config gunicorn.conf.py

Systemd service file /etc/systemd/system/daoman-api.service:

[Unit]
Description=Daoman FastAPI Service
After=network.target

[Service]
User=www-data
Group=www-data
WorkingDirectory=/opt/daoman
Environment="PATH=/opt/daoman/venv/bin"
ExecStart=/opt/daoman/venv/bin/gunicorn main:app --config gunicorn.conf.py
Restart=always

[Install]
WantedBy=multi-user.target

Deploy and start the service:

sudo systemctl daemon-reload
sudo systemctl enable --now daoman-api

Gunicorn now runs automatically on system startup and will automatically restart after a crash.


SSL certificate configuration and HTTPS

Production environments must have HTTPS enabled. It is recommended to use the free certificate provided by Let's Encrypt and combine it with the Certbot tool to achieve automatic renewal.

Install Certbot and obtain the certificate

# Update the package index and install Certbot with its Nginx plugin
sudo apt update && sudo apt install certbot python3-certbot-nginx -y

# Request a certificate (Certbot modifies the Nginx configuration automatically)
sudo certbot --nginx -d your-domain.com --agree-tos --email your-email@example.com

After the command finishes, Certbot has verified ownership of the domain and written the certificate paths into the Nginx configuration file (the ssl_certificate and ssl_certificate_key directives shown above).

Automatic renewal

Let's Encrypt certificates are valid for 90 days and must be renewed periodically. Certbot usually installs a scheduled renewal task automatically; you can check it and add one manually if needed:

# Test the renewal process (does not affect the real certificate)
sudo certbot renew --dry-run

# Add a crontab entry (attempt renewal every day at 12:00 noon)
(crontab -l 2>/dev/null; echo "0 12 * * * /usr/bin/certbot renew --quiet") | crontab -

After the renewal is successful, Certbot will automatically reload Nginx to make the new certificate effective.
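
Automatic renewal is easier to trust when something also checks how many days a certificate has left. The following is a minimal sketch using only the standard library; `days_remaining` and `fetch_not_after` are hypothetical helper names, and the alert threshold you apply to the result is up to you.

```python
import socket
import ssl


def days_remaining(not_after: str, now: float) -> int:
    """Whole days between `now` (epoch seconds) and a certificate's notAfter value."""
    # cert_time_to_seconds parses strings such as "Jan  5 09:34:43 2018 GMT".
    expires_at = ssl.cert_time_to_seconds(not_after)
    return int((expires_at - now) // 86400)


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field of the certificate that a live server presents."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling `days_remaining(fetch_not_after("your-domain.com"), time.time())` from a scheduled job gives a number you can compare against a threshold, for example alerting when fewer than 14 days remain.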


Load balancing and high availability

Nginx load balancing strategy

If your application is deployed on multiple servers, simple load balancing only requires changing the upstream block.

upstream daoman_backend {
    server backend1.daoman.com:8000 weight=3;
    server backend2.daoman.com:8000 weight=3;
    server backend3.daoman.com:8000 weight=2;
    ip_hash;          # Session persistence based on the client IP
    keepalive 32;
}

  • weight: the higher the weight, the more requests the server receives.
  • ip_hash: requests from the same client IP are always forwarded to the same backend, which suits applications that need session persistence.
  • least_conn (fewest active connections) and random (random selection) are available as alternatives to ip_hash; choose according to your business needs.
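
To make the effect of weight and ip_hash concrete, here is a small simulation. It is a sketch of the ideas rather than Nginx's implementation: the smooth weighted round-robin below is one common way to distribute by weight, and the hash function is an arbitrary stand-in. Keying on the first three octets of an IPv4 address follows what the Nginx documentation describes for ip_hash.

```python
import hashlib
from typing import Dict, List


def weighted_round_robin(weights: Dict[str, int], requests: int) -> List[str]:
    """Pick a backend for each request so that picks are proportional to weight."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(requests):
        # Every backend gains its weight; the current leader is chosen and
        # pays back the total, which interleaves the picks smoothly.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


def ip_hash_backend(client_ip: str, backends: List[str]) -> str:
    """Map an IPv4 client to a fixed backend, keyed on its first three octets."""
    prefix = ".".join(client_ip.split(".")[:3])
    digest = hashlib.sha256(prefix.encode()).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]
```

With weights 3, 3 and 2, every run of eight requests sends three to each of the first two backends and two to the third, and two clients in the same /24 network always land on the same backend.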

Application health check

In order for Nginx to detect unavailable backend servers in time, a health check endpoint needs to be implemented in FastAPI.

from fastapi import FastAPI
from datetime import datetime

app = FastAPI()

@app.get("/health")
async def health_check():
    """
    Return the application status; Nginx can use this endpoint to check whether the backend is healthy.
    """
    return {
        "status": "healthy",
        "timestamp": datetime.now().isoformat()
    }

In Nginx, active checks are available through the health_check directive (which requires the commercial version or the nginx-healthcheck module). With community-edition Nginx, passive detection based on fail_timeout and max_fails is usually sufficient.
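
Passive detection with max_fails and fail_timeout can be pictured as a small state machine. The class below is a simplified model, assuming that max_fails failures inside one fail_timeout window take the server out of rotation for another fail_timeout; `PassiveHealth` is a hypothetical name, and real Nginx behaviour has more detail than this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PassiveHealth:
    """Track failed attempts for one backend, in the spirit of max_fails/fail_timeout."""

    max_fails: int = 2
    fail_timeout: float = 30.0
    _failures: List[float] = field(default_factory=list)
    _down_until: float = 0.0

    def record_failure(self, now: float) -> None:
        # Only failures inside the current window count toward max_fails.
        self._failures = [t for t in self._failures if now - t < self.fail_timeout]
        self._failures.append(now)
        if len(self._failures) >= self.max_fails:
            # Take the backend out of rotation for one fail_timeout period.
            self._down_until = now + self.fail_timeout
            self._failures.clear()

    def record_success(self) -> None:
        self._failures.clear()

    def is_available(self, now: float) -> bool:
        return now >= self._down_until
```

Passing the clock in as `now` keeps the model easy to test; two failures five seconds apart mark the backend as down, while two failures forty seconds apart do not.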


Security hardening measures

Nginx security configuration

Add the following security-related directives to the server block shown earlier:

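# Hide the Nginx version number to reduce information leakage
server_tokens off;

# Limit the client request body size to guard against oversized upload attacks
client_max_body_size 10M;

# Deny access to hidden files whose names start with a dot (such as .env)
location ~ /\. {
    deny all;
    access_log off;
}

# Deny access to common backup, log, and environment configuration files
location ~* \.(bak|log|env)$ {
    deny all;
    access_log off;
}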

Application layer security

FastAPI provides TrustedHostMiddleware to defend against HTTP Host header attacks.

from fastapi.middleware.trustedhost import TrustedHostMiddleware

app.add_middleware(
    TrustedHostMiddleware,
    allowed_hosts=["your-domain.com", "*.your-domain.com"]
)

This middleware checks whether the request's Host header is on the allowlist. If it is not, the middleware returns a 400 error, which prevents security problems caused by malicious requests with forged Host headers.
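
The allowlist check itself is simple to reason about. The function below is a simplified sketch of exact and wildcard matching, not the middleware's actual code; `host_allowed` is a hypothetical helper, and stripping the port with a plain split does not handle IPv6 literals.

```python
from typing import Sequence


def host_allowed(host_header: str, allowed_hosts: Sequence[str]) -> bool:
    """Return True if the Host header matches an exact or wildcard pattern."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    host = host_header.split(":")[0].lower()
    for pattern in allowed_hosts:
        if pattern == "*" or host == pattern:
            return True
        # "*.example.com" matches any subdomain, but not a lookalike such as
        # "evil-example.com", because the leading dot is part of the suffix.
        if pattern.startswith("*.") and host.endswith(pattern[1:]):
            return True
    return False
```

For the allowlist used above, `api.your-domain.com:443` passes and `evil-your-domain.com` is rejected.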


Performance optimization strategy

1. Gunicorn built-in optimization

  • preload_app = True: preload the application to reduce memory usage and speed up worker startup.
  • worker_connections and timeout: tune these to your workload so that long-hanging connections do not tie up resources.
  • max_requests: recycle workers periodically so that memory leaks in long-running workers cannot accumulate.

2. Nginx static file cache

For location /static/ we have already set expires 30d. For more complex caching requirements, you can enable Nginx proxy caching:

# Declare the cache zone in the http block
proxy_cache_path /tmp/nginx_cache levels=1:2 keys_zone=my_cache:10m max_size=1g inactive=60m;

server {
    ...
    location /api/cached/ {
        proxy_pass http://daoman_api;
        proxy_cache my_cache;
        proxy_cache_valid 200 5m;          # Cache only 200 responses, for 5 minutes
        proxy_cache_use_stale error timeout;  # Serve stale cache entries when the backend errors or times out
    }
}

Note: Caching is only suitable for interfaces that do not change frequently. Dynamic data must be used with caution, otherwise data inconsistency may result.

3. Application layer caching

In FastAPI, you can use aiocache to add in-memory caching for expensive operations.

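import asyncio

from aiocache import Cache, cached
from aiocache.serializers import JsonSerializer

# In-memory cache with a 300-second expiry; the decorator takes the cache
# class (not an instance) together with the serializer and TTL
@app.get("/api/data")
@cached(ttl=300, cache=Cache.MEMORY, serializer=JsonSerializer())
async def get_data():
    # Simulate a slow query without blocking the event loop
    await asyncio.sleep(0.5)
    return {"data": "cached data"}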

With this in place, the slow query runs at most once every 5 minutes for the same request, and later requests return the cached result directly, which reduces database load and improves response time. Because the cache lives in process memory, each Gunicorn worker keeps its own copy.
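
If it helps to see what a TTL cache does under the hood, the decorator below is a minimal sketch of the idea in plain asyncio. It is not how aiocache is implemented; `ttl_cached`, `slow_query`, and `call_log` are hypothetical names, and the sketch only supports hashable positional arguments.

```python
import asyncio
import functools
import time

call_log = []  # records every real execution of the slow function


def ttl_cached(ttl: float):
    """Cache an async function's results per argument tuple for `ttl` seconds."""
    def decorator(func):
        store = {}  # args -> (expires_at, value)

        @functools.wraps(func)
        async def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the slow call
            value = await func(*args)
            store[args] = (now + ttl, value)
            return value

        return wrapper
    return decorator


@ttl_cached(ttl=300)
async def slow_query(item_id: int) -> dict:
    call_log.append(item_id)
    await asyncio.sleep(0.01)  # stands in for a slow database query
    return {"id": item_id}
```

Awaiting `slow_query(1)` twice runs the body only once; a different argument, or a call after the TTL has passed, runs it again.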


Monitoring and Log Management

Structured log

In a production environment, it is recommended to output logs in JSON format to facilitate centralized collection and analysis.

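import json
import logging
from datetime import datetime, timezone

class JSONFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        log_entry = {
            # Use the record's own creation time as a timezone-aware UTC timestamp
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage()
        }
        return json.dumps(log_entry)

# Attach the formatter to the root logger so that all loggers inherit it
logger = logging.getLogger()
handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)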

Place the code above at the top of the FastAPI application's entry file. Every logging.info() call will then emit a single-line JSON log, which is easy to analyze with tools such as ELK or Loki.
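
A quick way to confirm that the output really is machine-readable is to log into a buffer and parse the result. The sketch below is a self-contained variant of the formatter that also records the logger name and any exception traceback; the logger name `daoman.demo` and the `build_logger` helper are assumptions made for this example.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JSONFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the traceback inside the JSON object instead of on extra lines.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry, ensure_ascii=False)


def build_logger(stream) -> logging.Logger:
    """Create an isolated logger that writes JSON lines to `stream`."""
    logger = logging.getLogger("daoman.demo")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JSONFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("order %s created", 42)
parsed = json.loads(buffer.getvalue())  # raises if the line is not valid JSON
```

Keeping the traceback inside the JSON object means a log collector still sees exactly one event per line, even when an exception is logged.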

Basic system monitoring script

Use psutil to write a simple resource monitoring script that raises an alert when CPU or memory usage exceeds a threshold.

import psutil
import time

def monitor():
    while True:
        cpu = psutil.cpu_percent(interval=1)
        mem = psutil.virtual_memory().percent
        disk = psutil.disk_usage('/').percent
        print(f"CPU: {cpu}%, Mem: {mem}%, Disk: {disk}%")
        if cpu > 80 or mem > 85:
            print("⚠️  Resource alert!")
        time.sleep(60)

if __name__ == "__main__":
    monitor()

For enterprise-level projects, it is recommended to integrate professional monitoring systems such as Prometheus + Grafana, which can more comprehensively display key indicators such as QPS, error rate, and latency.
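
Separating the threshold comparison from the measurement loop makes the alerting rule easy to test and extend. This is a minimal sketch: `find_alerts` and `DEFAULT_LIMITS` are hypothetical names, the CPU and memory limits mirror the script above, and the 90% disk limit is an assumption added for illustration.

```python
from typing import Dict, List

# Percent thresholds; "disk" is an assumed value, not taken from the script above.
DEFAULT_LIMITS: Dict[str, float] = {"cpu": 80.0, "mem": 85.0, "disk": 90.0}


def find_alerts(metrics: Dict[str, float], limits: Dict[str, float] = DEFAULT_LIMITS) -> List[str]:
    """Return one message for every metric that is above its limit."""
    return [
        f"{name} at {value:.1f}% exceeds {limits[name]:.0f}%"
        for name, value in metrics.items()
        if name in limits and value > limits[name]
    ]
```

Inside the monitoring loop, the readings from psutil can be passed in as `{"cpu": cpu, "mem": mem, "disk": disk}` and each returned message forwarded to whatever alert channel you use.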


Troubleshooting and Debugging

Common command line troubleshooting tools

# Check whether the services are running
sudo systemctl status nginx daoman-api

# Follow the Nginx and Gunicorn error logs in real time
sudo tail -f /var/log/nginx/daoman-error.log
sudo tail -f /var/log/gunicorn/error.log

# Check port usage (confirm that Gunicorn is listening on 8000)
sudo netstat -tlnp | grep :8000

# Use curl to test whether the API is reachable
curl https://your-domain.com/health

One-click troubleshooting script

Combine the common checks into a single script, troubleshoot.sh, so that problems can be located quickly:

#!/bin/bash
echo "🔍 Service status:"
sudo systemctl status nginx daoman-api --no-pager

echo -e "\n🔍 Last 20 Nginx error log entries:"
sudo tail -n 20 /var/log/nginx/daoman-error.log

echo -e "\n🔍 Last 20 Gunicorn error log entries:"
sudo tail -n 20 /var/log/gunicorn/error.log

echo -e "\n🔍 API connectivity test:"
# -f makes curl return a non-zero exit code on HTTP error responses
if curl -fs https://your-domain.com/health; then
    echo "Connection succeeded"
else
    echo "Connection failed"
fi

Run bash troubleshoot.sh to get a brief troubleshooting report.


Automated deployment script

Encapsulating code pulling, dependency installation, service restart and other operations into a deployment script can significantly reduce manual operation errors.

#!/bin/bash
set -e
APP_DIR="/opt/daoman"
BACKUP_DIR="/opt/backups"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)

echo "🚀 Starting deployment..."

# 1. Back up the current version
sudo mkdir -p "$BACKUP_DIR"
sudo tar -czf "$BACKUP_DIR/daoman_$TIMESTAMP.tar.gz" "$APP_DIR/" || true

# 2. Pull the latest code (the main branch is used as an example)
cd "$APP_DIR"
sudo git fetch origin && sudo git reset --hard origin/main

# 3. Update Python dependencies
sudo "$APP_DIR/venv/bin/pip" install -r requirements.txt

# 4. Restart the application service
sudo systemctl restart daoman-api
sudo systemctl reload nginx

# 5. Verify the deployment (-f makes curl fail on HTTP error responses)
sleep 5
if curl -fs https://your-domain.com/health > /dev/null; then
    echo "✅ Deployment succeeded!"
else
    echo "❌ Deployment failed, attempting rollback..."
    sudo tar -xzf "$BACKUP_DIR/daoman_$TIMESTAMP.tar.gz" -C /
    sudo systemctl restart daoman-api
    exit 1   # Report the failure to the caller (for example a CI/CD job)
fi

Usage Suggestions:

  • Make sure the deploying user has the appropriate permissions (it must be able to run the necessary commands with sudo).
  • In the production environment, it is recommended to trigger this script in conjunction with a CI/CD platform (such as GitHub Actions, Jenkins) to achieve fully automated deployment.
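
A fixed sleep followed by a single request can report a failure just because the service needed a few more seconds to start. The sketch below shows a retrying health check that a deployment pipeline could call instead; `http_ok` and `wait_until_healthy` are hypothetical helpers, and the attempt count and delay are assumptions to tune for your service.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET request to `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection errors, timeouts, and HTTP error statuses.
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 5,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A pipeline step could call `wait_until_healthy(lambda: http_ok("https://your-domain.com/health"))` and trigger the rollback only when it returns False.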

Summary

This article comprehensively explains the process of building a FastAPI production environment from architectural design to actual deployment. The key points are summarized below:

  1. Architecture Selection: Nginx is responsible for exposing services, SSL termination and static files. Gunicorn + Uvicorn provides high-performance multi-process asynchronous processing capabilities.
  2. Gunicorn worker processes: set the worker count with the CPU cores × 2 + 1 formula, and enable preload_app and max_requests to keep the service stable.
  3. HTTPS Certificate: Easily automate renewals with Let's Encrypt free certificates and the Certbot tool.
  4. Security hardening: Add security response headers, limit request body size, disable access to sensitive files, and use TrustedHostMiddleware to defend against Host header attacks.
  5. Log and Monitoring: Output structured JSON logs, write simple system resource monitoring scripts, and detect hidden dangers in advance.
  6. Automated deployment: Integrate code updates, dependency installation, service restarts and health checks through bash scripts to improve deployment efficiency and reliability.

This solution can already cover the production needs of most small and medium-sized web applications. You can flexibly adjust various parameters according to business scale and further integrate more advanced orchestration tools such as Docker and Kubernetes.

