Django Deployment Best Practices: Production Deployment and Operations

📂 Stage: Part 3 - Advanced Topics 🎯 Difficulty Level: Advanced ⏰ Estimated study time: 6-8 hours 🎒 Prerequisite knowledge: Performance Optimization, Security Best Practices



Core deployment architecture and principles

Minimalist production architecture

Newcomers to production deployment are easily intimidated by heavyweight components such as load balancers, autoscaling, and service meshes. In practice, a stable, well-tuned single-machine architecture can comfortably serve thousands or even tens of thousands of requests per day, and it is far cheaper to understand and maintain.

Let's start with the classic single-machine, Docker-based architecture. It is cleanly layered, which makes it easier to scale horizontally later:

Internet

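Nginx (reverse proxy, SSL termination, static file serving)

Gunicorn Workers (WSGI application server)

PostgreSQL (primary database)

Redis (cache, broker for the asynchronous task queue)

Celery (optional: asynchronous task processing)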
  • Nginx: sits at the front, receives all client requests, handles HTTPS encryption, serves static files, and forwards dynamic requests to Gunicorn.
  • Gunicorn Workers: run the Django application code; multiple processes handle web requests in parallel.
  • PostgreSQL: persistent data storage; in production it must have a strong password and regular backups.
  • Redis: a lightweight but powerful in-memory data store, used for caching, session storage, and as a message queue.
  • Celery (optional): handles time-consuming asynchronous tasks, such as sending emails and generating reports, without blocking user requests.

The biggest advantage of this architecture is that it is simple to maintain and runs entirely under Docker Compose. If you later move to Docker Swarm or Kubernetes, the layering carries over; the main work is rewriting the service definitions for the new platform.

5 iron rules

In a production environment, some bottom lines must never be crossed. Keep these 5 rules in mind and you will avoid most of the serious pitfalls:

:::tip Must-remember principles
  1. Security first: DEBUG must be set to False, both the database and Redis must have strong passwords, and all traffic must be forced over HTTPS.
  2. High-availability baseline: run Gunicorn with multiple workers and make sure a process that exits unexpectedly is restarted automatically, so the service stays online.
  3. Room to scale: even on a single machine, design the architecture and configuration so that horizontal scaling remains possible (stateless application processes, shared storage, a shared database).
  4. Basic monitoring must be in place: centralized logs, process liveness checks, and basic resource monitoring (CPU, memory, disk) are indispensable.
  5. Semi-automation before full automation: start with shell scripts or single commands that let you deploy and back up manually in one step, then introduce a CI/CD pipeline gradually. Do not aim for full automation on day one.
:::
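As a minimal sketch of rule 1, the relevant Django settings might look like the fragment below. The setting names are standard Django settings; the values (for example the one-year HSTS duration) are illustrative assumptions, and the fragment assumes the reverse proxy sets the X-Forwarded-Proto header.

```python
# settings/production.py (fragment) - a sketch; the values are assumptions
DEBUG = False

# Redirect plain-HTTP requests to HTTPS at the Django level as well
SECURE_SSL_REDIRECT = True
# Trust the X-Forwarded-Proto header set by the reverse proxy; without this,
# Django behind a TLS-terminating proxy treats every request as plain HTTP
SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")

# Only send session and CSRF cookies over HTTPS
SESSION_COOKIE_SECURE = True
CSRF_COOKIE_SECURE = True

# Ask browsers to use HTTPS only (assumed value: one year, in seconds)
SECURE_HSTS_SECONDS = 60 * 60 * 24 * 365
```

Running python manage.py check --deploy reports common omissions in settings like these.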

Lightweight environment preparation

We assume that you already have a clean Ubuntu 20.04/22.04 server with no Nginx, Python or other web services installed. The following steps can help you quickly build a safe and clean application running environment.

Server initialization script

Copy the following script, save it as init_server.sh, and run it as root (or as a user with sudo privileges). It installs only the packages this guide relies on, with no extra tools:

:::code-group

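#!/bin/bash
set -e  # Exit immediately on any error

# Update the package index and upgrade all installed packages
sudo apt update && sudo apt upgrade -y

# Install the core production dependencies
# (nginx and supervisor are for running without Docker; if you use the containerized
# Nginx from the Docker Compose stack, stop the host nginx so ports 80/443 stay free)
sudo apt install -y \
    python3-pip python3-venv python3-dev \
    build-essential libpq-dev libssl-dev \
    nginx supervisor git curl

# Create a dedicated non-root application user (a security must)
sudo useradd --system --home /app --shell /bin/bash app
sudo mkdir -p /app /var/log/myapp /var/run/myapp
sudo chown -R app:app /app /var/log/myapp /var/run/myapp

# Open only the required ports in the firewall
sudo ufw allow ssh
sudo ufw allow 'Nginx Full'   # covers 80 and 443
sudo ufw --force enable

echo "✅ Server initialization complete! Switch to the app user to continue: sudo -iu app"

:::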

A few key points explained:

  • python3-venv: used to create a virtual environment.
  • libpq-dev: headers required to compile psycopg2, the PostgreSQL driver.
  • supervisor: manages the Gunicorn process if you do not want to rely on Docker's restart policy.
  • Firewall: only SSH and web traffic are allowed in; ufw's default policy denies all other incoming ports.
  • Non-root user: for security, application code should never run as root.

Application environment variable template

Django production configuration is usually injected through environment variables so that sensitive values are never hardcoded. Create a .env file in your project root (and remember not to commit it to the Git repository):

:::code-group

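# Basic security settings
DEBUG=False
SECRET_KEY=replace-with-a-long-random-string
ALLOWED_HOSTS=yourdomain.com,www.yourdomain.com

# Database connection
DB_NAME=myapp
DB_USER=myappuser
DB_PASSWORD=replace-with-a-strong-database-password
DB_HOST=db          # service name in Docker Compose
DB_PORT=5432
DB_CONN_MAX_AGE=60  # seconds to keep database connections open, avoiding frequent reconnects

# Redis connection (the password inside REDIS_URL must match REDIS_PASSWORD)
REDIS_PASSWORD=replace-with-a-strong-redis-password
REDIS_URL=redis://:replace-with-a-strong-redis-password@redis:6379/0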

:::
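One way to consume these variables in the settings module is sketched below using only the standard library (python-decouple offers a comparable config() helper). The function names env_bool, env_list, and database_settings are hypothetical, chosen for this example, and the fallback defaults are assumptions.

```python
import os


def env_bool(name, default=False, environ=os.environ):
    """Interpret an environment variable such as DEBUG=False as a boolean."""
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def env_list(name, environ=os.environ):
    """Split a comma-separated variable such as ALLOWED_HOSTS into a list."""
    return [item.strip() for item in environ.get(name, "").split(",") if item.strip()]


def database_settings(environ=os.environ):
    """Build Django's DATABASES['default'] entry from the DB_* variables."""
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": environ["DB_NAME"],          # KeyError if missing: fail fast at startup
        "USER": environ["DB_USER"],
        "PASSWORD": environ["DB_PASSWORD"],
        "HOST": environ.get("DB_HOST", "db"),
        "PORT": environ.get("DB_PORT", "5432"),
        "CONN_MAX_AGE": int(environ.get("DB_CONN_MAX_AGE", "60")),
    }
```

In the settings file this would be used as DEBUG = env_bool("DEBUG"), ALLOWED_HOSTS = env_list("ALLOWED_HOSTS"), and DATABASES = {"default": database_settings()}.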

Security Warning

In production, SECRET_KEY must be sufficiently random (at least 50 characters, mixing uppercase and lowercase letters, digits, and symbols) and must not be shared with the development environment. You can generate a strong value with openssl rand -base64 48.
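If openssl is not at hand, Python's standard secrets module can produce a comparable value. This is a small sketch; generate_secret_key is a hypothetical helper name.

```python
import secrets


def generate_secret_key(nbytes=48):
    """Return a URL-safe random string; 48 random bytes encode to 64 characters."""
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    print(generate_secret_key())
```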


Docker Compose full stack deployment

If you want to get your entire application stack running locally or on a server within minutes, Docker Compose is a good choice. Below we build the containerized deployment of the Django application step by step.

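Prepare the application Dockerfile

This Dockerfile follows common best practices: a slim base image, dependencies installed in a cached layer before the code is copied, and a non-root user for running the application. Together these keep the image small, speed up rebuilds, and improve security: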

:::code-group

FROM python:3.11-slim

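# Avoid writing .pyc files and keep log output unbuffered
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Production settings module used by Django
ENV DJANGO_SETTINGS_MODULE=myproject.settings.production

# Install system dependencies in a single layer and clear the apt cache to keep the layer small
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc libpq-dev libjpeg-dev zlib1g-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy the dependency list first so Docker can cache this layer across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir --upgrade pip setuptools wheel && \
    pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

# Create a non-root user and hand over ownership of the files
RUN useradd --create-home --shell /bin/bash app && \
    chown -R app:app /app
USER app

# Collect static files into STATIC_ROOT
# (runs at build time, so the settings module must be importable without runtime secrets)
RUN python manage.py collectstatic --noinput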

EXPOSE 8000
CMD ["gunicorn", "--config", "gunicorn.conf.py", "myproject.wsgi:application"]

:::

Production environment dependency list

Save the following as requirements.txt. Versions are pinned (Django is constrained to the 4.2 series) to avoid surprises caused by dependency upgrades:

:::code-group

django>=4.2,<5.0
gunicorn==21.2.0
psycopg2-binary==2.9.7
redis==5.0.1
celery==5.3.4
django-redis==5.4.0
Pillow==10.0.1
python-decouple==3.8
whitenoise==6.6.0

:::

Complete Docker Compose configuration

The docker-compose.yml below defines four services: the PostgreSQL database, the Redis cache, the web application, and the Nginx reverse proxy. They share data through named volumes and communicate over a dedicated bridge network:

:::code-group

version: '3.8'

services:
  db:
    image: postgres:15-alpine
    volumes:
      - postgres_data:/var/lib/postgresql/data/
      - ./postgres/init:/docker-entrypoint-initdb.d/ # optional: directory of initialization SQL scripts
    environment:
      - POSTGRES_DB=${DB_NAME}
      - POSTGRES_USER=${DB_USER}
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d ${DB_NAME}"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
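    # Enable AOF persistence and require a password; Compose aborts with this message if REDIS_PASSWORD is unset
    command: redis-server --appendonly yes --requirepass ${REDIS_PASSWORD:?set REDIS_PASSWORD in .env}
    environment:
      - REDISCLI_AUTH=${REDIS_PASSWORD}   # lets redis-cli in the health check authenticate
    volumes:
      - redis_data:/data
    networks:
      - app-network
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "redis-cli ping | grep -q PONG"]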
      interval: 10s
      timeout: 5s
      retries: 5

  web:
    build: .
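    ports:
      - "127.0.0.1:8000:8000"     # published on the host loopback only; Nginx reaches the app over app-network
    env_file:
      - .env                      # supplies SECRET_KEY, DB_* and the other variables to Django
    environment:
      - DB_HOST=db
      - REDIS_URL=${REDIS_URL}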
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_healthy
    volumes:
      - static_volume:/app/static
      - media_volume:/app/media
    networks:
      - app-network
    restart: unless-stopped

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
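      - ./nginx/nginx.conf:/etc/nginx/nginx.conf               # main configuration
      - ./nginx/conf.d:/etc/nginx/conf.d                       # virtual host configuration directory
      - static_volume:/app/static:ro                           # static files, mounted read-only
      - media_volume:/app/media:ro                             # user uploads, mounted read-only
      - ./ssl:/etc/ssl:ro                                      # SSL certificate directory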
    depends_on:
      - web
    networks:
      - app-network
    restart: unless-stopped

volumes:
  postgres_data:
  redis_data:
  static_volume:
  media_volume:

networks:
  app-network:
    driver: bridge

:::

**Why is it designed like this?**

  • Web container port bound to 127.0.0.1: port 8000 is published only on the host's loopback interface, so it cannot be reached through the server's public IP. The Nginx container does not use this mapping; it reaches Gunicorn over the internal app-network as web:8000.
  • Health checks and dependency conditions: the web service starts only after the database and Redis report healthy, which avoids errors caused by startup order.
  • Volumes: static and media files are shared between web and nginx through named volumes, which persist across container restarts and are mounted read-only in Nginx.
  • Restart policy unless-stopped: a crashed container is restarted automatically unless it was explicitly stopped, which covers basic availability on a single host.
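To illustrate what a readiness check amounts to, the sketch below polls a TCP port until it accepts connections, which could be used from a startup script when Compose health checks are not available. The function name wait_for_port and its defaults are assumptions for this example; it tests reachability only, not that the service has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("db", 5432) would block for up to 30 seconds until PostgreSQL is listening.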

Basic but efficient Nginx configuration

Nginx is the "facade" of the entire request path. A good configuration noticeably improves response times and also blocks many malicious scans.

Main configuration file nginx.conf

user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
    use epoll;            # high-performance event model on Linux
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    
    # Custom log format that includes the request processing time
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$request_time"';
    
    access_log /var/log/nginx/access.log main;
    
    # Basic performance tuning
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    client_max_body_size 16M;   # maximum allowed request body (upload) size
    
    # Enable gzip to reduce the amount of data transferred
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css text/javascript application/json application/javascript image/svg+xml;
    
    include /etc/nginx/conf.d/*.conf;   # load the site configurations
}

Site virtual host configuration

We handle the HTTP-to-HTTPS redirect and the actual HTTPS site in separate server blocks. Remember to replace yourdomain.com with your actual domain name.

# nginx/conf.d/myapp.conf
upstream django {
    server web:8000;   # points to the web service in docker-compose
}

# Force HTTP to HTTPS
server {
    listen 80;
    listen [::]:80;
    server_name yourdomain.com www.yourdomain.com;
    return 301 https://$host$request_uri;   # $host keeps the hostname the client requested
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name yourdomain.com www.yourdomain.com;
    
    # SSL certificate paths (consider using Certbot to obtain certificates from Let's Encrypt automatically)
    ssl_certificate /etc/ssl/certs/fullchain.pem;
    ssl_certificate_key /etc/ssl/private/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;     # allow only secure TLS protocol versions
    ssl_session_cache shared:SSL:10m;
    
    # Important security response headers
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    
    # Static files (cached for 1 year; access logging is turned off to reduce I/O)
    location /static/ {
        alias /app/static/;
        expires 1y;
        add_header Cache-Control "public, immutable";
        access_log off;
    }
    
    # Media files (uploaded script files are never served)
    location /media/ {
        alias /app/media/;
        expires 30d;
        add_header Cache-Control "public";
        location ~* \.(php|py|sh)$ { deny all; return 404; }
    }
    
    # Proxy to Gunicorn
    location / {
        proxy_pass http://django;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 60s;
        proxy_buffering on;   # buffer upstream responses so slow clients do not tie up Gunicorn workers
    }
    
    # Simple health-check endpoint for monitoring systems
    # (answered by Nginx itself, so it does not prove that Django is up)
    location /health/ {
        access_log off;
        default_type text/plain;
        return 200 "healthy\n";
    }
}

This configuration is a solid starting point for production, but remember to obtain an SSL certificate and place it under the ./ssl directory so that it appears at the paths configured above (./ssl/certs/fullchain.pem and ./ssl/private/privkey.pem); otherwise Nginx will fail to start because it cannot find the certificate.
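The /health/ location is answered by Nginx itself, so it confirms only that Nginx is running. A deeper check would have to run inside Django. The sketch below shows the core logic in plain Python; run_health_checks is a hypothetical helper, and exposing it at a URL (for example by returning its result through Django's JsonResponse) is left as an assumption.

```python
def run_health_checks(checks):
    """Run named check callables and return (http_status, report).

    Each check should raise an exception on failure and return normally on success.
    """
    report = {}
    for name, check in checks.items():
        try:
            check()
            report[name] = "ok"
        except Exception as exc:  # report any failure instead of crashing the endpoint
            report[name] = f"error: {exc}"
    # 503 tells load balancers and monitors that the service is not ready
    status = 200 if all(value == "ok" for value in report.values()) else 503
    return status, report
```

In a Django project, a database check could be as simple as passing django.db.connection.ensure_connection as one of the callables.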


Gunicorn configuration and optimization

Gunicorn is one of the most widely used Python WSGI servers, and a configuration file makes tuning straightforward.

# gunicorn.conf.py
import multiprocessing

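# Core settings
bind = "0.0.0.0:8000"
workers = multiprocessing.cpu_count() * 2 + 1   # classic formula: CPU cores × 2 + 1
worker_class = "sync"                            # sync workers suit typical apps; consider gevent for I/O-heavy workloads
worker_connections = 1000                        # only used by async workers such as gevent; ignored by sync
timeout = 30                                     # seconds before an unresponsive worker is killed and restarted
max_requests = 1000                              # restart a worker after 1000 requests to limit memory growth
max_requests_jitter = 100                        # add a random 0-100 to each worker's limit so workers do not restart together

# Performance tuning
preload_app = True                               # load the app before forking so workers can share memory
tmp_upload_dir = "/dev/shm"                      # keep temporary request data on an in-memory filesystem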

Some suggestions for fine-tuning:

  • Number of workers: CPU cores × 2 + 1 suits most scenarios. If your views are mostly CPU-intensive (rare), drop to cores + 1 to avoid excessive context switching.
  • sync vs gevent: for most standard Django applications, the sync model is simple and reliable. Consider gevent only when requests spend most of their time waiting on I/O, such as many third-party API calls or long-lived connections; it requires extra dependencies and attention to code compatibility.
  • max_requests: restarting workers periodically limits memory growth in long-running processes (even without an obvious leak, fragmentation accumulates).
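The two worker-count rules of thumb above can be written as a small function. This is a sketch; recommended_workers and its cpu_bound flag are hypothetical names for this example.

```python
import multiprocessing


def recommended_workers(cpu_cores=None, cpu_bound=False):
    """Apply the rules of thumb: cores * 2 + 1, or cores + 1 for CPU-bound views."""
    if cpu_cores is None:
        cpu_cores = multiprocessing.cpu_count()
    return cpu_cores + 1 if cpu_bound else cpu_cores * 2 + 1
```

On a 4-core machine this gives 9 workers for a typical application and 5 for a CPU-bound one. Inside a container, multiprocessing.cpu_count() reports the host's cores rather than any CPU limit placed on the container, so passing cpu_cores explicitly may be safer there.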

PostgreSQL Production Security and Backup

The database is the heart of the business: in production it must be stable and must not lose data. Besides the strong password set in Docker Compose earlier, we also need automated backups.

Database backup management commands

Below is a backup script implemented as a Django management command. It can be run manually or on a schedule, for example as a Celery periodic task.

# myapp/management/commands/backup_db.py
import datetime
import os
import subprocess

from django.conf import settings
from django.core.management.base import BaseCommand, CommandError


class Command(BaseCommand):
    help = "Back up the PostgreSQL database and remove old backups"

    def handle(self, *args, **options):
        db = settings.DATABASES['default']
        timestamp = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
        backup_dir = '/backups'
        os.makedirs(backup_dir, exist_ok=True)
        backup_file = f"{backup_dir}/myapp_{timestamp}.sql.gz"

        # Build the pg_dump command (the pg_dump client must be installed where this runs)
        cmd = [
            'pg_dump',
            '-h', db['HOST'],
            '-p', str(db.get('PORT') or 5432),
            '-U', db['USER'],
            '-d', db['NAME'],
        ]
        # Pass the database password through the environment
        env = os.environ.copy()
        env['PGPASSWORD'] = db['PASSWORD']

        # Pipe pg_dump straight into gzip to reduce disk I/O
        with open(backup_file, 'wb') as f:
            proc1 = subprocess.Popen(cmd, env=env, stdout=subprocess.PIPE)
            proc2 = subprocess.Popen(['gzip'], stdin=proc1.stdout, stdout=f)
            proc1.stdout.close()  # let pg_dump receive SIGPIPE if gzip exits early
            proc2.wait()
            proc1.wait()

        # A failed dump still leaves a nearly empty file behind, so check both exit codes
        if proc1.returncode != 0 or proc2.returncode != 0:
            os.remove(backup_file)
            raise CommandError(
                f"Backup failed (pg_dump={proc1.returncode}, gzip={proc2.returncode})"
            )

        # Delete backups older than 7 days (only files created by this command)
        cutoff = datetime.datetime.now() - datetime.timedelta(days=7)
        for filename in os.listdir(backup_dir):
            if not (filename.startswith('myapp_') and filename.endswith('.sql.gz')):
                continue
            file_path = os.path.join(backup_dir, filename)
            if os.path.getmtime(file_path) < cutoff.timestamp():
                os.remove(file_path)

        self.stdout.write(self.style.SUCCESS(f"Backup succeeded: {backup_file}"))

You can run it manually with python manage.py backup_db, or schedule it later with Celery Beat (for example, at 3 a.m. every day). Mount the backup directory on the host or sync it to cloud storage.

:::tip Production tips

  • The backup directory (/backups) should be mapped to a host path through a Docker Compose volume, or synced regularly to remote object storage (such as AWS S3).
  • Run recovery drills regularly to confirm that the backup files are actually usable.
:::
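A full recovery drill means restoring into a scratch database. As a cheaper first-line check between drills, the sketch below verifies that a backup file decompresses and starts with the header comment that plain-format pg_dump writes. backup_looks_valid is a hypothetical helper, and passing this check does not prove that the dump restores cleanly.

```python
import gzip


def backup_looks_valid(path, marker=b"PostgreSQL database dump"):
    """Return True if the gzip file decompresses and its first bytes contain the marker."""
    try:
        with gzip.open(path, "rb") as f:
            head = f.read(4096)
    except (OSError, EOFError):
        # Missing file, not a gzip file, or a truncated archive
        return False
    return marker in head
```

An empty or corrupted archive returns False, which makes the function usable as a quick alert condition right after each backup run.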

Summary of this chapter

In this chapter we walked through the complete single-machine production deployment of a Django application, covering the following steps:

  1. Understanding the minimalist production architecture: Nginx → Gunicorn → PostgreSQL/Redis, plus the 5 iron rules.
  2. Lightweight environment preparation: server initialization, a non-root user, firewall configuration, and secure environment variables.
  3. Docker Compose full-stack deployment: a single docker-compose.yml wires together the web app, database, cache, and reverse proxy, complete with health checks and automatic restarts.
  4. Nginx and Gunicorn tuning: HTTPS, security headers, static file caching, and choosing the number of workers.
  5. PostgreSQL automatic backup: compressed backups and cleanup of old files, implemented as a Django management command.

:::warning Final checklist before going live

  • Apply for and configure a free, auto-renewing Let's Encrypt SSL certificate.
  • Make sure DEBUG is off and review every entry in ALLOWED_HOSTS.
  • Change the default administrator password and set a strong password for the database.
  • Use docker compose logs and the /health/ endpoint to confirm that all services are running properly.
  • If you need finer-grained process supervision (without relying on Docker restarts), additionally configure Supervisor.
  • Set up basic monitoring (such as Prometheus + Grafana, or the simpler Uptime Kuma) to keep an eye on CPU, memory, disk, and request error rates.
:::

With this in place, your Django application is ready to run stably in production. CI/CD pipelines, horizontal scaling, and service splitting can be added gradually on this foundation; the core skeleton is solid enough to support an early-stage business.