FastAPI + Nginx + Gunicorn Production Deployment: Complete Guide
Overview of production deployment architecture
Why do we need Nginx + Gunicorn architecture?
Running FastAPI directly with `uvicorn` is simple, but it exposes several problems in a production environment:
- Single Process Bottleneck: By default, `uvicorn` uses only one worker process. Once a request blocks that process (for example, a long synchronous database query), other requests are stuck behind it.
- Lack of Load Balancing: A single instance cannot fully utilize multi-core CPUs and cannot distribute traffic among multiple servers.
- Inefficient Static File Serving: FastAPI itself is not well suited to serving large numbers of static resources; handing them over to Nginx can greatly improve performance.
- Security and SSL Termination: Production environments must have HTTPS enabled, and handling SSL directly at the application layer adds unnecessary complexity.
With the introduction of Nginx + Gunicorn, these pain points can be solved: Nginx, as a reverse proxy and SSL terminator, can efficiently handle static files, load balancing and security hardening; Gunicorn manages multiple Uvicorn worker processes, fully utilizes multi-core CPUs, and provides process monitoring and automatic restart mechanisms.
Core component responsibilities
Typical production architecture
In this architecture, Nginx serves as the first line of defense and receives all external requests. It forwards dynamic requests to the back-end Gunicorn worker processes through a reverse proxy and serves static files itself, which greatly improves overall performance.
Nginx reverse proxy configuration
Detailed explanation of core configuration
The Nginx configuration file below already contains the most commonly used options in a production environment. You can save it as `/etc/nginx/sites-available/daoman-api`.
Configuration and online steps
After completing the configuration, enable and reload Nginx with the following commands:
Tip: `restart` also applies the new configuration immediately, but `reload` does not disrupt existing connections and is therefore better suited to production environments.
Gunicorn High Performance Configuration
Write Gunicorn configuration file
Create `gunicorn.conf.py` in the project root directory and adjust the parameters to suit your server.
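As a rough illustration of such a file, the sketch below uses only standard Gunicorn settings. The bind address, timeouts, and recycle counts are assumed values rather than recommendations from this guide, and the worker class assumes that Uvicorn's Gunicorn worker is installed.

```python
# gunicorn.conf.py -- illustrative values only; tune them for your own server.
import multiprocessing

# Listen on localhost only; Nginx is expected to proxy to this address.
# The port is an assumption and must match the Nginx upstream.
bind = "127.0.0.1:8000"

# Rule of thumb: CPU cores x 2 + 1.
workers = multiprocessing.cpu_count() * 2 + 1

# Run each worker as a Uvicorn (ASGI) worker so FastAPI can be served.
worker_class = "uvicorn.workers.UvicornWorker"

# Load the application in the master process before forking workers.
preload_app = True

# Recycle a worker after roughly this many requests; the jitter spreads
# restarts so that all workers do not recycle at the same moment.
max_requests = 1000
max_requests_jitter = 100

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30
graceful_timeout = 30
keepalive = 5

# "-" sends the access log to stdout and the error log to stderr.
accesslog = "-"
errorlog = "-"
loglevel = "info"
```

With this file in place, a command such as `gunicorn -c gunicorn.conf.py main:app` would start the service, where `main:app` is a hypothetical module and application name.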
Parameter Description:
- `workers`: Generally set to CPU cores × 2 + 1. If the application spends a lot of time waiting on I/O, the value can be increased moderately, but not too far, because context switching then reduces efficiency.
- `preload_app = True`: Loads the application in the main process before forking the worker processes, so workers can share read-only data in memory, which can greatly reduce memory usage.
- `max_requests` + `max_requests_jitter`: Recycle workers periodically so that memory fragmentation or slowly accumulating leaks do not degrade the service.
Startup script integrated with Systemd
In order to run Gunicorn reliably in a production environment, it is recommended to use systemd for management.
Startup script `start.sh` (optional, for manual testing):
Systemd service file `/etc/systemd/system/daoman-api.service`:
Deploy and start the service:
Gunicorn now runs automatically on system startup and will automatically restart after a crash.
SSL certificate configuration and HTTPS
Production environments must have HTTPS enabled. It is recommended to use the free certificate provided by Let's Encrypt and combine it with the Certbot tool to achieve automatic renewal.
Install Certbot and obtain the certificate
After execution, Certbot will verify domain name ownership and write the certificate paths to the Nginx configuration file (the `ssl_certificate` and `ssl_certificate_key` directives shown above).
Automatic renewal
Let's Encrypt certificates are valid for 90 days and require periodic renewal. Certbot usually adds a scheduled renewal task automatically; you can check for it manually and add one if it is missing:
After the renewal is successful, Certbot will automatically reload Nginx to make the new certificate effective.
Load balancing and high availability
Nginx load balancing strategy
If your application is deployed on multiple servers, simple load balancing can be achieved just by modifying the `upstream` block.
- `weight`: The greater the weight, the more requests the server receives.
- `ip_hash`: Requests from the same client IP are always forwarded to the same backend, which suits applications that require session persistence.
- `least_conn` (least connections) or `random` (random) can be used as alternatives to `ip_hash`, chosen based on business needs.
Application health check
In order for Nginx to detect unavailable backend servers in time, a health check endpoint needs to be implemented in FastAPI.
Nginx can perform active health checks with the `health_check` directive (this requires the commercial version or the `nginx-healthcheck` module). For community edition Nginx, passive detection with `max_fails` and `fail_timeout` is usually enough.
Security hardening measures
Nginx security configuration
Add more security-related directives to the `server` block shown earlier:
Application layer security
FastAPI provides `TrustedHostMiddleware` to defend against HTTP Host header attacks.
This middleware checks whether the request's `Host` header is in the whitelist. If it is not, a 400 error is returned, which effectively prevents security issues caused by malicious requests with forged Host headers.
Performance Optimization Strategy
1. Gunicorn built-in optimization
- `preload_app = True`: Preloads the application to reduce memory usage and speed up worker startup.
- `worker_connections` and `timeout`: Adjust to the business scenario so that long-hanging connections do not tie up resources.
- `max_requests`: Limits the impact of memory leaks in long-running workers.
2. Nginx static file cache
For `location /static/` we have already set `expires 30d`. For more complex caching requirements, you can enable Nginx proxy caching:
Note: Caching is only suitable for interfaces that do not change frequently. Dynamic data must be used with caution, otherwise data inconsistency may result.
3. Application layer caching
In FastAPI, use `aiocache` to add in-memory caching for time-consuming operations.
In this way, the same request will only execute a slow query once within 5 minutes, and subsequent requests will directly return the cached results, effectively reducing database pressure and improving response speed.
Monitoring and Log Management
Structured log
In a production environment, it is recommended to output logs in JSON format to facilitate centralized collection and analysis.
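A minimal sketch using only the standard library is shown here. The field names (`timestamp`, `level`, `logger`, `message`) are an assumed schema, and a dedicated JSON logging package could be used in place of this hand-written formatter.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback text when an exception was logged.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, ensure_ascii=False)


def configure_json_logging(level: int = logging.INFO) -> None:
    """Replace the root logger's handlers with one JSON handler on stdout."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)


configure_json_logging()
logging.info("service started")
```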
Place the above code at the top of the FastAPI application's entry file so that all `logging.info()` calls output standard JSON logs, which are easy to analyze with ELK, Loki, and similar tools.
Basic system monitoring script
Use `psutil` to write a simple resource monitoring script that sends an alert when CPU or memory usage exceeds a threshold.
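A sketch of such a script, assuming `psutil` is installed, might look like the following. The thresholds are assumed values, and the alert is only written to the log; an email, webhook, or other alert channel would need to be wired in separately.

```python
import logging
from typing import List

import psutil

CPU_LIMIT = 80.0     # percent; assumed threshold
MEMORY_LIMIT = 85.0  # percent; assumed threshold


def find_breaches(
    cpu: float,
    memory: float,
    cpu_limit: float = CPU_LIMIT,
    memory_limit: float = MEMORY_LIMIT,
) -> List[str]:
    """Return one message per metric that is above its threshold."""
    alerts = []
    if cpu > cpu_limit:
        alerts.append(f"CPU usage {cpu:.1f}% exceeds {cpu_limit:.1f}%")
    if memory > memory_limit:
        alerts.append(f"Memory usage {memory:.1f}% exceeds {memory_limit:.1f}%")
    return alerts


def sample_once() -> List[str]:
    """Take one CPU and memory sample and check it against the limits."""
    cpu = psutil.cpu_percent(interval=1)  # averaged over one second
    memory = psutil.virtual_memory().percent
    return find_breaches(cpu, memory)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    for message in sample_once():
        # Replace with email, webhook, or another alert channel.
        logging.warning(message)
```

The script takes a single sample per run, so it is meant to be scheduled, for example from cron or a systemd timer, rather than left running in a loop.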
For enterprise-level projects, it is recommended to integrate professional monitoring systems such as Prometheus + Grafana, which can more comprehensively display key indicators such as QPS, error rate, and latency.
Troubleshooting and Debugging
Common command line troubleshooting tools
One-click troubleshooting script
Combine the common inspection commands into one script, `troubleshoot.sh`, so that problems can be located quickly:
Run `bash troubleshoot.sh` to get a brief troubleshooting report.
Automated deployment script
Encapsulating code pulling, dependency installation, service restart and other operations into a deployment script can significantly reduce manual operation errors.
Usage Suggestions:
- Make sure the deploying user has the appropriate permissions (can run the necessary commands with `sudo`).
- In the production environment, it is recommended to trigger this script from a CI/CD platform (such as GitHub Actions or Jenkins) to achieve fully automated deployment.
Summary
This article comprehensively explains the process of building a FastAPI production environment from architectural design to actual deployment. The key points are summarized below:
- Architecture Selection: Nginx is responsible for exposing services, SSL termination and static files. Gunicorn + Uvicorn provides high-performance multi-process asynchronous processing capabilities.
- Gunicorn Worker Process: Configure workers with the CPU cores × 2 + 1 formula and enable `preload_app` and `max_requests` to keep the service stable.
- HTTPS Certificate: Easily automate renewals with Let's Encrypt free certificates and the Certbot tool.
- Security hardening: Add security response headers, limit request body size, disable access to sensitive files, and use TrustedHostMiddleware to defend against Host header attacks.
- Log and Monitoring: Output structured JSON logs, write simple system resource monitoring scripts, and detect hidden dangers in advance.
- Automated deployment: Integrate code updates, dependency installation, service restarts and health checks through bash scripts to improve deployment efficiency and reliability.
This solution can already cover the production needs of most small and medium-sized web applications. You can flexibly adjust various parameters according to business scale and further integrate more advanced orchestration tools such as Docker and Kubernetes.

