Django caching strategy - High-performance application optimization guide | Daoman PythonAI

# Django caching strategy - A guide to optimizing high-performance applications

📂 Stage: Part 2 - Advanced Features 🎯 Difficulty level: Intermediate ⏰ Estimated study time: 4-5 hours


Basic concepts of caching

Caching is a key technique for improving the performance of web applications. It avoids repeated computation and database queries by storing computed results or copies of data.

Caching principle

Caching works on a simple principle: when a request arrives, the application first checks whether the required data is in the cache. If it is (a cache hit), the data is returned directly; if not (a cache miss), the application performs the calculation or query and stores the result in the cache for the next request.
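The check-then-compute flow described above can be sketched in plain Python. This is only an illustration of the principle, not Django's cache API: `TinyTTLCache` and `get_or_compute` are hypothetical names, and the store is an ordinary in-process dictionary.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TinyTTLCache:
    """Illustrative in-memory cache with per-entry expiry (not Django's API)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # Maps key -> (expiry time, stored value).
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any], timeout: float) -> Any:
        entry = self._store.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: return the stored copy
        value = compute()  # miss or expired: do the real work
        self._store[key] = (now + timeout, value)
        return value
```

The second call with the same key returns the stored value without invoking `compute` again until the timeout has passed.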

The main advantages of caching include:

  • Reduce database load
  • Reduce response time
  • Improve system throughput
  • Improve user experience

Caching also brings challenges, such as data consistency issues and cache avalanche, penetration, and breakdown.

Cache levels

Caching can be applied at multiple levels of a Django application, from low to high:

  1. Database query cache
  2. Template fragment caching
  3. View level caching
  4. Page level caching
  5. Application layer caching
  6. Proxy layer caching

## Django cache architecture

Cache backend

Django supports a variety of caching backends, including:

  • DummyCache - a dummy cache that implements the interface but stores nothing (for development environments)
  • LocMemCache - local memory cache
  • FileBasedCache - file system cache
  • DatabaseCache - database cache
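  • PyMemcacheCache / PyLibMCCache - Memcached cache (the older MemcachedCache backend was removed in Django 4.1)
  • RedisCache - Redis cache (built into Django 4.0+ and requires the redis package; a third-party package such as django-redis is an alternative)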

Selection principle:

  • Development environment: LocMemCache
  • Small applications: LocMemCache or FileBasedCache
  • Production environment: Redis or Memcached
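The selection principle above could be expressed as a small settings helper. This is a sketch under assumptions: `build_caches` and the environment names are hypothetical, and the production branch assumes the django-redis package is installed.

```python
def build_caches(environment: str, redis_url: str = "redis://127.0.0.1:6379/1") -> dict:
    """Return a CACHES dict for the given environment name (hypothetical helper)."""
    if environment == "production":
        # Shared cache that all application processes can see.
        backend = {
            "BACKEND": "django_redis.cache.RedisCache",
            "LOCATION": redis_url,
        }
    else:
        # Per-process memory cache: no external service needed.
        backend = {
            "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
            "LOCATION": "dev-cache",
        }
    return {"default": backend}
```

In settings.py this might be used as `CACHES = build_caches(os.environ.get("DJANGO_ENV", "development"))`, where `DJANGO_ENV` is an assumed variable name rather than something Django defines.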

Caching API

Django provides a simple and easy-to-use caching API:

from django.core.cache import cache

# Set a cache entry
cache.set('key', 'value', timeout=300)  # expires after 5 minutes

# Get a cache entry
value = cache.get('key')
if value is None:
    # Cache miss: load from the database
    # (expensive_database_query is a placeholder for your own function)
    value = expensive_database_query()
    cache.set('key', value, timeout=300)

# Bulk operations
cache.set_many({'a': 1, 'b': 2, 'c': 3}, timeout=300)
values = cache.get_many(['a', 'b', 'c'])
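The get-then-set pattern above can usually be collapsed into one call with `get_or_set`, and `add` stores a value only when the key is absent. The sketch below takes the cache object as a parameter so it does not depend on a configured Django project; in practice you would pass `django.core.cache.cache`. The function names `load_with_get_or_set` and `claim_once` are illustrative.

```python
def load_with_get_or_set(cache, key, compute, timeout=300):
    """Return the cached value for key, computing and storing it on a miss.

    `cache` is expected to be Django's `django.core.cache.cache` (or any
    object with the same `get_or_set` method). `compute` is a zero-argument
    callable that is only invoked when the key is absent.
    """
    return cache.get_or_set(key, compute, timeout=timeout)


def claim_once(cache, key, timeout=60):
    """Return True only for the first caller; `add` never overwrites a key."""
    return cache.add(key, "claimed", timeout=timeout)
```

Passing a callable to `get_or_set` matters: the expensive work runs only on a miss, instead of on every call.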

Cache backend configuration

settings.py configuration

# settings.py - cache configuration

# Basic cache configuration
CACHES = {
    # Default cache (django-redis backend; CLIENT_CLASS is a django-redis option)
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
        },
        'KEY_PREFIX': 'myapp',
        'TIMEOUT': 300,  # default timeout: 5 minutes
    },

    # Temporary cache
    'temporary': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'temp-cache',
        'TIMEOUT': 60,  # 1 minute
    },
}

Redis cache configuration

Redis is one of the most commonly used cache backends in production environments. You need to install dependencies first:

pip install django-redis redis
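As an alternative to django-redis, Django 4.0 and later ship a built-in Redis backend that needs only the redis package. A minimal sketch, assuming the same local Redis instance and key prefix as the configuration above:

```python
# settings.py fragment: Django's built-in Redis backend (Django 4.0+).
# It requires only the `redis` package. django-redis options such as
# CLIENT_CLASS do not apply to this backend and should be left out.
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
        "KEY_PREFIX": "myapp",
        "TIMEOUT": 300,  # default timeout in seconds
    }
}
```

Which of the two backends fits better depends on the project; django-redis offers extra client options, while the built-in backend avoids a third-party dependency.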

Cache types and usage

View level cache

Django provides simple decorators to cache views:

from django.views.decorators.cache import cache_page
from django.http import JsonResponse
from .models import Product

# Caching a function-based view
@cache_page(60 * 15)  # cache for 15 minutes
def product_list_view(request):
    products = Product.objects.filter(is_active=True).select_related('category')
    
    return JsonResponse({
        'products': [
            {
                'id': p.id,
                'name': p.name,
                'price': float(p.price),
            }
            for p in products
        ]
    })

Template fragment cache

In a template, we can use the cache template tag to cache specific fragments:

<!-- Used inside a template -->
{% load cache %}
{% cache 500 sidebar user.username %}
    <!-- Sidebar content, cached for 500 seconds -->
    <div class="sidebar">
        <h3>Welcome, {{ user.username }}!</h3>
        <p>This is the sidebar content...</p>
    </div>
{% endcache %}

Caching strategy patterns

Cache-Aside pattern

Cache-Aside is one of the most commonly used caching strategies:

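from django.core.cache import cache

# Sentinel that distinguishes "key not in cache" from a cached None.
_MISS = object()


class CacheAsidePattern:
    """Cache-Aside pattern implementation."""

    @staticmethod
    def get_data_with_cache(model_class, pk, timeout=300):
        """Fetch a model instance using the Cache-Aside pattern."""
        cache_key = f"model:{model_class._meta.label}:{pk}"

        # 1. Try the cache first.
        data = cache.get(cache_key, _MISS)
        if data is not _MISS:
            return data  # may be a cached None for a row that does not exist

        # 2. Cache miss: load from the database.
        try:
            instance = model_class.objects.get(pk=pk)
        except model_class.DoesNotExist:
            # The row does not exist: cache None briefly to avoid repeated queries.
            cache.set(cache_key, None, 60)
            return None

        # 3. Store the instance in the cache.
        cache.set(cache_key, instance, timeout)
        return instance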
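The code above covers only the read path. On the write path, Cache-Aside typically updates the database first and then deletes the cached entry so the next read reloads fresh data. The following is a sketch under assumptions: `model_cache_key`, `update_and_invalidate`, and the `write_to_db` callable are hypothetical names, and `cache` is expected to be Django's `django.core.cache.cache` or an object with the same `delete` method.

```python
def model_cache_key(label: str, pk) -> str:
    """Build the same key format as the read path above."""
    return f"model:{label}:{pk}"


def update_and_invalidate(cache, label: str, pk, write_to_db) -> None:
    """Apply a write, then drop the cached copy so the next read reloads it.

    `write_to_db` is a hypothetical zero-argument callable that performs the
    database update (for example, a bound `instance.save`).
    """
    write_to_db()  # 1. update the source of truth
    cache.delete(model_cache_key(label, pk))  # 2. invalidate the stale entry
```

Deleting rather than overwriting the entry keeps the write path simple: the read path remains the only place that populates the cache.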

Cache performance optimization

Cache compression

For large data sets, cache compression can be used to save memory:

import pickle
import zlib
from django.core.cache import cache

class CompressedCache:
    """Cache helper that compresses values before storing them."""
    
    @staticmethod
    def set_compressed(key, value, timeout=300):
        """Store a value in compressed form."""
        # Serialize the data
        serialized_data = pickle.dumps(value)
        # Compress the serialized bytes
        compressed_data = zlib.compress(serialized_data)
        # Store the compressed bytes
        cache.set(key, compressed_data, timeout)
    
    @staticmethod
    def get_compressed(key, default=None):
        """Fetch and decompress a value; return default on a cache miss."""
        compressed_data = cache.get(key)
        if compressed_data is None:
            return default
        # Decompress the bytes
        serialized_data = zlib.decompress(compressed_data)
        # Deserialize (only unpickle data that your own application wrote)
        return pickle.loads(serialized_data)
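Compression costs CPU time and can even enlarge very small payloads, so one common refinement is to compress only above a size threshold. The sketch below is one possible approach, not part of Django: `pack`, `unpack`, the one-byte markers, and the 1024-byte threshold are all illustrative choices.

```python
import pickle
import zlib

RAW, ZLIB = b"\x00", b"\x01"  # one-byte marker prepended to each payload


def pack(value, threshold=1024):
    """Pickle value and compress it only when it reaches threshold bytes."""
    data = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
    if len(data) >= threshold:
        return ZLIB + zlib.compress(data)
    return RAW + data


def unpack(blob):
    """Reverse pack(): inspect the marker, decompress if needed, unpickle."""
    marker, body = blob[:1], blob[1:]
    if marker == ZLIB:
        body = zlib.decompress(body)
    return pickle.loads(body)
```

The resulting bytes can be stored with `cache.set` and restored with `unpack(cache.get(key))` after checking for a miss.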

Common problems and solutions

Cache Avalanche

Symptoms: A large number of caches expire at the same time, causing a sudden increase in database pressure.

Solution:

  1. Use random expiration time
  2. Hierarchical caching strategy
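  3. Background cache update

import random

from django.core.cache import cache

def set_with_jitter(key, value, base_timeout=300, jitter_range=60):
    """Store a value with a randomized (jittered) timeout."""
    # Spread expirations over [base - jitter, base + jitter]; never go below 1 second.
    actual_timeout = max(1, base_timeout + random.randint(-jitter_range, jitter_range))
    cache.set(key, value, actual_timeout)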
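The hierarchical strategy listed above can be sketched as a short-lived per-process cache in front of a longer-lived shared cache, so that an expiry in one layer does not send every request straight to the database. The function name `get_two_level` and its timeouts are assumptions; both cache arguments only need `get` and `set` methods like Django's cache objects.

```python
def get_two_level(local_cache, shared_cache, key, loader,
                  local_timeout=30, shared_timeout=300):
    """Look in a per-process cache, then a shared cache, then the loader.

    A stored None is treated as a miss in this simplified sketch.
    """
    value = local_cache.get(key)
    if value is not None:
        return value  # fastest path: no network round trip

    value = shared_cache.get(key)
    if value is None:
        value = loader()  # both layers missed: load from the source
        shared_cache.set(key, value, shared_timeout)

    # Refill the local layer with a shorter timeout to limit staleness.
    local_cache.set(key, value, local_timeout)
    return value
```

With the CACHES setting shown earlier, `caches['temporary']` and `caches['default']` from `django.core.cache` could play the local and shared roles.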

Cache penetration

Symptoms: Requests ask for data that exists neither in the cache nor in the database, so every such request hits the database.

Solution:

  1. Cache empty values
  2. Use bloom filters

from django.core.cache import cache

def get_with_null_cache(model_class, pk, timeout=300):
    """Fetch an instance, caching a placeholder for rows that do not exist."""
    cache_key = f"model:{model_class._meta.label}:{pk}"
    
    result = cache.get(cache_key)
    if result is not None:
        if result == "__null__":
            return None
        return result
    
    try:
        instance = model_class.objects.get(pk=pk)
        cache.set(cache_key, instance, timeout)
        return instance
    except model_class.DoesNotExist:
        # Cache a placeholder briefly to avoid querying the database repeatedly
        cache.set(cache_key, "__null__", timeout=60)
        return None
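The Bloom filter option from the list above can be illustrated with a small pure-Python version. It answers "definitely not present" or "possibly present", so requests for keys that were never added can be rejected before touching the cache or database. The class below is a teaching sketch with assumed sizes; a production system would more likely use a dedicated library or a Redis module.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means the item was never added; True means "possibly added".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

A typical use is to add every valid primary key when rows are created and to return early when `might_contain(str(pk))` is false.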

Cache breakdown

Symptoms: When hotspot data expires, a large number of concurrent requests query the database at the same time.

Solution:

  1. Use mutex locks
  2. Never expire hot keys, and refresh them in the background
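The mutex approach can be sketched with the cache's own `add` operation, which stores a key only if it is absent. This is a simplified illustration: `get_with_mutex`, the `lock:` key prefix, and the retry settings are assumptions, and `add` only acts as a cross-process lock on shared backends such as Redis or Memcached.

```python
import time


def get_with_mutex(cache, key, loader, timeout=300, lock_timeout=10,
                   retries=20, wait=0.05):
    """Rebuild an expired hot key in one caller while the others wait."""
    value = cache.get(key)
    if value is not None:
        return value

    lock_key = f"lock:{key}"
    # add() succeeds only when the lock key is absent, so one caller wins.
    if cache.add(lock_key, "1", lock_timeout):
        try:
            value = loader()
            cache.set(key, value, timeout)
            return value
        finally:
            cache.delete(lock_key)  # release even if the loader raised

    # Lost the race: poll until the winner has filled the cache.
    for _ in range(retries):
        time.sleep(wait)
        value = cache.get(key)
        if value is not None:
            return value
    return loader()  # fall back rather than fail if the winner was too slow
```

The lock timeout guards against a crashed winner holding the lock forever; it should be longer than the loader normally takes.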

Summary of this chapter

In this chapter, we took an in-depth look at Django caching strategies:

  1. Basic concepts of caching: how caching works and why it matters
  2. Cache architecture: the overall structure of Django's cache system
  3. Cache backend configuration: configuring backends such as Redis
  4. Cache types and usage: applying caching at different levels
  5. Caching strategy patterns: Cache-Aside and other patterns
  6. Performance optimization: techniques such as cache compression
  7. Problems and solutions: handling cache avalanche, penetration, and breakdown

💡 Core Points: Caching strategies need to be designed according to specific application scenarios and performance requirements. A reasonable caching strategy can significantly improve application performance, but attention must also be paid to avoiding the complexity and potential problems caused by caching.
