Edge Computing: A Detailed Guide to Edge AI Deployment on Raspberry Pi and Mobile Phones
Introduction
Edge computing is changing the way we use AI. It moves computation from distant cloud data centers to the devices closest to us, such as mobile phones, cameras, and Raspberry Pis. As a result, AI no longer has to depend on a network connection and can make quick decisions locally.
This shift matters especially for deep learning. Real-time tasks such as face recognition and anomaly detection cannot tolerate hundreds of milliseconds of cloud round-trip latency, and uploading every high-definition video stream to the cloud is impractical. More importantly, many scenarios involve private data (faces, medical images), and regulations may require that such data stay on the device. Edge AI addresses exactly these problems: it offers low latency, protects privacy, saves bandwidth, and can run offline.
This article walks through the core steps of edge AI deployment, from hardware preparation and framework selection to model optimization and deployment architecture, so that you can quickly start deploying AI to edge devices.
1. What are the advantages of edge AI?
1.1 Cloud vs. edge at a glance
1.2 Edge AI does not replace the cloud: it is cloud-edge collaboration
A typical edge AI architecture can be divided into four layers, each with its own role:
- Cloud training layer: Processes massive datasets, handles model pre-training and fine-tuning, and updates the global knowledge base.
- Edge gateway layer (optional): Aggregates data from multiple terminals and provides local caching and coordination.
- Edge device layer: The main focus here. Raspberry Pis, mobile phones, smart cameras, and similar devices handle real-time inference.
- Data collection layer: Microphones, cameras, and sensors are only responsible for collecting raw data.
Below we will focus on the actual deployment of the edge device layer.
2. Raspberry Pi deployment practice: the most user-friendly edge platform
The Raspberry Pi is affordable, has a mature ecosystem, and can run a full Python + PyTorch stack, which makes it a good first platform for getting started with edge AI.
2.1 Set up the environment
Raspberry Pi OS 64-bit (Bookworm) is recommended for the best compatibility. Setup is done from a terminal, where the Python packages used in this article, including PyTorch and opencv-python-headless, are installed.
💡 opencv-python-headless drops the GUI-related dependencies, which makes it a leaner fit for a headless (screen-less) Raspberry Pi.
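Once the packages are installed, a quick check can confirm that they are importable. The sketch below uses only the standard library. The list of import names (torch, torchvision, cv2, numpy, PIL) is an assumption based on the tools this article mentions, so adjust it to your own setup.

```python
import importlib.util
import platform

# Import names assumed to be needed in this article (an assumption).
# Note: opencv-python-headless is imported as "cv2", Pillow as "PIL".
REQUIRED = ["torch", "torchvision", "cv2", "numpy", "PIL"]


def missing_packages(names):
    """Return the names from `names` that the import system cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # On 64-bit Raspberry Pi OS the architecture is expected to be "aarch64".
    print("Architecture:", platform.machine())
    missing = missing_packages(REQUIRED)
    print("Missing packages:", missing if missing else "none")
```

Checking with `find_spec` instead of importing keeps the check fast, because a full `import torch` can take several seconds on a Raspberry Pi.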
2.2 Run inference with a lightweight model
MobileNetV2 is a vision model designed specifically for mobile and edge devices. It serves here as the example for a complete inference pass.
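A minimal sketch of such an inference pass, assuming torch, torchvision (0.13 or newer, for the `weights=` argument), and Pillow are installed. The helper names `load_labels`, `top_k`, and `classify` are hypothetical, and the layout of imagenet_labels.txt (one class name per line, in ImageNet index order) is an assumption.

```python
import time


def load_labels(path="imagenet_labels.txt"):
    """Read one class name per line (the file layout is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def top_k(probs, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def classify(image_path, labels_path="imagenet_labels.txt"):
    """Classify one image; returns (top-5 predictions, forward-pass time in ms)."""
    # Heavy imports are kept local so the helpers above work without PyTorch.
    import torch
    from PIL import Image
    from torchvision import models, transforms

    # Standard ImageNet preprocessing for torchvision classification models.
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
    model.eval()  # inference mode: fixed batch-norm statistics, no dropout

    image = Image.open(image_path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)  # shape: (1, 3, 224, 224)

    start = time.perf_counter()
    with torch.no_grad():  # gradients are not needed for inference
        logits = model(batch)
    elapsed_ms = (time.perf_counter() - start) * 1000

    probs = torch.softmax(logits[0], dim=0).tolist()
    return top_k(probs, load_labels(labels_path)), elapsed_ms
```

Calling `classify("test.jpg")`, where the path is a placeholder, returns the five most likely labels and the time of the forward pass. The first call downloads the pretrained weights and includes one-time warm-up cost, so timing is more representative from the second call onward.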
⚠️ Remember to download the imagenet_labels.txt file (containing the English names of the 1000 categories) before running the script; otherwise it will report an error.
On a Raspberry Pi 4B, this code can usually complete one inference in roughly 50 to 100 milliseconds, which is fast enough for many real-time scenarios.
3. TensorFlow Lite: the standard for inference on mobile phones and embedded devices
PyTorch is flexible and powerful, but if you want to run the model on an Android phone, an embedded board, or even a microcontroller, TensorFlow Lite is the more mature choice. It natively supports INT8 and FP16 quantization and can offload work to hardware accelerators such as GPUs and NPUs, which speeds up inference considerably.
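To make the INT8 idea concrete, here is a small sketch of generic affine quantization in pure Python. It illustrates the arithmetic only and is not TFLite's actual implementation; the function names are hypothetical.

```python
from array import array


def quantize_int8(values):
    """Affine-quantize floats to int8; returns (int8 array, scale, zero_point)."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)    # the range must contain 0.0
    scale = (hi - lo) / 255 or 1.0          # avoid a zero scale for all-zero input
    zero_point = round(-128 - lo / scale)   # int8 value that represents 0.0
    q = array("b", (max(-128, min(127, round(v / scale) + zero_point))
                    for v in values))
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(x - zero_point) * scale for x in q]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
    q, scale, zero_point = quantize_int8(weights)
    print("int8 values:", list(q))
    print("restored:   ", dequantize(q, scale, zero_point))
    # float32 storage versus int8 storage for the same weights
    print("bytes:", len(array("f", weights).tobytes()), "vs", len(q.tobytes()))
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most about one quantization step (`scale`) per value.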
3.1 Conversion process from PyTorch to TFLite
A PyTorch model is usually exported to ONNX first and then converted to TFLite. The whole chain can be wrapped in a single conversion function.
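A minimal sketch of such a function, assuming the torch, onnx, onnx-tf, and tensorflow packages are installed and mutually compatible. The function name, the default paths, the opset version, and the choice of onnx-tf as the intermediate converter are all assumptions; other converters exist.

```python
import os


def pytorch_to_tflite(model, input_shape=(1, 3, 224, 224),
                      tflite_path="converted.tflite", workdir="convert_tmp"):
    """Sketch: PyTorch -> ONNX -> TensorFlow SavedModel -> TFLite.

    Returns the size of the written .tflite file in bytes.
    """
    # Imports are local because each step needs a separate heavy package.
    import onnx
    import tensorflow as tf
    import torch
    from onnx_tf.backend import prepare  # provided by the onnx-tf package

    os.makedirs(workdir, exist_ok=True)
    onnx_path = os.path.join(workdir, "model.onnx")
    saved_model_dir = os.path.join(workdir, "saved_model")

    # Step 1: export the PyTorch model to ONNX by tracing a dummy input.
    model.eval()
    dummy = torch.randn(*input_shape)
    torch.onnx.export(model, dummy, onnx_path, opset_version=13,
                      input_names=["input"], output_names=["output"])

    # Step 2: convert the ONNX graph to a TensorFlow SavedModel.
    prepare(onnx.load(onnx_path)).export_graph(saved_model_dir)

    # Step 3: convert the SavedModel to TFLite with the default optimizations,
    # which apply dynamic-range quantization to the weights.
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_bytes = converter.convert()

    with open(tflite_path, "wb") as f:
        f.write(tflite_bytes)
    return len(tflite_bytes)
```

Full INT8 quantization of activations would additionally require a representative dataset for calibration, which this sketch leaves out.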
With INT8 quantization, the converted.tflite file is usually about 1/4 of the original size, and inference is typically 2 to 4 times faster, which makes it well suited to mobile phones.
4. Three key points for edge AI performance optimization
Edge devices have limited computing power, memory, and power budgets, so models must be actively optimized to perform well.
4.1 Optimization at the model level (the most obvious effect)
4.2 Hardware and runtime tips
- Raspberry Pi only:
  - Enable the NEON instruction set when compiling OpenCV / PyTorch to leverage the parallel capabilities of the ARM CPU.
  - Set the CPU governor to performance mode to avoid dynamic underclocking.
- Mobile only:
  - Call TFLite's NNAPI on Android and Core ML on iOS so that the NPU takes part in acceleration.
  - Measurements show that some models run more than 5 times faster on an NPU.
- General tips:
  - Be sure to turn off gradient calculation during inference (with torch.no_grad()).
  - Load large model files using memory mapping to reduce startup memory usage.
  - Image preprocessing and model inference can be split into separate threads to form a pipeline.
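The last tip, splitting preprocessing and inference into a pipeline, can be sketched with the standard library alone. The `preprocess` and `infer` callables are placeholders for real functions, and `run_pipeline` is a hypothetical helper name.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the consumer to finish


def run_pipeline(frames, preprocess, infer, max_pending=4):
    """Preprocess in a worker thread while the calling thread runs inference."""
    pending = queue.Queue(maxsize=max_pending)  # bounded queue caps memory use

    def producer():
        try:
            for frame in frames:
                pending.put(preprocess(frame))  # blocks when the queue is full
        finally:
            pending.put(_STOP)  # always signal the end, even after an error

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    results = []
    while True:
        item = pending.get()
        if item is _STOP:
            break
        results.append(infer(item))
    worker.join()
    return results


if __name__ == "__main__":
    # Stand-ins for real work: doubling a "frame", then adding one.
    print(run_pipeline(range(5), lambda x: x * 2, lambda x: x + 1))
```

Threads help here mainly because native libraries such as OpenCV and PyTorch generally release Python's global interpreter lock during heavy computation, so the two stages can overlap. The bounded queue keeps the preprocessing thread from running far ahead and filling memory with frames.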
5. What else should be considered during actual deployment?
5.1 How to choose the deployment architecture?
Depending on the scenario, there are three typical approaches to choose from:
- Pure Edge Deployment: All inference is done on the device, suitable for privacy-sensitive scenarios such as home security cameras.
- Edge-Cloud Collaboration: The edge performs preliminary screening and sends suspicious results to the cloud for detailed analysis. For example, in industrial quality inspection, suspected defects are flagged locally and the final judgment is made in the cloud.
- Edge Caching: If the repetition rate of recognition tasks is high (such as smart shelves in shopping malls), the recognition results of popular products can be cached to greatly reduce the amount of calculation.
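The second and third approaches can be combined in a few lines. The sketch below is illustrative only: `local_model` and `cloud_model` are placeholder callables, the confidence threshold of 0.8 is an arbitrary assumption, and the cache key (for example a barcode or an image hash) is hypothetical.

```python
from collections import OrderedDict


class ResultCache:
    """Small LRU cache for recognition results, keyed by an item identifier."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


def handle(key, frame, local_model, cloud_model, cache, threshold=0.8):
    """Return (label, source), where source is 'cache', 'edge', or 'cloud'."""
    cached = cache.get(key)
    if cached is not None:
        return cached, "cache"
    label, confidence = local_model(frame)
    source = "edge"
    if confidence < threshold:
        # Uncertain locally: escalate to the cloud for the final judgment.
        label = cloud_model(frame)
        source = "cloud"
    cache.put(key, label)
    return label, source
```

In a pure edge deployment, the same function would simply never call `cloud_model`, so the choice of architecture comes down to when, if ever, the escalation branch is allowed to run.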
5.2 What indicators should be monitored after going online?
Deployment is just the beginning; keep monitoring the following:
- Inference latency: The time taken by a single request, usually reported as P50 and P99.
- Throughput: How many frames or requests can be processed per second.
- Resource usage: CPU, memory, and storage usage.
- Device temperature: Plastic-cased devices such as the Raspberry Pi are prone to heat buildup. Overheating triggers CPU throttling, and inference latency can suddenly spike.
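A minimal sketch of how these indicators might be collected on the device, using only the standard library. The function names are hypothetical, `inputs` is expected to be a list, and the thermal file path is the usual Linux sysfs location, which is an assumption about your system.

```python
import statistics
import time


def measure_latency_ms(infer, inputs, warmup=1):
    """Time each call to `infer` and return the latencies in milliseconds."""
    for x in inputs[:warmup]:
        infer(x)  # warm-up runs are not recorded
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        infer(x)
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


def summarize(latencies_ms):
    """Return P50, P99, and sequential throughput for at least two samples."""
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    total_s = sum(latencies_ms) / 1000
    return {
        "p50_ms": cuts[49],  # 50th percentile
        "p99_ms": cuts[98],  # 99th percentile
        "throughput_per_s": len(latencies_ms) / total_s if total_s else float("inf"),
    }


def read_cpu_temp_c(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read the SoC temperature in Celsius, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip()) / 1000  # the file holds millidegrees
    except (OSError, ValueError):
        return None
```

Logging the temperature alongside P99 latency makes it easier to tell whether a latency spike comes from thermal throttling or from the workload itself.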
Summary
Edge AI is not a “shrunk version” of cloud AI, but a practical way to bring AI into the real world. It extends intelligence from data centers to the small devices around us, making AI real-time, private, and available offline.
For AI engineers, mastering three skills (model compression, edge deployment, and performance tuning) builds core competitiveness in fast-growing fields such as the Internet of Things, mobile AI, and autonomous driving. Now, let’s start by powering up a Raspberry Pi!

