title: OpenCV Practical Guide: A complete tutorial from image processing to deep learning - Computer Vision Core Technology | Daoman PythonAI
description: A complete OpenCV practical tutorial, covering image processing, computer vision, deep learning DNN module, face detection, target tracking and other core technologies, including a wealth of Python code examples and practical projects.
keywords: [OpenCV, computer vision, image processing, Python, face recognition, target detection, deep learning, DNN, image filtering, edge detection]
OpenCV Practical Guide: From pixels to core applications of deep learning
Introduction
When people think of computer vision, OpenCV (Open Source Computer Vision Library) is often the first tool that comes to mind. This open source library has been established in industry and academia for more than 20 years and supports a complete workflow, from basic pixel processing to deep learning model inference. It also offers bindings for several languages, including Python, C++ and Java, so you can build capable vision features with very little code.
This tutorial is not a quick survey; it concentrates on the practical skills you will use most often. Starting from environment setup, it uses reproducible examples to walk through image operations, classic filters, face detection and YOLO object detection one by one, so that you can quickly become productive on real projects.
1. OpenCV basic overview
1.1 What is OpenCV?
OpenCV is a lightweight cross-platform vision library designed for real-time applications. It encapsulates a large number of mature image processing and computer vision algorithms, so you don’t have to reinvent the wheel and only need to focus on application logic. Whether you're running Python prototypes on your laptop or deploying C++ modules on embedded devices, OpenCV provides a consistent interface.
1.2 Core features and applications
The functions of OpenCV can be roughly divided into three levels:
- Classic CV algorithms: Canny edge detection, morphological operations, contour analysis and similar methods, best suited to scenes with stable lighting.
- Deep neural network module (`dnn`): Loads models in ONNX, TensorFlow, Caffe and other formats and runs inference on CPU or GPU without installing PyTorch or TensorFlow.
- Hardware acceleration: Speeds up compute-heavy workloads through the OpenCL or CUDA backends, where the build and the hardware support them.
It is precisely because of this coverage from traditional to deep learning that OpenCV is widely used in many fields such as face recognition, autonomous driving perception, industrial defect detection, medical image analysis, and augmented reality.
2. Quick environment configuration
2.1 Installation selection
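Install the pip package that matches your use case: `opencv-python` provides the main modules, and `opencv-contrib-python` adds the extra contrib modules. Install only one OpenCV package per environment.
The `-headless` variants of these packages remove GUI functionality (e.g. `cv2.imshow()`); they are smaller and suited to production environments such as servers and containers.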
2.2 Verify installation
After the installation is complete, open a Python terminal and enter the following code to confirm the version. If a version number appears, the environment is ready.
3. Basic image operations: pixels, ROI and channels
In OpenCV, an image is essentially a NumPy multidimensional array with shape `[height, width, channels]`. This means that cropping and channel separation can be done directly with array slicing, which is very fast.
The following code demonstrates the complete process of "load-process-display-save" and shows how to operate the region of interest (ROI) and modify a color channel individually.
Key Point: OpenCV reads image channels in BGR order, for historical reasons. Convert explicitly when working with libraries that expect RGB (such as matplotlib).
4. Core color space conversion
Color space conversion is the first step in many visual tasks because certain information is easier to separate or analyze in a specific space. The two most commonly used transformations are:
- Grayscale: Compresses three channels into a single channel, greatly reducing the amount of data. It is the usual input for face detection and feature extraction.
- HSV space: Decouples "color" from "brightness". H (hue) and S (saturation) are comparatively insensitive to lighting changes, which makes this space well suited to color filtering.
5. Classic image processing (must learn)
However capable deep learning becomes, many practical scenarios still rely on traditional preprocessing to improve robustness. The following three techniques cover most everyday binarization, edge extraction and noise cleaning needs.
5.1 Adaptive binarization (handling uneven lighting)
Ordinary global thresholding breaks down in shadowed or highlighted areas, because a single cutoff cannot fit both. Adaptive thresholding instead computes a threshold for each pixel from its neighborhood, which suits scenarios such as document scanning and license plate recognition.
5.2 Canny edge detection (denoising + accurate positioning)
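Canny is a classic edge detection algorithm: it suppresses noise with a Gaussian blur, computes gradients, thins them with non-maximum suppression, and then uses a double-threshold (hysteresis) strategy to keep strong edges and the weak edges connected to them. In OpenCV, `cv2.Canny` does not blur its input, so apply `cv2.GaussianBlur` yourself first. The output is a single-channel black-and-white edge map, often used as a preliminary step for contour finding and object segmentation.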
5.3 Morphological opening and closing operations (cleaning binary images)
For binary images, the "opening" operation removes isolated small white specks, and the "closing" operation fills small black holes inside the foreground. With a custom structuring element (kernel), the foreground area can be cleaned up precisely.
6. Classic practice: face detection
For devices with limited computing power (such as a Raspberry Pi or an access control camera), the Haar cascade classifier remains one of the lightest options. OpenCV ships with pretrained cascade models, so real-time detection takes only a few dozen lines of code.
Although Haar cascades are less accurate than deep learning in complex scenes, their low latency and independence from a GPU keep them in active use in embedded applications.
7. Modern practice: YOLO object detection with DNN
OpenCV's `dnn` module takes a different approach from training frameworks: it does not train models, it only runs inference. You can therefore import models in Caffe, TensorFlow, ONNX and other formats directly, which avoids most of the usual environment setup. Here we take the lightweight YOLOv3-tiny as an example to implement object detection on an image.
7.1 Preparation
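Please download the three model files in advance (for a Darknet model such as YOLOv3-tiny, typically the network configuration, the pretrained weights and the class-name list) and place them in the same directory as the script.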
7.2 Implementation code
The whole process is divided into: model loading → preprocessing (blob) → forward calculation → non-maximum suppression (NMS) → drawing results.
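**Why choose YOLOv3-tiny?** Because it trades some accuracy for speed and can often run in near real time even on a laptop without a GPU. If you need higher accuracy, you can swap in the configuration and weight files of a larger Darknet model such as YOLOv4, and the inference flow stays largely the same. Other model families (for example YOLOv7 exported to ONNX) may need a different loader and different output parsing.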
8. Learning summary and suggestions
8.1 Key takeaways
Through the above practical exercises, you should have mastered:
- Pixel Operation: With NumPy slicing as the core, efficient processing of ROI and channels.
- Classic CV: Adaptive binarization, Canny and morphological operations, solving most preprocessing needs.
- Modern CV: The `dnn` module makes model deployment simple, without the need to learn an additional inference framework.
8.2 Learning Suggestions
- Practice the basics first, then touch DNN: Traditional methods are the cornerstone of understanding visual problems, don’t skip them.
- More hands-on projects: Small projects such as answer-sheet recognition, fruit sorting and parking space detection build proficiency quickly.
- Make good use of official resources: The official OpenCV documentation contains a large number of tutorials and API references. If you run into problems, check the documentation first.

