Practical project: industrial defect detection
Introduction
Have you ever wondered how products such as phone screens, car parts, and pills are inspected one by one for defects before they leave the factory? By eye? Anyone who looks at tens of thousands of parts a day will end up bleary-eyed, and some defects are thinner than a hair.
Industrial defect detection uses machines in place of human eyes to "find faults" 24 hours a day. What lies behind it is not magic but computer vision and deep learning. This article is a "fault-finding guide" for developers, focused on one particularly common problem: normal samples are plentiful, but defective samples are very scarce. In the industry this scenario is handled as anomaly detection.
We will start with traditional methods and then move on to deep learning solutions such as convolutional autoencoders, along with runnable PyTorch code and deployment ideas. Whether you are just getting started or planning to run a model on the production line, I hope this article helps.
📂 Stage: Stage 2 - Deep Learning Vision Basics (CNN) 🔗 Related chapters: Practical Project 1: Intelligent Face Attendance System · Practical Project 3: Autonomous Driving Perception
1. What is industrial defect detection?
Simply put, a camera captures an image of the product, and an algorithm automatically decides whether it passes. Compared with manual inspection, a machine does not tire, applies a uniform standard, and leaves a complete data record.
1.1 Why does the factory need it?
- More stable quality: Human inspectors vary with fatigue, mood, and experience; machines do not. Once the algorithm is fixed, every product is judged by the same standard.
- Save money and time: A one-time development cost up front can save a great deal of inspection labor later. More importantly, defective products are intercepted early, which avoids the larger losses of rework or recalls.
- Protect safety and brand: A single defective screw can wreck a piece of equipment, and one batch of products with poor appearance can damage a brand built up over many years.
1.2 What do common defects look like?
Defects on a production line come in many forms, which fall roughly into the following categories:
- Surface defects: Scratches, dents, stains, cracks, and uneven color, for example fine scratches on phone glass.
- Structural/dimensional defects: Out-of-tolerance dimensions, deformation, missing material, internal bubbles, and material delamination.
- Assembly defects: Parts installed in the wrong position, missing screws, and weak (cold) solder joints.
The most troublesome issues during detection
- Lighting and angle keep changing: A workshop is not as stable as a laboratory. If the brightness drifts over the day or the product is placed at a slightly different angle, the image can vary greatly.
- The defect is too small and too similar to the background: For example, a small crack on wood-grain flooring is hard for the naked eye to spot and even harder for a machine.
- Too few defective samples: On a stable production line, 99.9% of products may be normal, and only a handful of defective samples can be collected in a month. That makes it hard to train a traditional classification model, which learns from a large number of defect samples.
- Speed and accuracy are both required: On a high-speed line several products pass per second, leaving only a few milliseconds to process each image, and the false alarm rate must stay low.
These difficulties mean that the ordinary "cat vs. dog classification" approach does not work here; we need anomaly detection instead.
2. Core technology: How to use anomaly detection?
The core logic is not complicated: let the model learn only "what a normal product looks like", and classify anything that does not look right as an anomaly. It is like having only ever seen intact apples: when one with a wormhole suddenly appears, we immediately recognize it as abnormal.
Depending on the amount of data on hand and the complexity of the product images, there are usually two approaches.
2.1 Traditional method: when data is limited and textures are simple
If your product's texture is very regular (such as solid-color sheet metal or cloth with a simple pattern) and you have only a few hundred to a few thousand normal samples, traditional machine learning is less work and does not even require a GPU.
How to do it?
The process has three steps:
- Feature extraction: Convert each image into a vector of numbers that describes its "normal appearance", such as texture uniformity, color distribution, and edge gradients.
- Dimensionality reduction and standardization: Compress the feature dimensions, remove redundant information, and bring all features to the same scale.
- Train the anomaly detection model: Use an algorithm such as Isolation Forest or One-Class SVM to draw a "normal region" around the normal samples in feature space. A new sample that falls outside that region is an anomaly.
Here is a ready-to-use Python implementation, using scikit-image and scikit-learn:
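The sketch below walks through the same three steps in a minimal form. It is an illustration under assumptions, not a tuned implementation: the features are computed with plain NumPy to keep it short (scikit-image descriptors such as local binary patterns would slot into `extract_features`), the scikit-learn stages are `StandardScaler`, `PCA`, and `IsolationForest`, and `synthetic_part` is a hypothetical stand-in for real camera images. The parameter values are placeholders to adjust on your own data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def extract_features(gray):
    """Step 1: turn one grayscale image (2-D uint8 array) into a feature vector."""
    img = np.asarray(gray, dtype=np.float64) / 255.0
    gy, gx = np.gradient(img)                  # intensity change along rows / columns
    grad = np.hypot(gx, gy)                    # edge-gradient magnitude per pixel
    hist, _ = np.histogram(img, bins=16, range=(0.0, 1.0), density=True)
    stats = [img.mean(), img.std(), grad.mean(), grad.max()]
    return np.concatenate([stats, hist])       # 4 statistics + 16 histogram bins


def fit_detector(normal_images, n_components=5, contamination=0.05, seed=0):
    """Steps 2 and 3: standardize, reduce with PCA, fit an Isolation Forest."""
    X = np.stack([extract_features(im) for im in normal_images])
    detector = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=seed),
        IsolationForest(n_estimators=200, contamination=contamination,
                        random_state=seed),
    )
    return detector.fit(X)


def is_defective(detector, gray):
    """Return True when the image falls outside the learned normal region."""
    x = extract_features(gray).reshape(1, -1)
    return bool(detector.predict(x)[0] == -1)  # predict: -1 outlier, +1 inlier


def synthetic_part(rng, scratch=False, size=64):
    """Hypothetical stand-in for a camera image: flat gray surface plus noise."""
    img = rng.normal(128.0, 5.0, (size, size))
    if scratch:
        img[size // 2 - 1: size // 2 + 2, :] = 255.0   # bright 3-pixel-wide scratch
    return np.clip(img, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
detector = fit_detector([synthetic_part(rng) for _ in range(300)])
print("clean part flagged:    ", is_defective(detector, synthetic_part(rng)))
print("scratched part flagged:", is_defective(detector, synthetic_part(rng, scratch=True)))
```

The `contamination` value sets how much of the training data the forest treats as borderline, so it directly trades false alarms against sensitivity. With `contamination=0.05`, roughly one clean part in twenty is expected to be flagged.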
💡 When to use it: When you only have a CPU, the data volume is within a few thousand images, and the product texture is not complex, this traditional solution is extremely cost-effective. You can even ship it with a packaging tool, without bundling any deep learning framework dependencies.
2.2 Deep learning method: when data is plentiful and textures are complex
If the product surface itself has complex patterns (such as woven fabric or printed packaging), hand-crafted features struggle to cover every "normal" variation. This is where the convolutional autoencoder (CAE) comes in.
Why is it useful?
The autoencoder is like a "memory master" and consists of two parts:
- Encoder: Compresses the input image step by step into a condensed feature vector (that is, it remembers only the key information).
- Decoder: "Restores" the image from this condensed feature.
If you train it only on normal products, the decoder learns only "how to reconstruct what normal products look like." When a defective sample is fed in, the decoder still tries to restore it to a normal appearance, so the reconstructed image differs noticeably from the original in the defective region. We only need to measure this difference (the reconstruction error) to decide whether there is a defect.
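To make the reconstruction error concrete, here is a small NumPy sketch. It assumes the reconstruction is already available as an array. The function names, the squared-error measure, and the per-pixel threshold are illustrative choices, and the "reconstruction" is simply a clean copy of the surface standing in for what a model trained on normal parts might return.

```python
import numpy as np


def error_map(original, reconstruction):
    """Per-pixel squared difference between an image and its reconstruction."""
    diff = (np.asarray(original, dtype=np.float64)
            - np.asarray(reconstruction, dtype=np.float64))
    return diff ** 2


def anomaly_score(original, reconstruction):
    """Image-level reconstruction error: the mean of the per-pixel error map."""
    return float(error_map(original, reconstruction).mean())


def defect_bbox(original, reconstruction, pixel_threshold):
    """Bounding box (row_min, row_max, col_min, col_max) of high-error pixels, or None."""
    rows, cols = np.nonzero(error_map(original, reconstruction) > pixel_threshold)
    if rows.size == 0:
        return None
    return int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max())


# Stand-in data: pixel values in [0, 1], with a small bright blemish added.
clean = np.full((32, 32), 0.5)
defective = clean.copy()
defective[10:14, 20:26] = 1.0

print(anomaly_score(clean, clean))                          # → 0.0
print(anomaly_score(defective, clean))                      # → 0.005859375
print(defect_bbox(defective, clean, pixel_threshold=0.1))   # → (10, 13, 20, 25)
```

The error map also tells you where the defect is, which is useful for showing operators the suspicious region rather than only a pass/fail flag.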
Implemented from scratch using PyTorch
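What follows is a minimal sketch of a convolutional autoencoder, assuming single-channel 64×64 inputs with pixel values in [0, 1]. The layer widths, learning rate, epoch count, and the synthetic "normal" images are illustrative assumptions, not recommended settings.

```python
import torch
from torch import nn


class ConvAutoencoder(nn.Module):
    """Small CAE for 1-channel images whose side length is divisible by 8."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                      # 1x64x64 -> 16x8x8
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(                      # 16x8x8 -> 1x64x64
            nn.ConvTranspose2d(16, 32, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


def train(model, normal_batches, epochs=3, lr=1e-3):
    """Fit the autoencoder on normal images only."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    model.train()
    for _ in range(epochs):
        for batch in normal_batches:
            optimizer.zero_grad()
            loss = loss_fn(model(batch), batch)   # the target is the input itself
            loss.backward()
            optimizer.step()
    return model


@torch.no_grad()
def reconstruction_error(model, images):
    """Mean squared reconstruction error per image, shape (N,)."""
    model.eval()
    return ((model(images) - images) ** 2).mean(dim=(1, 2, 3))


# Synthetic stand-in data: flat gray surfaces with slight noise.
torch.manual_seed(0)
normal = (0.5 + 0.02 * torch.randn(160, 1, 64, 64)).clamp(0, 1)
model = train(ConvAutoencoder(), list(normal.split(16)))

clean = (0.5 + 0.02 * torch.randn(4, 1, 64, 64)).clamp(0, 1)
defective = clean.clone()
defective[:, :, 24:40, 24:40] = 1.0               # paste a bright square "defect"
print("clean errors:    ", reconstruction_error(model, clean))
print("defective errors:", reconstruction_error(model, defective))
```

On real data you would feed batches from a `DataLoader` over normal images and train for far longer. The point here is only the shape of the model and the fact that the training target is the input itself.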
📌 Training tip: Train only on normal images, but hold out a small portion of the normal samples to set the threshold (for example, the 95th percentile of their reconstruction errors) so that the false alarm rate stays controllable.
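The threshold step can be sketched as follows, assuming you already have one reconstruction error per held-out normal image. The gamma-distributed scores below are made-up placeholder values, and the helper names are hypothetical.

```python
import numpy as np


def pick_threshold(normal_scores, percentile=95.0):
    """Threshold below which the given percentage of held-out normal scores fall."""
    return float(np.percentile(normal_scores, percentile))


def false_alarm_rate(normal_scores, threshold):
    """Fraction of normal samples that would be flagged as defective."""
    return float(np.mean(np.asarray(normal_scores) > threshold))


rng = np.random.default_rng(0)
held_out_errors = rng.gamma(shape=2.0, scale=0.001, size=1000)  # placeholder scores

threshold = pick_threshold(held_out_errors, percentile=95.0)
rate = false_alarm_rate(held_out_errors, threshold)
print(f"threshold={threshold:.5f}, false alarms on held-out normals={rate:.3f}")
```

Raising the percentile lowers the false alarm rate but makes subtle defects easier to miss, so a handful of real defective samples is valuable for checking where the cut-off should sit.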
3. From experiment to production line: deployment and optimization
Getting the code to run is only the first step. To reach the production line, speed, stability, and maintainability must also be considered.
3.1 Practical tips to speed up deployment
- Quantize the model: PyTorch's torch.quantization can compress the model from 32-bit floating point numbers to 8-bit integers, shrinking it by about 4 times and often speeding up inference by 2 to 3 times, which makes it well suited to edge devices.
- Reduce the input size: If 224×224 is more detail than you need to distinguish defects, try 128×128 or even 96×96; the speedup is immediately noticeable.
- Drop the PyTorch runtime: Use torch.onnx.export to export an ONNX model and load it for inference with OpenCV DNN or ONNX Runtime. With OpenCV DNN, the deployment package needs only OpenCV, which keeps it clean and tidy.
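A rough back-of-the-envelope calculation shows where these gains come from. It assumes that convolution cost scales with the number of input pixels and that weights dominate the model file size; both are simplifications that ignore fixed overheads, and the one-million-parameter model is a hypothetical example.

```python
def relative_conv_cost(side, base_side=224):
    """Approximate compute relative to the base resolution (cost ~ pixel count)."""
    return (side / base_side) ** 2


def model_size_mb(num_params, bytes_per_param):
    """Approximate weight storage in megabytes."""
    return num_params * bytes_per_param / 1e6


print(f"{relative_conv_cost(128):.2f}")   # → 0.33, about a third of the 224x224 cost
print(f"{relative_conv_cost(96):.2f}")    # → 0.18
print(model_size_mb(1_000_000, 4))        # → 4.0 (float32 weights)
print(model_size_mb(1_000_000, 1))        # → 1.0 (int8 weights)
```

Measured speedups on a real device will differ, so treat these ratios as an upper-level estimate and benchmark on the target hardware.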
3.2 A simple industrial deployment skeleton
The following class demonstrates how to use the trained detector in actual scenarios: single frame detection, database recording, and real-time video stream processing are all available.
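A minimal skeleton along these lines might look like the sketch below. Everything in it is an assumption made for illustration: the class name `DefectInspector`, the SQLite table layout, and the `score_fn` callable that stands in for a trained detector. The stream method accepts any iterable of frames, so in practice it could be fed from a camera reader such as OpenCV's `cv2.VideoCapture`.

```python
import sqlite3
import time

import numpy as np


class DefectInspector:
    """Wraps a scoring function with thresholding, logging, and stream handling."""

    def __init__(self, score_fn, threshold, db_path=":memory:"):
        self.score_fn = score_fn          # callable: image array -> anomaly score
        self.threshold = threshold
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS inspections ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, ts REAL, "
            "score REAL, is_defect INTEGER, latency_ms REAL)"
        )

    def inspect(self, frame):
        """Single-frame detection: score, decide, and record the result."""
        start = time.perf_counter()
        score = float(self.score_fn(frame))
        is_defect = score > self.threshold
        latency_ms = (time.perf_counter() - start) * 1000.0
        self.db.execute(
            "INSERT INTO inspections (ts, score, is_defect, latency_ms) "
            "VALUES (?, ?, ?, ?)",
            (time.time(), score, int(is_defect), latency_ms),
        )
        self.db.commit()
        return {"score": score, "is_defect": is_defect, "latency_ms": latency_ms}

    def run_stream(self, frames):
        """Process any iterable of frames, yielding one result per frame."""
        for frame in frames:
            yield self.inspect(frame)

    def defect_count(self):
        """Number of recorded inspections that were flagged as defective."""
        row = self.db.execute(
            "SELECT COUNT(*) FROM inspections WHERE is_defect = 1"
        ).fetchone()
        return row[0]


def toy_score(frame):
    """Placeholder detector: mean squared deviation from a flat gray surface."""
    return float(np.mean((frame - 0.5) ** 2))


good = np.full((32, 32), 0.5)
bad = good.copy()
bad[8:16, 8:16] = 1.0                     # simulated defect

inspector = DefectInspector(toy_score, threshold=0.001)
results = list(inspector.run_stream([good, good, bad, good]))
print([r["is_defect"] for r in results])  # → [False, False, True, False]
print(inspector.defect_count())           # → 1
```

Recording the latency alongside each result makes it easy to check later whether the line's per-image time budget is being met.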
4. Learning route and summary
Study suggestions
- Go through the traditional solution first: Don't rush into deep learning. Use the Isolation Forest code to see whether simple features can already separate your product images. It helps you quickly grasp the essence of anomaly detection.
- CAE is the usual way into deep learning here: When traditional methods fall short, a convolutional autoencoder is usually the most stable and easiest starting point among deep learning solutions.
- Data quality matters more than a fancy model: Collect normal samples under as many lighting conditions, batches, and angles as possible. A small number of defective samples is used only to validate the threshold and does not need to take part in training.
- Treat robustness as the first metric: Before deployment, test repeatedly with extreme images such as strong light, low light, and partial occlusion. The production environment will not go easy on you.
Summary
The core of industrial defect detection comes down to one sentence: solve the most practical production problems with the fewest defect samples. The traditional methods and convolutional autoencoders introduced in this article already cover most common scenarios, and their implementation and deployment costs are relatively manageable. After mastering these basics, you can get started quickly with confidence and move on later to more advanced methods such as VAE, GAN, and PatchCore.

