OpenCV Quick Start: A Complete Guide to Image Reading, Drawing and Geometric Transformation
Introduction
OpenCV (Open Source Computer Vision Library) is the de facto standard open source library in computer vision. It covers the full pipeline from low-level pixel operations to high-level vision algorithms, can be called across platforms and languages (C++, Python, Java, etc.), and runs fast because its core is optimized C/C++.
This article focuses on the first cornerstone of traditional CV: basic I/O, drawing, and geometric transformations in OpenCV. All code is accompanied by clear comments and usage scenarios, so after reading you can start writing practical scripts right away.
📂 Phase: Phase 1: Cornerstone of Image Processing (Traditional CV) 🔗 Related Chapters: CV Overview and Digital Image Basics · Image Enhancement and Filtering
1. Environment setup: get started with OpenCV in 5 minutes
1.1 Library selection and installation
OpenCV has three mainstream Python packages; choose one according to your needs:
💡 Package selection tip: If you are not sure whether you will need the extra feature algorithms (such as SIFT or SURF) later, install opencv-contrib-python directly. It is a superset of opencv-python and contains the complete feature set.
1.2 One-step installation check
Copy this code and run it. If it runs without errors, the environment is set up correctly:
⚠️ FAQ: If you see the error ImportError: libGL.so.1: cannot open shared object file, your environment lacks the OpenGL libraries. On Linux, run sudo apt install libgl1-mesa-glx in a terminal, or install the opencv-python-headless package instead (headless mode).
2. Image I/O: the complete read → display → save workflow
💡 Core prerequisites
Before opening any image, you must first remember three "counter-intuitive" settings of OpenCV:
- Color images are in BGR order by default (the reverse of the RGB order used by Matplotlib and PIL)
- Image data is a NumPy uint8 array with pixel values in the range 0~255
- The image shape is ordered (height, width, channels), rather than the intuitive width × height
Once you understand these three points, you will avoid most pitfalls when handling color conversion, slicing, and drawing later.
2.1 Image reading: Handling loading failure scenarios
Many beginners get stuck here: the file path is wrong or the image does not exist, and the code carries on without checking. cv2.imread returns None on failure instead of raising an exception, so the error only appears later, at the first operation on the image. A robust loading function is essential:
🔍 Debugging tip: If your image path contains Chinese (or other non-ASCII) characters, cv2.imread may fail to read it directly. In that case, use np.fromfile together with cv2.imdecode to bypass the path-encoding problem.
2.2 Image display: choosing a tool for each scenario
Scenario 1: Local scripting/debugging (using OpenCV native window)
The native window supports keyboard and mouse events, suitable for quick viewing and interaction:
🖥️ Avoid a frozen window: cv2.waitKey() must be called right after imshow(), otherwise the window will become unresponsive. Remember to call cv2.destroyAllWindows() at the end to clean up resources.
Scenario 2: Jupyter Notebook/Lab (using Matplotlib)
The native pop-up window works poorly in a notebook, so Matplotlib is recommended instead. You must convert BGR to RGB first, otherwise the colors will be wrong:
🎨 Color order shorthand: OpenCV has the blue channel first (BGR), while Matplotlib expects the red channel first (RGB). Simple formula: "BGR is for OpenCV, RGB is for Matplotlib".
2.3 Image Saving: Control Compression Quality
When saving, set the compression parameters according to the file extension to balance image quality against file size:
💡 Compression parameter recommendations: A JPEG quality of 85-95 gives a good balance between visual quality and file size; a PNG compression level of 3-6 is usually the best choice.
3. Basic drawing: annotate images and draw ROI (region of interest)
OpenCV's drawing functions share a highly uniform parameter layout, so one common template covers most scenarios:
🔍 Key details:
- A line width (thickness) of -1 fills the interior of the shape
- The color order is (blue, green, red); for example, pure red is (0, 0, 255)
- All coordinates are (x, y), that is, (horizontal, vertical), the opposite of the (row, column) order used for tables and NumPy array indexing
3.1 Drawing common shapes
Below, common shapes are drawn on a white canvas with detailed comments:
✨ Practical Tips:
- When labeling a target, generally draw a rectangular frame first, and then add a text label above the frame. The text position can be calculated dynamically with cv2.getTextSize() so that the label does not run past the image border.
- When filling polygons, a translucent effect requires drawing on a separate overlay layer first and then blending it with the original image. A plain filled polygon, as used for demonstration in this section, is fully opaque.
4. Geometric transformation: image scaling, rotation, translation, affine
The core idea of geometric transformation is to use a transformation matrix to map image coordinates. OpenCV provides two main functions:
- cv2.warpAffine(src, M, dsize): applies a 2×3 affine transformation matrix (scaling, rotation, translation, shearing)
- cv2.warpPerspective(src, M, dsize): applies a 3×3 perspective transformation matrix (left to later chapters)
In this section we focus on the three operations used most often in day-to-day work.
4.1 Image scaling: practical functions that maintain aspect ratio
Calling cv2.resize directly is simple, but it can easily distort the image. The following wrapper function scales while maintaining the aspect ratio and automatically selects an appropriate interpolation algorithm:
📐 Scale Interpolation Cheat Sheet:
4.2 Image rotation: advanced version without cropping edges
When cv2.getRotationMatrix2D is used directly, the canvas size stays unchanged after rotation and the four corners of the image are cropped. The following version automatically calculates the new canvas size, retaining all pixels after rotation:
🔄 Rotation angle note: In OpenCV, counterclockwise is the positive direction. To rotate 30 degrees clockwise, pass in angle=-30.
4.3 Affine transformation: three point pairs determine the mapping
Affine transformations preserve the parallelism of straight lines (for example, a square becomes a parallelogram). We only need to specify 3 pairs of corresponding points on the original image and the target image, and OpenCV will calculate the transformation matrix automatically:
🎯 Practical Scenario:
- Image correction: A document shot at an angle can be straightened by specifying its four corner points (four point pairs define a perspective transformation) or three corner points (affine).
- Data augmentation: When training a deep learning model, three pairs of points are randomly generated for affine transformation to increase sample diversity.
5. Practical gadget: write an image batch resizer in 10 lines of code
By combining the load_image, resize, and save_image helpers introduced earlier, a lightweight batch processing script can be implemented:
🧩 Extension ideas: You can easily extend this framework into "batch watermarking", "batch filtering", or "batch format conversion" by replacing the intermediate image processing function.
Summary and next steps
This article takes you from scratch through the most basic and commonly used operations of OpenCV: environment installation, image reading and writing, drawing annotations, and geometric transformation. These APIs are the cornerstone of all subsequent advanced vision tasks (filtering, feature extraction, object detection, etc.). Be sure to deepen your understanding through hands-on coding exercises.
The next step is the Image Enhancement and Filtering chapter, which covers how to denoise, sharpen, and perform more complex pixel-level processing on images.
💬 Having a problem? Consult the official OpenCV documentation, or run help(cv2.<function_name>) in a Python session to get the parameter description.

