Python image processing in practice

Starting with memes: a practical introduction to Python image processing

Imagine this: with one click you want to shrink a gorgeous photo of your cat into a thumbnail for your Moments feed, add a cute watermark automatically, trace a playful outline around the cat's face, or even generate a one-of-a-kind static holiday card. These jobs sound like designer work, but Python handles them easily. The library that turns you into an "image magician" in just a few lines of code is Pillow.

In Daoman Python AI's real projects, Pillow has always been treated as the "Swiss Army knife" entry-level library for data preprocessing and visual toolchains. Get thoroughly comfortable with it first, and moving on to heavier frameworks such as OpenCV and TensorFlow Lite becomes much smoother.


1. Two-minute primer: the "smallest parts" of an image

Before writing any code, quickly review two core concepts. The principles are simple, but anyone who has been bitten by a coordinate bug knows how much they matter.

1.1 Color model (RGB/RGBA)

A computer screen mixes red, green, and blue light to display every color. This is the RGB color model.

  • The intensity of each color channel is an integer from 0 to 255 (256 levels). The larger the value, the brighter that channel.
  • RGBA adds an alpha channel on top of RGB. It also ranges from 0 to 255 and represents opacity (0 is fully transparent, 255 is fully opaque). This channel is crucial when compositing watermarks and stickers.

Quick reference card

Common name      RGB value          Common name      RGB value
pure white       (255, 255, 255)    pure red         (255, 0, 0)
pure green       (0, 255, 0)        pure blue        (0, 0, 255)
medium gray      (128, 128, 128)    bright yellow    (255, 255, 0)
pure black       (0, 0, 0)          dark purple      (128, 0, 128)
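
As a quick sanity check on the values above, the sketch below looks up a few named colors with Pillow's ImageColor module and builds a single RGBA pixel. The names passed to getrgb() are CSS color names (CSS "green" is the darker (0, 128, 0), so "lime" stands in for pure green), and the variable names are only illustrative.

```python
from PIL import Image, ImageColor

# Look up RGB tuples by CSS color name.
names = ("white", "red", "lime", "blue", "black")
named = {name: ImageColor.getrgb(name) for name in names}
for name, rgb in named.items():
    print(f"{name:>5}: {rgb}")

# A 1x1 RGBA image: red at roughly 50% opacity (alpha 128 out of 255).
pixel = Image.new("RGBA", (1, 1), (255, 0, 0, 128))
r, g, b, a = pixel.getpixel((0, 0))
print(f"alpha = {a} ({a / 255:.0%} opaque)")
```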

1.2 Pixels

Keep zooming into a photo or screenshot and you eventually see a dense grid of small colored squares. These are pixels, the smallest editable unit of an image. A 1920×1080 picture contains just over 2 million of them (1920 × 1080 = 2,073,600). Put simply, every Pillow operation is a way of processing these little squares in bulk.
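
To make the idea of "small squares" concrete, here is a minimal sketch that counts the pixels of a Full HD canvas and touches a few of them directly. It builds a blank image instead of loading a file so that it runs anywhere; the coordinates and colors are arbitrary choices for illustration.

```python
from PIL import Image

# A blank Full HD canvas stands in for a real photo.
img = Image.new("RGB", (1920, 1080), (0, 0, 0))
width, height = img.size
total_pixels = width * height
print(f"{width} x {height} = {total_pixels:,} pixels")

# Read and write a single pixel; coordinates are (x, y) from the top-left corner.
img.putpixel((10, 20), (255, 0, 0))
print(img.getpixel((10, 20)))

# load() returns a pixel-access object, which is handier inside loops.
pixels = img.load()
for x in range(100):
    pixels[x, 0] = (0, 255, 0)  # paint the first 100 pixels of the top row green
```

Pixel-by-pixel loops like this are fine for learning, but for whole-image work the built-in methods shown later (crop, resize, filter) are usually the better tool.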


2. Get started quickly with Pillow: from installation to "fancy photo editing"

Pillow is the actively maintained fork of PIL (Python Imaging Library). It is broadly compatible with PIL, has a particularly intuitive API, and is currently the go-to choice for image processing in Python 3.

Installation in one step

pip install pillow
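
After installing, it can help to confirm which Pillow version Python actually picks up. The sketch below prints the version and, as an optional extra, asks the PIL.features module whether FreeType and JPEG support are available; which features matter depends on your project.

```python
import PIL
from PIL import features

print("Pillow version:", PIL.__version__)
# Optional: FreeType is needed for TrueType fonts, and "jpg" for reading/writing JPEG files.
print("FreeType available:", features.check("freetype2"))
print("JPEG available:", features.check("jpg"))
```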

2.1 Five basic moves: open → inspect → crop → resize → rotate/flip

The core module is PIL.Image. The workflow below mirrors what you would do in Photoshop or a mobile photo-editing app. It is best to follow along with a picture of your own (for example, one named cat_hero.jpg).

from PIL import Image

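# 1. Open the image (local path; for an image on the web, fetch the bytes with requests first)
try:
    img = Image.open("cat_hero.jpg")
    img.show()  # opens the system's default image viewer for a quick look
except FileNotFoundError:
    # Stop here: every later step needs img
    raise SystemExit("❌ Image not found, please check the path!")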

# 2. Read basic attributes
print(f"✅ Format: {img.format}")   # JPEG, PNG, BMP, etc.
print(f"✅ Size (width, height): {img.size}")   # tuple (w, h)
print(f"✅ Color mode: {img.mode}")   # RGB, RGBA, L (grayscale), etc.

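# 3. Crop (remember: Pillow's origin is the top-left corner; x grows rightward, y grows downward)
# crop() takes a box of (left, upper, right, lower)
# Note: the right and lower edges are exclusive, so the ranges are [left, right) and [upper, lower)
crop_area = (100, 50, 500, 400)
cat_face = img.crop(crop_area)
cat_face.show()
cat_face.save("cat_face_only.png")   # saved as PNG; the mode is unchanged, so a JPEG source still has no alpha channel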

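# 4. Resize
## Option 1: thumbnail() modifies the image in place and always keeps the aspect ratio; ideal for thumbnails
img_copy = img.copy()  # thumbnail() changes the object it is called on, so work on a copy
img_copy.thumbnail((200, 200))  # the tuple is the maximum allowed size
img_copy.save("cat_thumb.jpg")

## Option 2: resize() returns a new image stretched or squeezed to an exact size
cat_face_resized = cat_face.resize((100, 100))
cat_face_resized.show()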

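# 5. Rotate and flip (both return a new image and leave the original untouched)
## Rotate 45° counterclockwise; expand=True enlarges the canvas so the corners are not cut off
img_rotated = img.rotate(45, expand=True)
img_rotated.show()

## Horizontal / vertical mirroring
img_flip_h = img.transpose(Image.FLIP_LEFT_RIGHT)   # horizontal flip
img_flip_v = img.transpose(Image.FLIP_TOP_BOTTOM)   # vertical flip
img_flip_h.save("cat_mirror.jpg")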
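
If you do not have a photo at hand, the size rules behind the five moves can still be checked on a synthetic image. The sketch below uses an 800×600 blank canvas as a stand-in for cat_hero.jpg (an assumption made so that it runs without any file) and prints the size produced by each operation.

```python
from PIL import Image

img = Image.new("RGB", (800, 600), (200, 220, 255))  # stand-in for cat_hero.jpg

# crop(): the box is (left, upper, right, lower); right and lower are exclusive.
face = img.crop((100, 50, 500, 400))
print(face.size)  # (400, 350), i.e. (500 - 100, 400 - 50)

# thumbnail(): works in place and keeps the aspect ratio.
thumb = img.copy()
thumb.thumbnail((200, 200))
print(thumb.size)  # (200, 150) for this 4:3 source

# resize(): returns a new image at exactly the requested size.
square = face.resize((100, 100))
print(square.size)  # (100, 100)

# rotate() keeps the canvas size unless expand=True.
rotated_same = img.rotate(90)
rotated_expanded = img.rotate(90, expand=True)
print(rotated_same.size, rotated_expanded.size)  # (800, 600) (600, 800)

# transpose() mirrors the image; a flip never changes the size.
mirrored = img.transpose(Image.FLIP_LEFT_RIGHT)
print(mirrored.size)  # (800, 600)
```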

2.2 Going further: filters + sticker compositing

Pillow ships with plenty of ready-to-use filters and a flexible paste function, so putting together a "cat meme" is easy.

from PIL import ImageFilter

# Try these on cat_face_only.png or cat_hero.jpg from earlier
img_blur = img.filter(ImageFilter.GaussianBlur(radius=5))   # Gaussian blur; a larger radius gives a hazier result
img_contour = img.filter(ImageFilter.CONTOUR)   # contour extraction, a meme-making staple
img_sharpen = img.filter(ImageFilter.SHARPEN)   # sharpen
img_contour.show()
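
To see what the filters do without relying on a photo, the following sketch applies two of them to a synthetic image: a white square on a black background. The test image and the sampled coordinates are assumptions chosen so that the effect is easy to read from pixel values.

```python
from PIL import Image, ImageDraw, ImageFilter

# Synthetic grayscale image: a white square on black.
img = Image.new("L", (100, 100), 0)
ImageDraw.Draw(img).rectangle((30, 30, 69, 69), fill=255)

blurred = img.filter(ImageFilter.GaussianBlur(radius=5))
contour = img.filter(ImageFilter.CONTOUR)

# filter() returns a new image of the same size; the original is untouched.
print(blurred.size, img.getpixel((30, 50)))

# On the square's edge, the blur mixes black and white into a gray value.
print(blurred.getpixel((30, 50)))

# CONTOUR turns flat areas white and leaves darker lines where the brightness changes.
print(contour.getpixel((50, 50)))
```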

Sticker compositing: give the cat a heart

# Prepare a heart sticker with an alpha channel (if the file is missing, the code draws a translucent red circle as a stand-in)
try:
    heart_sticker = Image.open("heart.png").convert("RGBA")   # force RGBA
except FileNotFoundError:
    from PIL import ImageDraw
    # Create a temporary 100x100 transparent canvas and draw a translucent red circle on it
    heart_sticker = Image.new("RGBA", (100, 100), (0, 0, 0, 0))  # fully transparent background
    heart_draw = ImageDraw.Draw(heart_sticker)
    heart_draw.ellipse((0, 0, 100, 100), fill=(255, 0, 0, 200))  # translucent red

# Shrink the heart a little
heart_sticker_resized = heart_sticker.resize((80, 80))
w, h = heart_sticker_resized.size

# Prepare the canvas: convert() returns an RGBA copy of the original that can hold transparent elements
canvas = img.convert("RGBA")
# Use the sticker's own alpha channel as the mask so that translucency is preserved
canvas.paste(heart_sticker_resized,
             (img.size[0] - w - 20, img.size[1] - h - 20),
             mask=heart_sticker_resized)

# To save as JPG, convert back to RGB first (JPG has no alpha channel)
canvas.convert("RGB").save("cat_with_heart.jpg")
canvas.show()

Tip: the third parameter of paste(), mask, is a very powerful design. It lets you paste only the sticker itself and not its background, which is especially kind to translucent edges.
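
The difference the mask makes is easiest to see side by side. This sketch pastes the same sticker twice, once without and once with a mask, onto a solid blue background; the images, positions, and variable names are made up for the comparison.

```python
from PIL import Image, ImageDraw

# Background: solid blue. Sticker: opaque red disc on a fully transparent canvas.
background = Image.new("RGBA", (200, 200), (0, 0, 255, 255))
sticker = Image.new("RGBA", (80, 80), (0, 0, 0, 0))
ImageDraw.Draw(sticker).ellipse((0, 0, 79, 79), fill=(255, 0, 0, 255))

# Without a mask, the whole 80x80 rectangle (transparent corners included)
# overwrites the background.
no_mask = background.copy()
no_mask.paste(sticker, (60, 60))

# With the sticker as its own mask, only pixels with non-zero alpha are blended in.
with_mask = background.copy()
with_mask.paste(sticker, (60, 60), mask=sticker)

corner = (60, 60)    # lands on a transparent corner of the sticker
center = (100, 100)  # lands inside the red disc
print("no mask, corner:  ", no_mask.getpixel(corner))
print("with mask, corner:", with_mask.getpixel(corner))
print("with mask, center:", with_mask.getpixel(center))
```

For two RGBA images of the same size, Image.alpha_composite() is another option worth knowing; paste() with a mask is the more convenient choice when the sticker is smaller than the canvas.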


3. Static drawing: "draw" a greeting card with code

Pillow can do more than retouch images; it can also create them from scratch on a blank canvas. The PIL.ImageDraw module lets you draw geometric shapes and write text in code, which makes it easy to produce CAPTCHAs and holiday cards or to batch-generate dated posters.

from PIL import Image, ImageDraw, ImageFont

# 1. Create the canvas (mode, (width, height), background color)
canvas_width, canvas_height = 600, 400
canvas = Image.new("RGB", (canvas_width, canvas_height), (240, 248, 255))  # light blue background
drawer = ImageDraw.Draw(canvas)

# 2. Draw decorations (colored vertical stripes + translucent balloons)
## Colored vertical stripes
for x in range(0, canvas_width, 20):
    color = (x % 256, (x + 50) % 256, (x + 100) % 256)
    drawer.line((x, 0, x, canvas_height), fill=color, width=10)

## Translucent balloons (draw them on a transparent layer, then paste the layer onto the canvas)
balloon_layer = Image.new("RGBA", canvas.size, (0, 0, 0, 0))
balloon_draw = ImageDraw.Draw(balloon_layer)
balloon_positions = [(100, 150), (300, 100), (500, 200)]
for (x, y) in balloon_positions:
    balloon_draw.ellipse((x-30, y-40, x+30, y+20), fill=(255, 192, 203, 180))  # translucent pink
    balloon_draw.line((x, y+20, x, y+100), fill=(128, 128, 128), width=2)  # balloon string

canvas.paste(balloon_layer, mask=balloon_layer)

# 3. Write text (the key is choosing the right font; Chinese text needs a .ttf or .ttc font with Chinese glyphs)
# Example paths of common system fonts:
# Windows: C:/Windows/Fonts/simhei.ttf (SimHei), C:/Windows/Fonts/msyh.ttc (Microsoft YaHei)
# macOS: /System/Library/Fonts/STHeiti Light.ttc (STHeiti Light)
# Linux: /usr/share/fonts/truetype/dejavu/DejaVuSans.ttf
try:
    font_title = ImageFont.truetype("msyh.ttc", 60)
    font_subtitle = ImageFont.truetype("msyh.ttc", 30)
except OSError:  # IOError is an alias of OSError in Python 3
    print("⚠️ Chinese font not found; falling back to the default English font (Chinese characters will render as boxes)")
    font_title = ImageFont.load_default()
    font_subtitle = ImageFont.load_default()

# Center the title and subtitle (textbbox measures the exact rectangle the text occupies)
text_title = "Happy Birthday!"
bbox_title = drawer.textbbox((0, 0), text_title, font=font_title)
text_width_title = bbox_title[2] - bbox_title[0]
text_height_title = bbox_title[3] - bbox_title[1]
x_title = (canvas_width - text_width_title) // 2
y_title = 50
drawer.text((x_title, y_title), text_title, fill=(255, 20, 147), font=font_title)

text_subtitle = "To My Lovely Cat 🐱"  # the emoji renders only if the font has that glyph
bbox_subtitle = drawer.textbbox((0, 0), text_subtitle, font=font_subtitle)
text_width_subtitle = bbox_subtitle[2] - bbox_subtitle[0]
x_subtitle = (canvas_width - text_width_subtitle) // 2
y_subtitle = y_title + text_height_title + 30
drawer.text((x_subtitle, y_subtitle), text_subtitle, fill=(0, 0, 139), font=font_subtitle)

# 4. Save and preview
canvas.save("cat_birthday_card.jpg")
canvas.show()

NOTE On Pillow 9.2.0 and later, use textbbox() to measure the width and height of the text area dynamically. The older textsize() method was deprecated in 9.2.0 and removed in Pillow 10, so replace any remaining calls.


Summary and next steps

Key practical points at a glance

  1. Coordinate system: the origin is in the upper left corner, x increases to the right, and y increases downward.
  2. Core modules: Image (read/write/basic operations), ImageFilter (filters), ImageDraw (drawing), ImageFont (text rendering).
  3. Transparent layers: to keep transparency, remember to convert the image to RGBA, and pass the mask parameter when pasting to control the visible area.

What can you learn next?

  • Low-barrier next step: use Pillow to batch-process photos (for example, converting a whole folder to grayscale or adding watermarks in bulk).
  • Hard-core computer vision: if you want to do face recognition, object detection, or real-time video processing, move on to OpenCV-Python.
  • Deep learning preprocessing: Pillow is a commonly used image preprocessing tool in the official examples of TensorFlow/Keras and PyTorch. Mastering it makes the first steps into deep learning image tasks much smoother.
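
As a starting point for the batch-processing idea, here is a minimal sketch that converts every matching image in a folder to grayscale. The function name batch_grayscale, the folder names, and the "*.jpg" pattern are all assumptions for illustration; adapt them to your own layout.

```python
from pathlib import Path
from PIL import Image

def batch_grayscale(src_dir, dst_dir, pattern="*.jpg"):
    """Convert every image matching `pattern` in src_dir to grayscale and save it into dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.glob(pattern)):
        with Image.open(path) as img:
            gray = img.convert("L")  # "L" is the 8-bit grayscale mode
            out_path = dst / path.name
            gray.save(out_path)
            written.append(out_path)
    return written
```

A call such as batch_grayscale("photos", "photos_gray") would then process a hypothetical photos folder and return the list of files it wrote. The same loop structure works for batch watermarking: replace the convert("L") line with the paste() steps from section 2.2.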

Now, you might as well open the terminal, pick your favorite photo, and use Pillow to add a bit of programmer romance to it!