title: Detailed explanation of Airtest framework: game/Android App image + UI tree dual engine automation
date: 2024-06-01
tags: [Automated testing, Airtest, Poco]

Detailed explanation of Airtest framework: game/Android App image + UI tree dual engine automation

When you do automated testing, you will almost certainly run into a few painful scenarios: Unity/Cocos games that expose no control IDs, niche apps whose native elements cannot be captured, and hybrid apps whose interfaces are stitched together from several technologies. In these cases NetEase's open-source Airtest dual-engine framework is a lifesaver: it can rely on image recognition so that "screenshots are the script", and it can use Poco to parse the UI tree for precise control, so you do not have to read through hundreds of pages of UI documentation.

Starting from scratch, this article walks you through setting up the development environment, wrapping common operations in a base class, and hand-writing a Douyin short-video interaction script (liking, commenting, and swiping through videos). It closes with techniques for avoiding risk-control triggers and for making image recognition more reliable.


1. What is the Airtest dual engine?

The Airtest Project mainly consists of two parts:

  • Airtest: an automation framework based on image recognition. It locates elements by matching screenshots and does not rely on the system control tree, which makes it especially suitable for game engines (Unity, Cocos2d-x, Egret) and for scenarios where the UI tree cannot be obtained.
  • Poco: an automation framework based on UI controls. It reads control properties (text, position, hierarchy, and so on) directly from the UI tree of Android/iOS native apps or game engines, and performs tap, swipe, text input, and other operations. It is much more stable than image recognition.

The two can be mixed freely in the same script, so you can use whichever is more convenient at each step. For example, in a hybrid app like Douyin, the overall interface can be handled by image recognition while the list controls in the comment area are operated precisely through Poco.


2. Environment setup - run your first script in 5 minutes

2.1 Use AirtestIDE (suitable for getting started quickly)

Airtest provides a graphical IDE with built-in recording and playback, device connection, and report viewing:

  1. Go to the Airtest official website and download the IDE installation package for your system.
  2. After installation, open AirtestIDE; it ships with the two core libraries, airtest and pocoui, by default.
  3. Turn on Developer Mode and USB Debugging on your Android phone, connect it to the computer with a data cable, and click Connect in the IDE's device panel. You should then see the phone screen.

2.2 Use pip to build script projects (suitable for integrating CI/CD)

If you want to integrate Airtest into an existing automation project or CI pipeline, it is recommended to use pure Python scripting:

pip install airtest
pip install pocoui

Verify installation:

from airtest.core.api import *
from poco.drivers.android.uiautomation import AndroidUiautomationPoco

# Connect to the device (for USB connections, the serial number can be found with adb devices)
connect_device("Android:///")
poco = AndroidUiautomationPoco()
print(poco.device)

2.3 Necessary tool chain

  • ADB: the Android Debug Bridge, which Airtest relies on to control the device. Add it to your PATH and make sure that running adb devices in a terminal lists your device.
  • Phone settings: in addition to USB debugging, it is recommended to turn on "Pointer location" so coordinates are easy to read, and to turn off sources of interference such as auto-rotate and power saving mode.
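
The adb devices check above can also be scripted, for example as a pre-flight step in a CI job. The following is a minimal sketch, not part of Airtest: the helper names parse_adb_devices and list_ready_devices are hypothetical, and the parser assumes the plain two-column output of adb devices (serial, then state).

```python
import shutil
import subprocess


def parse_adb_devices(output):
    """Map each serial number to its state from plain `adb devices` output."""
    devices = {}
    for line in output.splitlines():
        parts = line.split()
        # Device rows have exactly two columns; the header and daemon notices do not.
        if len(parts) == 2:
            devices[parts[0]] = parts[1]
    return devices


def list_ready_devices():
    """Return serials in the `device` state, or [] if adb is not on PATH."""
    if shutil.which("adb") is None:
        return []
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, timeout=10
    )
    states = parse_adb_devices(result.stdout)
    return [serial for serial, state in states.items() if state == "device"]
```

A serial returned by list_ready_devices() can then be passed to Airtest as connect_device("Android:///" + serial). A state of unauthorized usually means the USB debugging prompt on the phone has not been accepted yet.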

3. Quick overview of basic operations - how to use the Airtest image engine

The core idea of image recognition: first capture a "template image" of the target element; at run time the script searches for that image in a screenshot of the device and performs the operation once it is found.
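
To make the "search for the template in the screenshot" idea concrete, here is a toy sketch in pure Python. It is not Airtest's actual algorithm (Airtest relies on OpenCV-based matching); it assumes grayscale images stored as 2D lists of 0-255 values, and match_template is a hypothetical name. It slides the template over the screen, scores each position by pixel similarity, and returns the best position only if its score reaches the threshold.

```python
def match_template(screen, template, threshold=0.7):
    """Return (center_x, center_y, score) of the best match, or None if below threshold."""
    screen_h, screen_w = len(screen), len(screen[0])
    tpl_h, tpl_w = len(template), len(template[0])
    best = None
    for y in range(screen_h - tpl_h + 1):
        for x in range(screen_w - tpl_w + 1):
            # Sum of absolute pixel differences between the template and this window.
            diff = sum(
                abs(screen[y + j][x + i] - template[j][i])
                for j in range(tpl_h)
                for i in range(tpl_w)
            )
            # 1.0 means identical pixels, 0.0 means maximally different.
            score = 1.0 - diff / (255.0 * tpl_h * tpl_w)
            if best is None or score > best[2]:
                best = (x + tpl_w // 2, y + tpl_h // 2, score)
    if best is not None and best[2] >= threshold:
        return best
    return None
```

The threshold plays the same role as the threshold argument of Template: a higher value rejects more false positives but also misses the target when the screen differs slightly from the template.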

3.1 Common API

from airtest.core.api import *

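# Connect to the device
connect_device("Android:///")

# Tap (capture the template image beforehand and keep it in the project directory)
touch(Template("like_btn.png"))

# Swipe
swipe((500, 1500), (500, 500))   # swipe from bottom to top

# Text input
text("很棒的视频")   # types "Great video"

# Key event
keyevent("HOME")

# Wait for an element to appear
wait(Template("comment_input.png"), timeout=10)

# Assert that an element exists
assert_exists(Template("success_tip.png"), "Like succeeded")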

3.2 Image template production skills

  • Capture a small area with distinctive features (for the like button, crop only the heart-shaped icon and leave out the background color block).
  • Save it in PNG format and keep it in the project's images/ directory for easy management.
  • If recognition is unstable, you can lower the confidence threshold below the default of 0.7, for example touch(Template("btn.png", threshold=0.6)).

4. Poco engine - capture UI tree for precise control

Poco does not need screenshots; it obtains the control tree directly, through reflection or the Android AccessibilityService.
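
As a rough mental model of what "locating a control in the UI tree" means, the sketch below searches a hand-written tree of dictionaries by attribute. The tree layout, the attribute names, and the find_nodes helper are illustrative assumptions, not Poco's internal data structures; Poco's real selectors are shown in 4.2.

```python
def find_nodes(node, **attrs):
    """Depth-first search: yield every node whose attributes equal all given values."""
    if all(node.get(key) == value for key, value in attrs.items()):
        yield node
    for child in node.get("children", []):
        yield from find_nodes(child, **attrs)


# A tiny, made-up UI tree; positions are normalized to the 0-1 range.
ui_tree = {
    "name": "root",
    "type": "android.widget.FrameLayout",
    "children": [
        {"name": "title", "type": "android.widget.TextView", "text": "Home", "pos": [0.5, 0.05]},
        {
            "name": "list",
            "type": "android.widget.ListView",
            "children": [
                {"name": "item", "type": "android.widget.TextView", "text": "Video A", "pos": [0.5, 0.3]},
                {"name": "item", "type": "android.widget.TextView", "text": "Video B", "pos": [0.5, 0.4]},
            ],
        },
    ],
}

titles = [n["text"] for n in find_nodes(ui_tree, type="android.widget.TextView")]
```

Because the lookup works on attributes rather than pixels, it keeps working when colors, themes, or screen sizes change, which is why tree-based positioning tends to be more stable than image matching.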

4.1 Initialize Poco

from poco.drivers.android.uiautomation import AndroidUiautomationPoco

poco = AndroidUiautomationPoco(use_airtest_input=True, screenshot_each_action=False)

Parameter description:

  • use_airtest_input=True: performs taps through the Airtest device connection, avoiding conflicts between multiple input sources.
  • screenshot_each_action=False: skips the screenshot after each operation, which improves speed.

4.2 Common operation examples

# Find by text
poco(text="首页").click()   # "首页" is the Home tab label

# Find by ID or by hierarchy
poco("android.widget.ListView").child("android.widget.TextView")[0].click()

# Swipe a list
poco("com.ss.android.ugc.aweme:id/recycler_view").swipe([0, -0.3])

# Read an attribute
name = poco(text="视频标题").attr("text")   # "视频标题" means "video title"

4.3 Mixed use: Airtest + Poco

# First tap the comment button via image recognition
touch(Template("comment_icon.png"))

# Then locate the comment input box with Poco and enter text
poco("android.widget.EditText").set_text("加油!")   # "Keep it up!"
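
Mixing the two engines often takes the form "try one engine, fall back to the other". A minimal sketch of that pattern follows; run_with_fallback is a hypothetical helper, not an Airtest or Poco API, and it accepts any zero-argument callables so it can be exercised without a device.

```python
def run_with_fallback(strategies):
    """Try each (label, action) pair in order; return the label of the first success."""
    errors = []
    for label, action in strategies:
        try:
            action()
            return label
        except Exception as exc:  # deliberately broad: any failure triggers the next strategy
            errors.append(f"{label}: {exc!r}")
    raise RuntimeError("all strategies failed: " + "; ".join(errors))


# Intended use with a connected device (not executed here):
# run_with_fallback([
#     ("poco", lambda: poco(desc="评论").click()),
#     ("image", lambda: touch(Template("comment_icon.png"))),
# ])
```

Returning the label makes it easy to log which engine actually handled each step, which helps spot selectors or templates that have quietly stopped working.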

5. Hands-on: writing an automated interaction script for Douyin short videos

Next, we write an automation script for Douyin (the mainland China version of TikTok) that simulates everyday browsing in one flow: watching videos, liking, and commenting.

Note: This tutorial is only for learning automated testing techniques. Do not use it for anything that violates platform rules, such as inflating view counts or faking interactions.

5.1 Scene design

  1. Open the Douyin app and wait for the home page to load.
  2. Swipe through 3 videos in a row, staying on each for at least 2 seconds.
  3. Like the 4th video and open the comment box.
  4. Enter a random positive comment and send it.
  5. Close the comments and continue swiping.

5.2 Directory structure

douyin_auto/
├── main.py               # main script
├── base_operator.py      # base class wrapper
├── images/               # image templates
│   ├── home_tab.png      # Home tab
│   ├── like_btn.png      # like button (heart)
│   ├── comment_btn.png   # comment icon
│   └── close_comment.png # close comment panel
└── requirements.txt

5.3 Encapsulating a common base class: base_operator.py

import random
import time
from airtest.core.api import *
from poco.drivers.android.uiautomation import AndroidUiautomationPoco

class BaseOperator:
    def __init__(self, device_uri="Android:///"):
        self.device = connect_device(device_uri)
        self.poco = AndroidUiautomationPoco(
            use_airtest_input=True,
            screenshot_each_action=False
        )
        self.package = "com.ss.android.ugc.aweme"
    
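    def start_app(self):
        """Launch Douyin."""
        # stop_app/start_app below are the module-level Airtest functions, not this method
        stop_app(self.package)
        time.sleep(1)
        start_app(self.package)
        sleep(5)  # wait for the home page to load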
    
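    def swipe_video(self, direction="up", duration=0.3):
        """Swipe between videos: "up" shows the next one, "down" the previous one."""
        # Hard-coded coordinates; they appear to assume a screen about 1080 px wide
        if direction == "up":
            swipe((540, 1600), (540, 400), duration=duration)
        else:
            swipe((540, 400), (540, 1600), duration=duration)
        sleep(2)  # dwell time on the video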
    
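    def like_current_video(self):
        """Like the current video (Poco first, image recognition as fallback)."""
        try:
            # Prefer Poco; obfuscated resource IDs like this one may differ between app versions
            like_btn = self.poco(name="com.ss.android.ugc.aweme:id/ae8")
            if like_btn.exists():
                like_btn.click()
                return
        except Exception:
            pass
        # Fall back to image recognition
        touch(Template("images/like_btn.png", threshold=0.7))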
    
    def comment_current_video(self, text_content):
        """Comment on the current video."""
        # Tap the comment icon
        try:
            self.poco(desc="评论").click()
        except Exception:
            touch(Template("images/comment_btn.png"))
        
        sleep(1)
        # Enter the comment text
        self.poco("android.widget.EditText").set_text(text_content)
        sleep(0.5)
        # Send button (depending on the app version, either a "发送" text label or a send icon)
        send_btn = self.poco(text="发送")
        if send_btn.exists():
            send_btn.click()
        else:
            self.poco(type="android.widget.ImageView", desc="发送").click()
        sleep(1)
        # Close the comment panel
        self.close_comment_panel()
    
    def close_comment_panel(self):
        """Close the comment panel and return to the video screen."""
        keyevent("BACK")
        sleep(0.5)
        # If the BACK key had no effect, tap the close button via image recognition
        if exists(Template("images/close_comment.png")):
            touch(Template("images/close_comment.png"))
    
    def random_comment(self):
        """Return a random positive comment."""
        comments = ["太厉害了👍", "学到了", "支持创作者", "已点赞", "这内容绝了",
                    "幽默风趣哈哈", "每天必看", "这个系列能多出吗"]
        return random.choice(comments)
    
    def run_interaction_flow(self, swipe_count=3):
        """Run the complete interaction flow."""
        self.start_app()
        # Only swipe through the first swipe_count videos
        for i in range(swipe_count):
            self.swipe_video("up")
        # Like and comment on video number swipe_count + 1
        sleep(1)
        self.like_current_video()
        sleep(0.5)
        self.comment_current_video(self.random_comment())
        # Keep swiping through a few more videos
        for i in range(2):
            self.swipe_video("up")

5.4 Main entry point: main.py

from base_operator import BaseOperator

if __name__ == "__main__":
    operator = BaseOperator()
    try:
        operator.run_interaction_flow(swipe_count=3)
        print("Automation flow finished")
    except Exception as e:
        print(f"Script failed: {e}")

5.5 Image template preparation

In practice, use the "Screenshot" function of AirtestIDE to grab the device's current screen, crop out the following images, and put them in images/:

  • like_btn.png: Like button (heart icon in the unliked state)
  • comment_btn.png: Comment icon (bubble shape)
  • close_comment.png: Close/back button of the comment page

Capture the templates at the same resolution as the device the script will run on to improve the recognition rate.


6. Risk control and avoidance - don’t let the platform treat you as a robot

If the automated script behaves too mechanically, it can easily trigger the platform's risk control mechanism (limiting interactions, requiring verification codes, or even banning accounts). Here are some defensive strategies:

6.1 Randomization operation

import random
from airtest.core.api import *

# Add a random offset to the swipe end point and duration
x1, y1 = 540, 1600
x2, y2 = 540 + random.randint(-50, 50), 400 + random.randint(-30, 30)
swipe((x1, y1), (x2, y2), duration=random.uniform(0.2, 0.5))

# Wait a random interval between comments
sleep(random.uniform(3, 8))

6.2 Simulate real user behavior

  • Do not like without watching: give the script a notion of "viewing time" and have it stay on a video for a random period before acting.
  • Occasional "accidental" swipe back: real users sometimes swipe back to the previous video, so you can execute swipe_video("down") with a small probability.
  • Interaction frequency limit: cap the total number of operations per day, spread them across time periods, and avoid sustained high-frequency operations.
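
The frequency limit in the last bullet can be enforced in code rather than left to discipline. Below is a minimal sketch of a sliding-window cap; InteractionBudget and its parameters are hypothetical names, and the clock is injectable so the logic can be checked without waiting.

```python
import time


class InteractionBudget:
    """Allow at most max_actions within any window of window_seconds."""

    def __init__(self, max_actions, window_seconds, clock=time.monotonic):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self.clock = clock
        self._stamps = []  # times of the actions still inside the window

    def try_acquire(self):
        """Record an action and return True if under the cap, else return False."""
        now = self.clock()
        # Forget actions that have aged out of the window.
        self._stamps = [t for t in self._stamps if now - t < self.window_seconds]
        if len(self._stamps) >= self.max_actions:
            return False
        self._stamps.append(now)
        return True
```

A script would call try_acquire() before each like or comment and simply skip the interaction when it returns False, for example with InteractionBudget(max_actions=20, window_seconds=24 * 3600) as an illustrative daily cap.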

6.3 Account environment disguise

  • Use real and commonly used device IDs and IP environments to avoid large-scale operations of multiple accounts with the same IP.
  • Keep the device's original sensor data, GPS, and similar information where possible (emulators tend to show telltale characteristics), and consider using a cluster of real devices.

7. Image recognition optimization - stability leap from 70% to 99%

Image recognition is the core function of Airtest and the most easily affected by the environment. The following optimization methods can greatly improve the success rate.

7.1 Appropriate threshold and grayscale processing

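# Lower the threshold (default 0.7) to tolerate slight variations
touch(Template("btn.png", threshold=0.6, rgb=True))

# Use grayscale matching and ignore color differences (useful when the button's background is a gradient)
touch(Template("btn.png", threshold=0.8, rgb=False))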

7.2 Multi-resolution adaptation

The size and position of controls change across phone resolutions. Solutions:

  • Prefer Poco for positioning. Poco works on the UI tree and is not affected by resolution.
  • For image templates, record scripts with relative coordinates: first find a fixed anchor point (such as a bottom status bar icon), then tap at an offset from it. Airtest IDE supports relative coordinate recording based on anchor points.
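
The same idea applies to hard-coded coordinates such as the swipe points in the base class: store them relative to the screen size and convert at run time. The sketch below uses hypothetical helper names and assumes the layout scales proportionally with the screen, which may not hold across very different aspect ratios.

```python
def to_relative(point, resolution):
    """Convert absolute pixel coordinates to fractions of the screen size."""
    x, y = point
    width, height = resolution
    return (x / width, y / height)


def to_absolute(rel_point, resolution):
    """Convert fractional coordinates back to whole pixels for a given screen."""
    rel_x, rel_y = rel_point
    width, height = resolution
    return (round(rel_x * width), round(rel_y * height))


def scale_point(point, recorded_resolution, current_resolution):
    """Map a point recorded on one screen size onto another screen size."""
    return to_absolute(to_relative(point, recorded_resolution), current_resolution)
```

For example, scale_point((540, 1600), (1080, 1920), (720, 1280)) maps a swipe start recorded on a 1080x1920 device onto a 720x1280 device. The current resolution has to come from the connected device at run time.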

7.3 Dynamic waiting and retry mechanism

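from airtest.core.api import *

def safe_touch(template, max_retry=3):
    """Tap the template once it appears, checking up to max_retry times."""
    for i in range(max_retry):
        pos = exists(template)  # match position, or False if not found
        if pos:
            touch(pos)  # tap the position already found instead of searching again
            return True
        sleep(1)
    raise Exception(f"Element not found: {template}")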

7.4 Template update strategy

After the interface is redesigned, recognition will fail. You can maintain a template repository and check its version before execution. Advanced usage: write "self-healing" logic that automatically triggers a screenshot callback when image matching fails, so the template can be updated manually or by an image algorithm.

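from airtest.core.api import *
from airtest.core.error import TargetNotFoundError

try:
    touch(Template("comment.png"))
except TargetNotFoundError:
    snapshot(filename="error_screen.png")
    # Report to the monitoring platform and notify someone to intervene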
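
One possible way to "check the version before execution" is to keep a manifest of template file hashes and compare it with the files on disk before a run. This is a sketch under that assumption; the function names and the idea of storing the manifest (for instance as JSON next to the templates) are illustrative, not something Airtest provides.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(image_dir):
    """Map each PNG file name in image_dir to its digest."""
    return {p.name: file_digest(p) for p in sorted(Path(image_dir).glob("*.png"))}


def check_manifest(image_dir, manifest):
    """Return the names of templates that are missing or differ from the manifest."""
    current = build_manifest(image_dir)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

An empty list from check_manifest means every expected template is present and unchanged; any other result is a reason to stop and refresh the templates before running the script.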

8. Advanced skills and summary

  • Report generation: when a script runs with logging enabled (for example from AirtestIDE), Airtest writes a log/ directory containing the step log and a screenshot of each step, from which an HTML report can be generated for debugging.
  • Cross-platform: Airtest also supports automation of iOS (requires WebDriverAgent and Mac) and Windows window programs.
  • Embedded pytest: Airtest operations can be encapsulated into pytest test cases and included in the continuous integration pipeline.
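
As a sketch of the pytest idea, the flow logic can be written against any object that offers the operator's methods, so it can be checked in CI without a phone attached. RecordingOperator and browse_then_like are hypothetical names for illustration; in a device-backed test, an instance of the article's BaseOperator would take the stand-in's place.

```python
class RecordingOperator:
    """Stand-in for a device-backed operator: records calls instead of touching a phone."""

    def __init__(self):
        self.calls = []

    def swipe_video(self, direction="up"):
        self.calls.append(("swipe", direction))

    def like_current_video(self):
        self.calls.append(("like",))


def browse_then_like(operator, swipe_count):
    """Swipe past swipe_count videos, then like the one that follows."""
    for _ in range(swipe_count):
        operator.swipe_video("up")
    operator.like_current_video()


def test_like_happens_after_swipes():
    # pytest collects functions named test_*; plain assert statements are enough.
    operator = RecordingOperator()
    browse_then_like(operator, swipe_count=3)
    assert operator.calls == [("swipe", "up")] * 3 + [("like",)]
```

Separating the flow from the device calls this way keeps the fast logic checks in every CI run, while the slower device-backed runs can be scheduled separately.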

In dual-engine mode you get the best of both worlds: a game's closed rendering surface is handled with image recognition, and a native app's dynamic lists are handled precisely with Poco. Together with the hands-on Douyin script and the risk-control and recognition optimizations above, this approach can be adapted to real scenarios such as live broadcast monitoring, content review automation, and social operations tools.

Now open your IDE, connect your phone, and try writing your first line of automation code with touch(Template("xxx.png")). There is no need to dig into the UI source code or read hundreds of pages of control documentation. WYSIWYG automation starts with Airtest.