UI Automation Control Technology - Minimalist Practical Guide

Hello everyone, my name is Cline.

Recently I have received many private messages from readers, all asking the same question: **How can we quickly build an app crawler or simple automated tests on Android?** Nobody wants to read hundreds of pages of framework documentation or wrestle with endless custom controller classes. This article covers only three mainstream tools: a scenario comparison you can absorb in 5 minutes, and a Douyin walkthrough you can copy and have running within 30 minutes. There are no mathematical formulas anywhere; this is purely a guide to engineering practice.

Core Objectives

  1. Pick the most suitable tool within 3 seconds along three dimensions: lightweight, cross-platform, or game
  2. Master the two core locating methods: UI tree locating and image recognition
  3. Copy and run a synchronous Douyin like + comment script with built-in risk-avoidance strategies
  4. Remember the three basic anti-crawling countermeasures for mobile automation

1. Tool comparison and quick start

Don't rush to install anything yet. The scenario cheat sheet below shows at a glance which tool best fits your current needs.

| Tool | Best-fit scenario | Installation difficulty | Runtime performance | Native image recognition | Core advantages |
|---|---|---|---|---|---|
| Appium | Full-platform testing (Android/iOS/PC applications) | Medium to high | Average | Requires OpenCV integration | WebDriver industry standard, multi-language support |
| Airtest | Games and complex interfaces whose UI tree cannot be parsed | Low | Medium | ✅ Strong native support (algorithm developed in-house by NetEase) | Visual IDE with recording, Poco + image dual locating |
| uiautomator2 | Android-only lightweight Python projects | Very low | High | Requires installing OpenCV separately | Pure Python, built on Google's official uiautomator; first choice for crawlers and small scripts |

For the vast majority of quick Android-only tasks, uiautomator2 is the cheapest option and the fastest to pick up. This article uses it for the main demonstration.
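The cheat sheet can be condensed into two yes/no questions. The sketch below is only an illustration: the function name `pick_tool` and its two parameters are hypothetical, and checking the UI-tree question before the cross-platform question is an assumption rather than a rule from the table.

```python
def pick_tool(needs_ios: bool, ui_tree_readable: bool) -> str:
    """Map the cheat sheet onto two yes/no questions (hypothetical helper)."""
    if not ui_tree_readable:
        # Games and custom-drawn views expose no usable UI tree,
        # so image recognition is needed.
        return "Airtest"
    if needs_ios:
        # Cross-platform formal testing.
        return "Appium"
    # Android-only quick scripts and crawlers.
    return "uiautomator2"


print(pick_tool(needs_ios=False, ui_tree_readable=True))  # uiautomator2
```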

1.1 Get started with uiautomator2 first (used in the hands-on section)

uiautomator2 is a pure Python library that drives Android's official uiautomator framework under the hood. It is fast, has a simple interface, and suits both crawlers and automated testing.

Minimalist installation & initialization

In just two steps, you can let your computer control your Android phone:

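# 1. Install the Python library on the computer
pip install uiautomator2

# 2. Enable "USB debugging" on the phone and connect it to the computer; the first connection installs the on-device agent
#    (printing d.info is a simple connectivity check)
python -c "import uiautomator2 as u2; d = u2.connect(); print(d.info)"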

Core operations in 5 steps (using Taobao as an example)

The following code shows the most common workflow: connect the device → launch the app → locate the element → tap and type → swipe and search.

import uiautomator2 as u2
import time

# 1. Connect to the default USB device (for WiFi, use u2.connect("192.168.1.100"))
d = u2.connect()
# 2. Launch the app (find the package name with: adb shell dumpsys window | grep mCurrentFocus)
d.app_start("com.taobao.taobao")
time.sleep(3)  # Hard wait for demonstration only; prefer a conditional wait such as exists(timeout=...) in real scripts

# 3. Locate the search box by resourceId (the most stable and recommended locator)
search_box = d(resourceId="com.taobao.taobao:id/searchEdit")
# 4. Tap it and type the keyword
search_box.click()
search_box.set_text("学生党备用机")
# 5. Swipe up, then tap the "搜索" (Search) button
d.swipe_ext("up", scale=0.8)
d(text="搜索").click()
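The hard `time.sleep(3)` in the demo can be replaced by polling for a condition. The helper below is a minimal, library-independent sketch: `wait_until` and `page_ready` are hypothetical names, and the stand-in condition simply becomes true on the third poll so the example runs without a phone.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.2):
    """Poll predicate() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in condition that becomes true on the third poll
calls = {"n": 0}


def page_ready():
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until(page_ready, timeout=2.0, interval=0.05))  # True
```

For a single element, the selector's own `wait(timeout=...)` or `exists(timeout=...)` is usually enough; a generic helper like this is mainly useful for compound conditions, such as "either of two pages has loaded".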

1.2 Airtest quick start (for games and special scenarios)

When interface elements cannot be parsed through the UI tree (games, embedded H5 pages, special custom controls), Airtest's native image recognition is the best choice. It also ships the Poco module for parsing the UI tree, so you get both approaches.

Minimalist installation

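# Pure Python version (easy to integrate into an existing project)
pip install airtest pocoui

# Recommended: also download Airtest IDE, which bundles recording, Poco Inspector, and screenshot-cropping tools and is very beginner-friendly

Core demo in 5 steps: image + UI dual locating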

from airtest.core.api import *
from poco.drivers.android.uiautomation import AndroidUiautomationPoco

# 1. Connect to the default USB device
connect_device("Android:///")
# 2. Initialize the Poco parser
poco = AndroidUiautomationPoco(use_airtest_input=True)

# 3. Locate by image (save a screenshot as like.png first; crop only the icon, without dynamic text)
# touch(Template(r"like.png", record_pos=(0.7, 0.5), resolution=(1080, 2400)))

# 4. Locate via the Poco UI tree (preferred when the element has stable attributes)
poco(text="点赞").click()

# 5. Perform a swipe
swipe((500, 1800), (500, 600), duration=0.8)

2. Avoiding common pitfalls + stable locating strategies

2.1 UI element locating: choose by stability, from high to low

Mobile interfaces change frequently, so a locating strategy should rely on the attributes least likely to change. The priority order is:

  1. resourceId (most stable)

    # uiautomator2 example
    d(resourceId="com.ss.android.ugc.aweme:id/aweme_like_layout")  # container of the Douyin like button
    # Appium and Poco expose a similar attribute, though the field name may differ
  2. contentDescription/description (accessibility label)

    d(desc="点赞")
  3. textMatches (regular-expression text matching)

    # Useful when part of the text is fixed, such as a stable prefix or suffix
    d(textMatches=".*备用机推荐.*")
  4. Image recognition (last resort)

  • Capture only a small image of the core icon, with no dynamic text.
  • Always set the threshold parameter (0.7 to 0.9 recommended; higher values match more strictly but may fail to find the target)
    # Airtest example with a threshold
    touch(Template(r"like.png", threshold=0.8))

In short: resourceId first, description and text second, image recognition last.
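The priority order can be made explicit with a small fallback helper that tries candidate locators in turn. This is a sketch under assumptions: `first_existing` is a hypothetical name, and `FakeElement` only stands in for a selector object that offers `exists(timeout=...)`, so the example runs without a device.

```python
def first_existing(candidates, timeout=1.0):
    """Return the first candidate whose exists(timeout=...) is truthy, else None."""
    for element in candidates:
        if element.exists(timeout=timeout):
            return element
    return None


class FakeElement:
    """Stand-in for a selector object; only mimics exists(timeout=...)."""

    def __init__(self, name, present):
        self.name = name
        self.present = present

    def exists(self, timeout=0):
        return self.present


# Candidates listed from most stable to least stable
candidates = [
    FakeElement("resourceId", present=False),
    FakeElement("description", present=True),
    FakeElement("textMatches", present=True),
]
found = first_existing(candidates)
print(found.name)  # description
```

On a real device the list might look like `[d(resourceId="..."), d(desc="点赞")]`. Listing the candidates explicitly keeps the fallback order and the per-locator timeout visible in one place.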

2.2 Basic anti-crawling technique: simulate human operation

A platform's risk control system checks whether operations look too mechanical, so the script needs a bit of human-like variation.

  1. Replace hard waits with conditional waits, plus a small random delay

    # Conditional wait: wait up to 5 seconds and continue as soon as the element appears
    search_box.wait(timeout=5)
    
    # Random delay: leave a "thinking gap" after each action
    import random
    time.sleep(1.2 + random.uniform(-0.3, 0.3))
  2. Randomize swipe distance and duration

    # uiautomator2 example: randomize swipe distance and speed
    d.swipe_ext(
        "up",
        scale=0.7 + random.random() * 0.1,        # swipe 70%~80% of the screen height
        duration=0.8 + random.random() * 0.2     # takes 0.8~1.0 seconds
    )
  3. Add a random offset to the click position

    def safe_click(element):
        """Click a random point inside the element so the same coordinate is not hit every time."""
        if not element.exists(timeout=2):
            return False
        bounds = element.info['bounds']  # dict with left/top/right/bottom keys
        x = random.randint(bounds['left'] + 8, bounds['right'] - 8)
        y = random.randint(bounds['top'] + 8, bounds['bottom'] - 8)
        d.click(x, y)
        return True

These three measures are the basic line of defense against risk control on mobile. They cost very little to add and make the script look noticeably less mechanical.
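One edge case in the random-offset click is worth guarding: with a fixed margin, `random.randint` raises `ValueError` when the element is narrower than twice the margin. The sketch below separates the coordinate choice into a pure function so it can be checked without a device; `random_point` and its `rng` parameter are hypothetical names, and falling back to the element centre is an assumption about what is sensible for tiny elements.

```python
import random


def random_point(bounds, margin=8, rng=random):
    """Pick a random (x, y) inside bounds, staying `margin` px from each edge.

    bounds is a dict with 'left', 'top', 'right', 'bottom' keys.
    If an axis is too short to honour the margin, use its midpoint instead.
    """
    def pick(lo, hi):
        if hi - lo <= 2 * margin:
            return (lo + hi) // 2
        return rng.randint(lo + margin, hi - margin)

    x = pick(bounds["left"], bounds["right"])
    y = pick(bounds["top"], bounds["bottom"])
    return x, y


rng = random.Random(42)  # seeded only to make the demo repeatable
big = {"left": 100, "top": 200, "right": 300, "bottom": 260}
tiny = {"left": 10, "top": 10, "right": 20, "bottom": 20}
print(random_point(big, rng=rng))
print(random_point(tiny, rng=rng))  # always the centre: (15, 15)
```

A click helper could then call `d.click(*random_point(element.info['bounds']))` after confirming that the element exists.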


3. Hands-on: a synchronous Douyin auto like + comment script with risk avoidance

⚠️ Important reminder: this script is for personal learning and exchange only. Do not use it for large-scale engagement farming or batch crawling; doing so may trigger Douyin's risk control and lead to throttling, bans, or even account closure.

3.1 Preparation

  1. Turn on "USB debugging" on your Android phone (the first connection must be authorized over USB; you can switch to WiFi debugging afterwards)
  2. Open Douyin on the phone and stay on the recommended feed or on a video's page.
  3. Use an element inspection tool to get the resourceId and other attributes of interface elements (beginners are strongly recommended to use the Poco Inspector bundled with Airtest IDE; if you have the Android SDK, uiautomatorviewer also works)

3.2 Complete executable code

The code includes a random like probability, a random comment probability, and fallback strategies across multiple locators. You can copy and run it as is, though the resourceId values may need adjusting for your Douyin version.

import uiautomator2 as u2
import time
import random

# ----------------------- Configuration (adjust as needed) -----------------------
DOUYIN_PKG = "com.ss.android.ugc.aweme"
LIKE_PROB = 0.4                # 40% chance to like a video
COMMENT_PROB = 0.1             # 10% chance to post a comment
MAX_VIDEOS = 15                # stop after 15 videos
COMMENT_LIST = [
    "很棒的内容~",
    "学到了,收藏收藏!",
    "支持一下博主!",
    "这个好有意思哈哈哈哈",
    "666666"
]

# ----------------------- Connect to the device -----------------------
print("🔗 Connecting to the Android device...")
try:
    d = u2.connect()           # for WiFi debugging, use u2.connect("IP address")
    print(f"✅ Connected! Device serial: {d.serial}")
except Exception as e:
    print(f"❌ Connection failed; check USB debugging or the network connection: {e}")
    raise SystemExit(1)

# ----------------------- Safe operation helpers -----------------------
def safe_click(element, timeout=2):
    """Click with a timeout and a random offset."""
    if not element.exists(timeout=timeout):
        return False
    bounds = element.info['bounds']  # dict with left/top/right/bottom keys
    # Stay away from the element edges to avoid tapping a dead zone
    x = random.randint(bounds['left'] + 6, bounds['right'] - 6)
    y = random.randint(bounds['top'] + 6, bounds['bottom'] - 6)
    d.click(x, y)
    return True

def random_delay(base, variance):
    """Sleep for base ± variance seconds, but never less than 0.5 seconds."""
    actual_delay = max(0.5, base + random.uniform(-variance, variance))
    time.sleep(actual_delay)

# ----------------------- Main logic -----------------------
total_videos = 0
total_likes = 0
total_comments = 0

print(f"\n🚀 Script started! Target: process at most {MAX_VIDEOS} videos")
try:
    while total_videos < MAX_VIDEOS:
        total_videos += 1
        print(f"\n🎬 Processing video {total_videos}/{MAX_VIDEOS}...")
        # Simulate watching time
        random_delay(2.5, 1)

        # 1. Random like
        if random.random() < LIKE_PROB:
            # Several fallback locators to raise the success rate
            like_element = (
                d(resourceId="com.ss.android.ugc.aweme:id/aweme_like_layout")
                or d(desc="赞")
                or d(resourceId="com.ss.android.ugc.aweme:id/dv9")  # backup ID (may differ between versions)
            )
            if safe_click(like_element):
                total_likes += 1
                print(f"👍 Liked! Total likes: {total_likes}")
                random_delay(0.8, 0.3)
        # 2. Random comment
        if random.random() < COMMENT_PROB:
            comment_btn = (
                d(resourceId="com.ss.android.ugc.aweme:id/aweme_comment_layout")
                or d(desc="评论")
                or d(resourceId="com.ss.android.ugc.aweme:id/dvb")
            )
            if safe_click(comment_btn):
                random_delay(1.5, 0.5)
                # Locate the input box
                input_box = (
                    d(resourceId="com.ss.android.ugc.aweme:id/chat_input_view")
                    or d(className="android.widget.EditText")
                )
                if input_box.exists(timeout=2):
                    random_comment = random.choice(COMMENT_LIST)
                    input_box.set_text(random_comment)
                    random_delay(0.9, 0.2)
                    # Tap the send button
                    send_btn = d(text="发送") or d(desc="发送")
                    if safe_click(send_btn):
                        total_comments += 1
                        print(f"💬 Comment posted! Content: {random_comment} | Total comments: {total_comments}")
                        random_delay(1, 0.3)
                # Return to the video playback page
                d.press("back")
                random_delay(0.8, 0.3)

        # 3. Swipe up to the next video
        print("⬆️  Swiping to the next video...")
        d.swipe_ext(
            "up",
            scale=0.7 + random.random() * 0.1,
            duration=0.8 + random.random() * 0.2
        )
        # Wait for the next video to load
        random_delay(1.2, 0.5)

except KeyboardInterrupt:
    print("\n⏸️ Script stopped manually by the user")
except Exception as e:
    print(f"\n❌ Script error: {e}")
finally:
    print("\n📊 Run statistics:")
    print(f"  Videos processed: {total_videos}")
    print(f"  Successful likes: {total_likes}")
    print(f"  Successful comments: {total_comments}")
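Before running against a real account, it can help to preview what the probabilities produce. The following dry-run sketch makes the same two random draws per video as the main loop but never touches a device; `plan_session` and its `seed` parameter are hypothetical additions for illustration.

```python
import random


def plan_session(max_videos, like_prob, comment_prob, seed=None):
    """Decide, per video, whether to like and/or comment, without touching a device."""
    rng = random.Random(seed)
    plan = []
    for _ in range(max_videos):
        plan.append({
            "like": rng.random() < like_prob,        # same test as LIKE_PROB in the script
            "comment": rng.random() < comment_prob,  # same test as COMMENT_PROB
        })
    return plan


plan = plan_session(max_videos=15, like_prob=0.4, comment_prob=0.1, seed=1)
likes = sum(step["like"] for step in plan)
comments = sum(step["comment"] for step in plan)
print(f"videos={len(plan)} likes={likes} comments={comments}")
```

Trying a few seeds gives a feel for how bursty a session can be, which is a cheap way to tune `LIKE_PROB` and `COMMENT_PROB` conservatively.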

Summary

  1. **How to choose a tool?** Lightweight Android crawler or quick script → uiautomator2; games or complex UIs that cannot be parsed → Airtest; formal cross-platform testing (Android + iOS) → Appium

  2. Element locating priority: resourceId > contentDescription > textMatches > image recognition

  3. The anti-crawling trio: conditional waits + random delays, random swipe distance/duration, randomly offset click coordinates

  4. The final red line: all techniques in this article are for learning and exchange only. Do not engage in large-scale engagement farming or illegal crawling.

With this approach and the template code in hand, most introductory mobile UI automation tasks can be implemented efficiently without getting bogged down in framework documentation. I hope this minimalist guide really helps you.