Serialization and deserialization of objects

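In real-world development, we almost never store data as a handful of scattered variables. Whether you need to save the hyperparameter configuration of an AI model, return JSON data to a front end, or let Python talk to heterogeneous systems written in Java or Go, you face the same question: **how do you exchange complex, structured data losslessly in a common format?**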

The answer is JSON, currently the most popular "data bridge". This article takes the most direct route to JSON serialization and deserialization in Python, and then introduces two practical tools that save a lot of effort.


1. What is JSON: From JS syntax sugar to “data passport”

JSON stands for JavaScript Object Notation. It started out as nothing more than a simple syntax for writing objects in JavaScript. Because it is plain text, compact, and supported by virtually every language, it quickly displaced verbose XML and became the de facto standard for API integration and configuration files.

JSON ↔ Python type mapping table

The structure of JSON closely resembles Python dictionaries and lists, and it can be nested to any depth. For cross-language compatibility, however, JSON has fewer data types than Python. The mapping is shown in the table below:

| JSON type | Python type | Notes |
| --- | --- | --- |
| object | dict | Keys must be double-quoted strings |
| array | list | Ordered, mutable sequence |
| string | str | Unicode throughout; Chinese text is no problem |
| number | int / float | Integers and floating-point numbers are distinguished automatically |
| boolean | bool | true ↔ True, false ↔ False (strictly case-sensitive!) |
| null | None | Python's empty-value placeholder |

Simply put, JSON is a "lite, universal version" of Python's container types. Once you know this table, converting between in-memory objects and JSON text is straightforward.
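The mapping can be checked directly with the built-in json module. The snippet below is a small sketch; the sample document and its key names are made up for illustration.

```python
import json

# A made-up JSON document that touches every JSON type in the table.
sample = (
    '{"obj": {"k": "v"}, "arr": [1, 2], "text": "你好", '
    '"i": 7, "f": 7.5, "flag": true, "nothing": null}'
)

parsed = json.loads(sample)

# Print the Python type that each JSON value was converted to.
for key, value in parsed.items():
    print(f"{key:8} -> {type(value).__name__}")
```

Each line of output pairs a key with the Python type from the table, for example dict for the object and NoneType for null.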


2. Core tools: Python's built-in json module

Python ships with the json module, which converts in both directions between in-memory objects and JSON (as a string or a file). It has only four core functions, and their suffixes tell you which is which:

| Function | Suffix meaning | Input → Output | Typical uses |
| --- | --- | --- | --- |
| dumps | s = string | Python object → JSON string | Log printing, building HTTP request parameters |
| dump | No suffix: writes to a file | Python object → file | Persisting configuration, temporary data storage |
| loads | s = string | JSON string → Python object | Parsing strings returned by an API, JSON fragments in logs |
| load | No suffix: reads from a file | File → Python object | Reading configuration files, local JSON data |

Memory tip: the functions with the s suffix work on strings; the ones without it work directly on file objects.
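The four functions can be compared side by side without touching the disk. This is a minimal sketch that uses io.StringIO as an in-memory stand-in for a text file; the settings dictionary is invented for the example.

```python
import io
import json

settings = {"batch_size": 16, "tasks": ["classify", "extract"]}

# String variants: the JSON text is returned or passed in directly.
text = json.dumps(settings)
from_text = json.loads(text)

# File variants: the JSON text is written to or read from a file-like object.
buffer = io.StringIO()   # behaves like a file opened in text mode
json.dump(settings, buffer)
buffer.seek(0)           # rewind to the start before reading back
from_file = json.load(buffer)

print(text)
print(from_text == from_file == settings)
```

Both routes produce the same JSON text and the same restored dictionary; the only difference is where the text lives.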


3. Practical demonstration: serialization and deserialization

The examples below use a dictionary with nested structures, Chinese text, Boolean values, lists, and an empty field to walk through the complete save-and-load process. The data simulates a model configuration on an AI platform (Daoman).

3.1 Serialization: turning objects into JSON

import json

# Simulated user model configuration for the Daoman AI platform
model_config = {
    "model_name": "Daoman-Lite-7B",
    "version": 1.2,
    "is_active": True,
    "author": "道满团队",
    "supported_tasks": ["文本分类", "关键词提取", "情感分析"],
    "hyperparameters": {
        "max_seq_length": 512,
        "batch_size": 16,
        "learning_rate": 0.00005,
        "dropout": None  # maps to JSON null
    }
}

# 1. Serialize to a formatted string (handy for debugging, printing, network transfer)
# indent=4: indent with 4 spaces for readability
# ensure_ascii=False: keep Chinese characters instead of \uXXXX escapes
json_config_str = json.dumps(model_config, indent=4, ensure_ascii=False)
print("Serialized string output:")
print(json_config_str)
print("-" * 60)

# 2. Serialize straight to a local file (persistent storage)
# Always pass encoding='utf-8' when opening the file; otherwise Chinese text
# can be garbled on systems whose default encoding is not UTF-8
with open("daoman_lite_config.json", "w", encoding="utf-8") as f:
    json.dump(model_config, f, indent=4, ensure_ascii=False)
print("Configuration saved to daoman_lite_config.json!")

Run this code and you will see a cleanly formatted JSON string, and the file daoman_lite_config.json will have been written to the current directory.
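To see what ensure_ascii actually changes, the following sketch serializes one small record both ways. The record is a cut-down, illustrative version of the configuration above.

```python
import json

record = {"author": "道满团队", "is_active": True, "dropout": None}

escaped = json.dumps(record)                        # default: ensure_ascii=True
readable = json.dumps(record, ensure_ascii=False)   # keeps the original characters

print(escaped)    # non-ASCII characters appear as \uXXXX escapes
print(readable)   # Chinese characters appear as-is

# Both texts describe the same data; only the on-the-wire spelling differs.
print(json.loads(escaped) == json.loads(readable) == record)
```

The escaped form is pure ASCII and safe for any channel, while the readable form is friendlier for people inspecting logs or configuration files.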

3.2 Deserialization: Restore JSON to objects

Next, we restore data from a JSON string and then read back the configuration file saved in the previous step.

import json

# 1. Deserialize from a JSON string
json_test_str = '{"name": "张三", "score": 99.5, "is_passed": true, "hobbies": null}'
parsed_dict = json.loads(json_test_str)
print("Dictionary restored from string:", type(parsed_dict), parsed_dict)
print("Type of the score field:", type(parsed_dict["score"]))   # converted to float automatically
print("-" * 60)

# 2. Deserialize from the local JSON file
with open("daoman_lite_config.json", "r", encoding="utf-8") as f:
    loaded_config = json.load(f)
print("Author restored from file:", loaded_config["author"])
print("First supported task:", loaded_config["supported_tasks"][0])

As you can see, whether the source is a file or a string, json.load and json.loads restore JSON data to Python's native dictionaries and lists, and numbers are mapped to int or float automatically.
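One caveat follows from the type table: JSON has no tuple type and its object keys are always strings, so a round trip does not always give back exactly the Python object you started with. A minimal sketch with made-up data:

```python
import json

original = {1: "int key", "point": (3, 4)}

# Serialize and immediately parse the result again.
restored = json.loads(json.dumps(original))

print(restored)              # {'1': 'int key', 'point': [3, 4]}
print(restored == original)  # False: the int key became a str, the tuple became a list
```

If the exact Python types matter, convert them back explicitly after loading.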


4. Bonus 1: Performance optimization for large volumes of data

When you need to process more than a million JSON records (for example, product information collected by a crawler or batch AI inference results), Python's built-in json module can become a bottleneck. In that case you can try ujson, a third-party library nicknamed the "lightning parser". It is commonly quoted as 3 to 10 times faster than the built-in module, although the actual gain depends on your data and library versions.

4.1 Configure a domestic pip mirror (faster downloads)

pip is Python's official package management tool. By default it downloads from PyPI servers located overseas, which is often slow from within China, so it is recommended to configure the Tsinghua mirror first:

# Permanently configure the Tsinghua mirror (recommended)
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

4.2 Install and use ujson

After installing ujson, you only need to change one import line. For common usage, the rest of the code keeps working, because ujson provides the same four core functions as the built-in json module. Some advanced keyword arguments of the built-in module may not be supported.

pip install ujson

# Only the import changes; the rest of the original code stays the same
import ujson as json   # alias ujson as json

# model_config is the dictionary defined in section 3.1
json_str = json.dumps(model_config, indent=4, ensure_ascii=False)
loaded = json.loads(json_str)
print("Parsed with ujson, with a big performance boost!")
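Speed-ups vary with the data, so it is worth measuring on your own payload before switching. The sketch below is one way to do that under stated assumptions: the records are synthetic, the helper name seconds_per_parse is made up for this example, and ujson is treated as optional so the script still runs when it is not installed.

```python
import json as std_json
import timeit

try:
    import ujson as fast_json   # third-party; only present after `pip install ujson`
except ImportError:
    fast_json = None            # fall back to measuring the standard library only

# Synthetic payload: 10,000 small records, loosely modelled on crawler output.
records = [
    {"id": i, "title": f"item-{i}", "price": i * 0.5, "tags": ["a", "b"]}
    for i in range(10_000)
]
text = std_json.dumps(records)

def seconds_per_parse(module, repeat=5):
    """Average wall-clock seconds for one loads() call over the payload."""
    return timeit.timeit(lambda: module.loads(text), number=repeat) / repeat

print(f"json : {seconds_per_parse(std_json):.4f} s per parse")
if fast_json is not None:
    print(f"ujson: {seconds_per_parse(fast_json):.4f} s per parse")
else:
    print("ujson is not installed; skipped")
```

The numbers depend on the machine and library versions, so treat them as a rough comparison rather than a benchmark.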

Common pip commands cheat sheet

| Command | Function |
| --- | --- |
| pip list | List all libraries installed in the current environment |
| pip install <package> | Install the specified library (latest version by default) |
| pip install <package>==x.y.z | Install a specific version |
| pip install -U <package> | Upgrade the library to the latest version |
| pip uninstall <package> | Uninstall the specified library |

5. Bonus 2: Calling a public JSON API in practice

Most APIs today return JSON data over HTTP/HTTPS. Python's built-in urllib is somewhat cumbersome to use, so the requests library is recommended instead. It is often described as "the most elegant HTTP library" and has a built-in JSON parsing method.

5.1 Install requests

pip install requests

5.2 Calling News API (Example)

The example below uses the free domestic news API from Tianju Shuxing (TianAPI). You need to apply for a free key yourself, and the free tier is limited to 100 calls per day. Replace YOUR_API_KEY in the code with your own key.

import requests

# Replace with your own API key
API_KEY = "YOUR_API_KEY"
NEWS_API_URL = f"http://api.tianapi.com/guonei/?key={API_KEY}&num=5"  # num=5 fetches 5 items

try:
    # Send a GET request; the timeout keeps the script from hanging indefinitely
    resp = requests.get(NEWS_API_URL, timeout=10)
    # Raise an exception if the HTTP status code signals an error (4xx/5xx)
    resp.raise_for_status()

    # Key step: the built-in json() method parses the response body into a Python dict
    news_data = resp.json()

    # Check the business-level status code returned by the API
    if news_data["code"] == 200:
        print("📰 Today's top domestic news (first 5):")
        print("=" * 80)
        for idx, news in enumerate(news_data["newslist"], 1):
            print(f"{idx}. Title: {news['title']}")
            print(f"   Link: {news['url']}")
            print(f"   Published: {news['ctime']}")
            print("-" * 80)
    else:
        print(f"❌ API returned an error: {news_data['msg']}")
except requests.exceptions.RequestException as e:
    print(f"❌ Request failed: {e}")

Behind the scenes, resp.json() essentially runs json.loads() on the response body, which saves us the manual parsing step and is very convenient.
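To make that step concrete without a network call or an API key, the sketch below does the decoding and parsing by hand on a locally built response body. It is only an approximation of what the library does; the function name extract_titles and the sample payload are hypothetical, with field names taken from the example above.

```python
import json

def extract_titles(raw_body: bytes) -> list:
    """Decode an HTTP response body, parse it as JSON, and return the news titles."""
    payload = json.loads(raw_body.decode("utf-8"))
    if payload.get("code") != 200:
        raise ValueError(f"API error: {payload.get('msg', 'unknown')}")
    return [item["title"] for item in payload.get("newslist", [])]

# Hypothetical response body, shaped like the fields the example above reads.
sample_body = json.dumps({
    "code": 200,
    "msg": "success",
    "newslist": [
        {"title": "Sample headline", "url": "https://example.com/1", "ctime": "2024-01-01 08:00"},
    ],
}).encode("utf-8")

print(extract_titles(sample_body))
```

Keeping the parsing in a small function like this also makes it easy to test the response handling offline.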


6. Summary

Today we focused on just three things, but they are enough to cover most JSON scenarios in daily development:

  1. What JSON is: a cross-language, plain-text, compact "data bridge"
  2. The four core functions of Python's built-in json module: dumps / dump (serialization), loads / load (deserialization)
  3. Two efficiency-boosting tools:
     • ujson: speeds up processing of large JSON data
     • requests: calls JSON APIs gracefully

Finally, I would like to remind you of the two most common mistakes:

  • When writing a JSON file, always specify encoding='utf-8'; otherwise Chinese characters may be garbled.
  • JSON keys must be double-quoted strings. Python's single-quoted keys are converted automatically during serialization, but when you assemble a JSON string by hand, never use single quotes.
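The second pitfall is easy to reproduce. In this short sketch the sample strings are made up; it shows the parser rejecting single-quoted text and json.dumps producing the correct quoting for you.

```python
import json

valid = '{"name": "Daoman"}'     # double quotes: valid JSON
invalid = "{'name': 'Daoman'}"   # single quotes: a Python literal, not JSON

print(json.loads(valid))

try:
    json.loads(invalid)
except json.JSONDecodeError as err:
    print(f"Rejected: {err.msg} (line {err.lineno}, column {err.colno})")

# Letting json.dumps build the text avoids hand-written quoting mistakes.
print(json.dumps({'name': 'Daoman'}))
```

Building JSON with json.dumps instead of string concatenation sidesteps the quoting problem entirely.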

Master these, and you can confidently use JSON to store, transmit, and exchange data in a variety of projects.