Python module and package tutorial: from "heap of code" to "structural expert"

When we first start learning Python, we always like to cram all functions, variables, and logic into itmain.py——写个爬虫、做个计算器感觉还挺爽。 But once the code grows to hundreds or thousands of lines, or when you want to reuse a certain tool function that you have written before, trouble comes: variable name conflict (both functions haveget_total), I couldn’t find the function after searching for a long time, and I didn’t dare to delete or change the code, for fear of disturbing my whole body.

At this time, Module and Package are your savior. They can help you break the code into small units with clear logic, which can be independently reused and avoid naming conflicts.


1. Module: Your first "code storage box"

What is a module?

Simply put, the ** module is a.pyfile**, the file name is its module name (such asmath_utils.pyThe module name ismath_utils)。

It can hold:

  • function definition
  • variable definition
  • class definition
  • Some execution statements (but the main logic is recommended not to write too many unless the module is used to run directly)

Core advantages of modules

Splitting large files into modules can bring these four real benefits:

  1. Easy to maintain: Separate functions such as "mathematical calculations", "file operations" and "data formatting". Next time you make changes, only the corresponding small files will be moved and other logic will not be affected.
  2. Reusable: The written tool modules can be copied and used directly between different projects, and can also be uploaded to PyPI for others to use.
  3. Avoid naming conflicts: Different modules can have functions/variables with the same name, just add the module prefix when calling (for examplemath_utils.get_totalandsales_utils.get_total)。
  4. Accelerated loading: After Python imports a module for the first time, it will save the compiled bytecode into.pycThe file is placed in__pycache__directory, and then directly read the cache when importing without re-parsing the source code.

2. Package: Modular "multi-layer storage rack"

When there are more and more modules (for example, 10 tool modules, 5 data model modules), just putting a bunch of files in a flat directory will be messy - then you need to package and use the directory structure to classify and manage the modules.

What is a package?

A package is essentially a directory, but it is recommended to put a__init__.pyFile (can be empty), used to tell Python: "This is a Python package, not an ordinary folder."

Note: Python 3.3+ supports namespace packages, allowing no__init__.pypackage, but it is still recommended to keep this file for compatibility and IDE friendliness.

A standard package structure example

The following is the structure of a small e-commerce backend project, which is clear at a glance:

ecommerce_backend/
├── __init__.py          # 标识整个项目为包
├── main.py              # 入口文件
├── models/              # 数据模型子包
│   ├── __init__.py
│   ├── user.py          # 用户模型
│   └── order.py         # 订单模型
└── utils/               # 工具子包
    ├── __init__.py
    ├── file_io.py       # 文件读写工具
    └── validator.py     # 数据校验工具

__init__.pyThe three major functions of

Don’t underestimate this file, it can do a lot of things:

  1. Identity: Compatible with older versions of Python and some development tools.
  2. Simplified import: Common modules/functions can be exposed at the top level of the package to avoid writing long and smelly sub-paths.
  3. Control visibility: Pass__all__list definitionfrom package import *What content will be imported to prevent internal helper functions from polluting the namespace.

Specific usage will be given in the later practical sessions.


3. 4 common ways to import modules/packages

The absolute path is the clearest and less error-prone no matter where the code is run from. It is the officially recommended method.

# 1. 导入整个模块(最安全,不污染命名空间)
import math_utils
import ecommerce_backend.utils.file_io

# 调用时必须加模块前缀
math_utils.add(1, 2)
ecommerce_backend.utils.file_io.read_csv("data.csv")

# 2. 导入特定内容(只导入需要的,减少内存占用)
from math_utils import add, multiply
from ecommerce_backend.utils.validator import check_phone

print(add(3, 4))
print(check_phone("13800138000"))

# 3. 导入并起别名(解决同名冲突或简化长路径)
import math_utils as mu
from ecommerce_backend.utils.validator import check_phone as is_valid_mobile

print(mu.multiply(5, 6))
print(is_valid_mobile("13912345678"))

# 4. 导入包中的模块(或暴露在 __init__.py 里的内容)
from ecommerce_backend import models
from ecommerce_backend.utils import file_io, validator

3.2 Relative import (only used inside the package)

For relative import.(current directory) and..(Superior directory) Simplifies references within the same project, but it can only be used in files inside the package and cannot be used in entry files outside the package, otherwise an error will be reported: ValueError: attempted relative import with no known parent package

Example: inecommerce_backend/models/user.pyImport sibling modules or superior packages.

# user.py(位于包内部)
# 导入同级目录的 order 模块
from . import order
# 导入上级目录的 utils 子包
from ..utils import validator
# 导入上级 utils 包的 check_email 函数
from ..utils.validator import check_email

4. How does Python find your module/package?

Sometimes the import will reportModuleNotFoundError, it’s not that the module is really missing, but that Python didn’t find it. If you understand the order of search paths first, you will be able to solve the problem much faster:

  1. Built-in module: e.g.ossysmath, the highest priority.
  2. The directory where the currently running script is located (when running interactively, it is the directory of the current terminal window).
  3. **PYTHONPATHDirectory specified by environment variable ** (common project paths can be added manually).
  4. Default path dependent on installation: such assite-packages(usepip installThe installed third-party libraries are all here).

View the current search path

import sys
print(sys.path)  # 打印出 Python 会按顺序查找的所有路径

5. 5 best practices for newbies to avoid pitfalls

5.1 Module/package naming should be standardized

  • Use all lowercase + underscores for module names, such asuser_service.pydata_parser.py, do not start with camel case or capital letters.
  • It is strictly prohibited to have the same name as the standard library or commonly used third-party libraries! For exampleos.pyrequests.pyIt will overwrite the official modules and cause various weird errors.
  • Simple check method: Open Python in the terminal and enterimport 你打算用的名字, if no error is reported, it means that the name has been occupied, change it quickly.

5.2 Avoidfrom module import *

This kind of import will dump all the content in the module (even internal auxiliary functions) into the current namespace, which is prone to conflicts and difficult to trace the source. Unless in the package__init__.pyChinese use__all__The scope is clearly defined, otherwise its use is not recommended.

5.3 Place the import statement in the right position

  • Put them all at the top of the file.
  • Group them in the order of standard library → third-party library → local module/package, with a blank line between each group to make it look clearer.
# 标准库
import os
import sys

# 第三方库
import pandas as pd
from pydantic import BaseModel

# 本地模块/包
from ecommerce_backend.models import user
from ecommerce_backend.utils import validator

5.4 Useif __name__ == '__main__'Do unit testing

After each module is written, you can add test code, so that the test will only be executed when the module is run directly, and will not be triggered when it is imported.

# math_utils.py
def add(a: float, b: float) -> float:
    return a + b

# 模块自测试
if __name__ == '__main__':
    print(f"add(2, 3) = {add(2, 3)}")  # 直接运行该文件时才会输出

5.5 The package structure should be as “flat” as possible

Try to avoid more than 3 levels of nested packages (e.g.a/b/c/d.py), otherwise the import path will be long and difficult to maintain. If the functionality is really thin, you can consider splitting it into multiple independent packages, or using namespace packages to manage it.


6. Practical combat: Build a small tool kit from scratch

Let's create a simplesimple_toolsThe toolkit contains two sub-modules, string and mathematics, and then use the entry file to test it.

6.1 Final package structure

simple_tools_demo/
├── __init__.py               # 空的,标识项目为包(可选)
├── simple_tools/             # 核心工具包
│   ├── __init__.py           # 简化导入用
│   ├── str_utils.py
│   └── math_utils.py
└── main.py                   # 入口文件

6.2 Core code

(1)simple_tools/str_utils.py(String Tools Module)

"""
字符串工具模块
提供常用的字符串处理函数
"""

def reverse_str(s: str) -> str:
    """反转字符串"""
    return s[::-1]

def count_vowels(s: str) -> int:
    """统计字符串中的元音字母数量(a/e/i/o/u,不区分大小写)"""
    vowels = {'a', 'e', 'i', 'o', 'u'}
    return sum(1 for char in s.lower() if char in vowels)

# 自测试
if __name__ == '__main__':
    print(reverse_str("hello"))      # 应输出 "olleh"
    print(count_vowels("Python"))    # 应输出 1

(2)simple_tools/math_utils.py(Mathematical Tools Module)

"""
数学工具模块
提供常用的数学计算函数
"""

def add(a: float, b: float) -> float:
    return a + b

def multiply(a: float, b: float) -> float:
    return a * b

if __name__ == '__main__':
    print(add(5, 6))      # 应输出 11

(3)simple_tools/__init__.py(simplify import + define visibility)

"""
simple_tools 工具包
版本:1.0.0
"""

__version__ = '1.0.0'

# 暴露常用的子模块,让外部可以直接用 from simple_tools import str_utils
from . import str_utils, math_utils

# 定义 from simple_tools import * 时会导入的公开内容
__all__ = ['str_utils', 'math_utils', '__version__']

(4)main.py(entry test file)

# 标准分组导入
from simple_tools import str_utils, math_utils, __version__

print(f"simple_tools 版本:{__version__}")
print("---")

# 测试字符串工具
test_str = "Hello, Simple Tools!"
print(f"原字符串:{test_str}")
print(f"反转字符串:{str_utils.reverse_str(test_str)}")
print(f"元音字母数量:{str_utils.count_vowels(test_str)}")
print("---")

# 测试数学工具
print(f"3.5 + 2.5 = {math_utils.add(3.5, 2.5)}")
print(f"4 * 7.2 = {math_utils.multiply(4, 7.2)}")

6.3 Running tests

Open in terminalsimple_tools_demoDirectory, execute:

python main.py

You can see all test results and verify that our toolkit is working properly.


7. Module Easter Eggs in Modern Python

7.1 Namespace package (Python 3.3+)

If you have multiple directories and want them to share a package name (such as a plug-in system), you can completely delete all related directories.__init__.py, Python will automatically merge them into a namespace package to achieve cross-directory package merging.

7.2 Clean module cache

Sometimes the module code is modified, but the old result is still the same when running the entry file - this is because.pycThe cache is not refreshed. The manual cleaning method is very simple:

  • Delete the directory where the module is located__pycache__folder.
  • or execute in Pythonimport importlib; importlib.reload(你的模块名)(It is only recommended for debugging, do not add formal code).

Summarize

Modules and packages are the cornerstone of building large-scale projects in Python. The core idea is: split the code into small units with clear logic, classify them according to the directory structure, and use standardized import methods to call.

Now hurry up and get your thousand-linemain.pyTake it apart and you will find that the joy of writing code is back 😎