SQL Detailed Explanation of DDL - Complete Guide to Database Definition Language | Daoman PythonAI

#SQL Detailed DDL - Complete Guide to Database Definition Language

Introduction

Entering the world of database design, have you ever been troubled by these scenarios: a table creation statement that runs well with MySQL locally, but PostgreSQL reports an error mercilessly once it is put into a production environment; you want toupdated_atFields are updated automatically. MySQL can easily do it in one line, but PostgreSQL requires triggers and stored procedures. Don't worry, this is a true portrayal of the "different personalities" of DDL (data definition language) in different databases.

Today’s article is prepared for you. We will dismantle the core commands of DDL in a simple and easy-to-understand manner, and put the two mainstream relational-databases - MySQL and PostgreSQL - together to compare their practical differences. There are no arcane theories in the article, only code examples and pitfall avoidance guides that you will use in actual development.


1. DDL Core Basics: What exactly is it?

If the database is compared to a warehouse, then DDL (Data Definition Language) is the construction team that builds the warehouse and plans the layout of shelves and rooms. It is responsible for defining and modifying the structure of the database, such as libraries, tables, indexes, and views.

DML (Data Manipulation Language) is like the daily administrator of a warehouse, responsible for the entry and exit of goods (data) and organization.

The core difference between them lies in the way transactions are submitted:

Language TypeMain CommandsCore PurposeTransaction Features
DDLCREATE/ALTER/DROP/TRUNCATEDefine or modify structureAutocommit, cannot rollback
DMLINSERT/UPDATE/DELETEManipulate the data itselfManually requiredCOMMITorROLLBACK

IMPORTANT WARNING: Because DDL operations are automatically submitted, they cannot be undone once executed. Therefore, before executing any DDL in the production environment, be sure to back it up first and fully verify it in the test environment.


2. Database-level operations: everything starts here

2.1 Create database

When creating a database, the character set and proofreading rules are directly related to whether Chinese garbled characters will be generated, and are configuration habits that must be developed.

-- ========== MySQL 示例 ==========
-- 安全创建数据库,指定国际通用字符集
CREATE DATABASE IF NOT EXISTS company_db
CHARACTER SET utf8mb4
COLLATE utf8mb4_unicode_ci;

-- 查看所有数据库
SHOW DATABASES;
-- 切换到目标数据库
USE company_db;
-- ========== PostgreSQL 示例 ==========
-- 需明确指定所有者和编码
CREATE DATABASE company_db
WITH
    OWNER = postgres
    ENCODING = 'UTF8'
    LC_COLLATE = 'en_US.UTF-8'
    LC_CTYPE = 'en_US.UTF-8';

-- 在 psql 客户端中,查看和切换数据库的命令有些不同
\l                  -- 列出所有数据库
\c company_db;      -- 连接到目标数据库

2.2 Delete database

Deletion is a dangerous operation,IF EXISTSIt is the best "safety rope" to avoid errors due to the non-existence of the database.

-- ========== MySQL 示例 ==========
DROP DATABASE IF EXISTS old_database;

PostgreSQL's deletion operation is relatively "cautious" and requires that no other sessions are connected to the database.

-- ========== PostgreSQL 示例 ==========
-- 【关键步骤!】首先,强制断开所有目标数据库的连接
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'old_database' AND pid <> pg_backend_pid();

-- 然后,才能顺利删除
DROP DATABASE IF EXISTS old_database;

3. Table operation: the main battlefield of differences

The creation and modification of tables is the most obvious difference between MySQL and PostgreSQL.

3.1 Create the table and handle the auto-increment primary key and automatic update time

This is a very classic requirement: an employee table needs to have an auto-increment ID, salary verification and automatically updated timestamp.

-- ========== MySQL 示例 ==========
CREATE TABLE employees (
    employee_id INT AUTO_INCREMENT PRIMARY KEY, -- 自增主键,简洁明了
    first_name VARCHAR(50) NOT NULL,
    salary DECIMAL(10,2) CHECK (salary > 0),
    -- 创建时间,记录一次就好
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    -- 更新时间,数据变更时自动刷新,MySQL 的特性之一
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES departments(department_id)
) ENGINE=InnoDB; -- 明确指定支持事务和外键的存储引擎

PostgreSQL takes another approach, and its syntax is closer to the SQL standard, but convenient features such as automatically updating timestamps need to be implemented through triggers.

-- ========== PostgreSQL 示例 ==========
-- 1. 创建表结构
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY, -- SERIAL 是 PostgreSQL 标准的自增整型
    first_name VARCHAR(50) NOT NULL,
    salary NUMERIC(10,2) CHECK (salary > 0), -- 建议使用 NUMERIC 代替 DECIMAL
    -- 推荐使用带时区的,时间是绝对清晰的
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    department_id INTEGER,
    CONSTRAINT fk_department -- 给外键约束一个好名字,方便后续维护
        FOREIGN KEY (department_id)
        REFERENCES departments(department_id)
        ON DELETE SET NULL   -- 当部门被删除时,将此字段置为空
);

-- 2. 创建触发器函数,用于更新 updated_at 字段
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW(); -- 每当行被更新,new.updated_at 就被赋值为当前时间
    RETURN NEW;
END;
$$ language 'plpgsql';

-- 3. 将触发器应用到 employees 表,在更新每一行之前触发
CREATE TRIGGER update_employees_updated_at
    BEFORE UPDATE ON employees
    FOR EACH ROW
    EXECUTE PROCEDURE update_updated_at_column();

3.2 Modify table structure

Modifying columns is a high-frequency operation in daily development, but the syntax difference between the two is not small. MySQL is more "integrated into one", and PostgreSQL is more "distinct".

-- ========== MySQL 示例 ==========
-- 1. 添加新列
ALTER TABLE employees ADD COLUMN mobile VARCHAR(20);
-- 2. 修改列类型 (MODIFY) 或 重命名列及类型 (CHANGE)
ALTER TABLE employees
    MODIFY COLUMN salary DECIMAL(12,2),       -- 仅修改类型
    CHANGE COLUMN mobile phone VARCHAR(20);   -- mobile 重命名为 phone,同时可改类型
-- 3. 删除列或索引
ALTER TABLE employees DROP COLUMN phone, DROP INDEX idx_name;
-- 4. 重命名表
ALTER TABLE employees RENAME TO staff;
-- ========== PostgreSQL 示例 ==========
-- 1. 添加新列
ALTER TABLE employees ADD COLUMN mobile VARCHAR(20);
-- 2. 修改列类型 与 重命名列,是两个独立操作
ALTER TABLE employees ALTER COLUMN salary TYPE NUMERIC(12,2);
ALTER TABLE employees RENAME COLUMN mobile TO phone;
-- 3. 删除列或约束(必须通过约束名称)
ALTER TABLE employees DROP COLUMN phone, DROP CONSTRAINT fk_department;
-- 4. 重命名表
ALTER TABLE employees RENAME TO staff;

3.3 Delete table

-- ========== MySQL 示例 ==========
DROP TABLE IF EXISTS employees;

In PostgreSQL, you can choose cascade delete to clear all objects that depend on the table (such as views, foreign keys, etc.) at once. It is convenient but must be used with caution.

-- ========== PostgreSQL 示例 ==========
-- RESTRICT (默认):有依赖对象时禁止删除
-- CASCADE:将依赖对象一并删除,清理得干净彻底
DROP TABLE IF EXISTS employees CASCADE;

4. Advanced indexing gameplay: PostgreSQL’s shining moment

Indexes are created to speed up queries. Both MySQL and PostgreSQL support basic indexes, but PostgreSQL provides some very powerful advanced index features that can greatly improve performance and save space in specific scenarios.

4.1 Basic Index

-- MySQL / PostgreSQL 通用
-- 普通索引,加速 last_name 字段的查询
CREATE INDEX idx_last_name ON employees(last_name);
-- 唯一索引,保证 email 唯一的同时加速查询
CREATE UNIQUE INDEX idx_employee_email ON employees(email);

This is the embodiment of PostgreSQL as a "developer-friendly" database.

-- ========== PostgreSQL 示例 ==========

--【特色1:部分索引】只对活跃员工建立索引。如果活跃员工只占少数,这能节省大量空间和查询时间。
CREATE INDEX idx_active_employees ON employees(employee_id)
WHERE status = 'active';

--【特色2:表达式索引】如果你经常查询大写格式的姓名,可以直接在函数结果上建索引。
CREATE INDEX idx_upper_name ON employees( UPPER(last_name) );

--【特色3:并发创建索引】在大表上创建索引时,如果不加 CONCURRENTLY,表会被锁定。
-- 加上它,创建过程就不会阻塞正常的读写操作,对业务无感知。
CREATE INDEX CONCURRENTLY idx_large_table ON large_table(column_name);

4.3 Delete index

-- ========== MySQL 示例 ==========
DROP INDEX idx_last_name ON employees;
-- ========== PostgreSQL 示例 ==========
-- 直接指定索引名删除,更简洁
DROP INDEX IF EXISTS idx_last_name;

5. View operation: from ordinary view to materialized view

The view is like a "virtual window" that encapsulates complex queries, making it as simple as looking up a table. Both support this.

5.1 Normal view

-- MySQL / PostgreSQL 通用
-- 创建一个视图,只展示活跃员工的关键信息
CREATE VIEW active_employees AS
SELECT employee_id, first_name, last_name, salary
FROM employees
WHERE status = 'active';

-- 修改或替换现有视图
CREATE OR REPLACE VIEW active_employees AS
SELECT employee_id, first_name, last_name, salary, hire_date
FROM employees
WHERE status = 'active';

5.2 PostgreSQL’s trump card: materialized views

A normal view needs to re-execute the underlying SQL every time it is queried. But if your query is complex (such as monthly reports) and the data does not change frequently, Materialized View is the perfect solution. It will save the query results like a "physical table", and the query speed is extremely fast.

-- ========== PostgreSQL 示例 ==========
-- 1. 创建一个按月统计销售额的物化视图
CREATE MATERIALIZED VIEW monthly_sales_summary AS
SELECT
    DATE_TRUNC('month', sale_date) as month,
    SUM(amount) as total_sales
FROM sales
GROUP BY DATE_TRUNC('month', sale_date);

-- 2. 当原始 sales 表有新数据后,需要手动或定时刷新视图以同步数据
REFRESH MATERIALIZED VIEW monthly_sales_summary;

6. Constraint operations: CHECK vs. ENUM

Data integrity cannot exist without constraints. Let’s use an “order status” example to see the pros and cons of the two implementation ideas.

-- ========== MySQL 示例 ==========
-- 使用 ENUM 类型,把选项限制在定义好的范围内
CREATE TABLE orders (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    status ENUM('pending', 'completed', 'cancelled') DEFAULT 'pending'
);
-- ENUM的缺点:后期想增加或修改选项,需要执行 ALTER TABLE,成本较高
-- ========== PostgreSQL 示例 ==========
-- 使用 CHECK 约束 + 普通字符串类型,语义相同但灵活得多
CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    status VARCHAR(20) DEFAULT 'pending'
        CHECK (status IN ('pending', 'completed', 'cancelled'))
);
-- CHECK 约束的优势:修改选项列表时,成本相对较低,逻辑也更清晰

7. DDL best practices: make your library tables more robust

7.1 The power of naming conventions

Uniform naming conventions are the cornerstone of database maintainability. It is recommended to use lowercase letters, underscores, and give constraints a concise name.

-- 好的实践:名称自解释,约束命名有规律
CREATE TABLE user_accounts (
    account_id SERIAL PRIMARY KEY,
    CONSTRAINT fk_user_orders FOREIGN KEY (user_id) REFERENCES users(user_id),
    CONSTRAINT chk_positive_amount CHECK (amount > 0)
);

7.2 Data type pitfall avoidance guide

  • Storage Amount: Use firmlyDECIMALorNUMERIC. Do not useFLOATorDOUBLE, they produce a loss of precision, a fatal flaw when dealing with money.
  • Storage Time:
  • PostgreSQL: preferredTIMESTAMP WITH TIME ZONE, it will completely record the time information.
    • MySQLTIMESTAMPis a good choice, it will convert according to time zone.DATETIMENo, it is "what you see is what you get" and does not carry time zone information.
  • Storage text: Used for fields with relatively fixed lengthVARCHAR(n), fields with huge differences in length can be used with confidence.TEXT

7.3 Balance between security and performance

  • Backup is the first priority: Before executing DDL in the production environment, must back up data.
  • Select business low peak periods: When making structural modifications to large tables, try to perform maintenance during low business peak periods such as early morning.
  • Make good use of concurrency features: In PostgreSQL, use it firstCREATE INDEX CONCURRENTLYto avoid locking the entire table when creating an index.
  • Periodic "slimming": Regularly check and clean up indexes that are no longer used or are ineffective, which slow down writes and waste storage.

8. MySQL and PostgreSQL DDL Difference Cheat Sheet

This table can be used as a "quick reference card" for your future development.

FeaturesMySQLPostgreSQL
Auto-increment primary keyAUTO_INCREMENTSERIAL
Automatically update timestampON UPDATE CURRENT_TIMESTAMPNeed to create triggers manually
Partition TableDirectly inCREATE TABLEmiddlePARTITION BYAfter creating the main table, create the subtable and declarePARTITION OF
Materialized ViewNot supportedSyntax support, need to be done manuallyREFRESH
Concurrent DDLLimited support for online DDLCREATE INDEX CONCURRENTLYPowerful Features
enumeration typeENUMType, modification is costlyRecommendedCHECKConstraints, more flexibility
Time with time zoneTIMESTAMP, but behavior depends on system settingsTIMESTAMP WITH TIME ZONE, native and clear

Before executing DDL, first perform a 1:1 restore and verification in the test environment, and back up the core data at the same time; select large table changes during the early morning off-peak period.

Summarize

DDL is the cornerstone on which the world of data is built. The choice between MySQL and PostgreSQL is often not about advantages or disadvantages, but about the difference in scenarios: one pursues lightness and simplicity, while the other pursues standards and power. Understanding and mastering their core differences in DDL will allow you to design the database with ease and avoid the "magic pits" that make you work overtime late at night. I hope this guide will become a handy addition to your toolbox.