relational-database tutorial - A complete guide to MySQL and PostgreSQL
Whether you are a beginner who has just started to get into back-end development, or a developer who has been writing CRUD for several years and wants to make up for your shortcomings, relational-database is the technical cornerstone that must be firmly mastered. This tutorial focuses on two major open source mainstream databases - MySQL 8.0+ and PostgreSQL 14+. From core concepts, SQL language to Python practice, it will take you to establish complete system capabilities from table creation to optimization, from native query to ORM programming.
The tutorial adopts a three-layer structure of "Basic + Advanced + Python Implementation". You can choose to skip reading or study in order according to your own level. All the codes in this article have been actually verified, and the differences between the two models are marked to facilitate your quick selection or migration.
1. First understand the core logic of relational-database
relational-database (RDBMS) stores data in Table, which is composed of Row and Column. Each table can be connected with other tables through Foreign Key. The core guarantee of this model is ACID:
- Atomicity: All operations in the transaction either succeed or are all rolled back.
- Consistency: The data before and after the transaction must satisfy all predefined rules (constraints, triggers).
- Isolation: Concurrent transactions do not interfere with each other.
- Durability: Committed transaction changes will not be lost even if the system crashes.
Both MySQL and PostgreSQL fully support ACID, but each has different emphasis on implementation details and advanced features: MySQL is known for its simplicity and ease of use, and mature separation of reading and writing; PostgreSQL is closer to the SQL standard and has more powerful support for complex queries, geographical space, and JSON structured data.
2. Quick installation and tool preparation
MySQL 8.0
- macOS:
brew install mysql - Windows: Download the community version MSI installation package and set the root password during installation.
- Linux:
sudo apt install mysql-server
PostgreSQL 14
- macOS:
brew install postgresql - Windows: Download the installer from the official website
- Linux:
sudo apt install postgresql
After installation, it is recommended to use visualization tools: MySQL Workbench, pgAdmin 4 or VS Code database plug-in to facilitate visual viewing of the table structure and execution results in the early stage.
3. SQL language five-piece set: DDL / DML / DQL / DCL / TCL
SQL can be divided into five categories according to functions. This is the skeleton for learning relational-database.
1. DDL (data definition language) - create databases, tables, and define constraints
Create database
Create table
Take the two tables of classes and students as an example:
Use of auto-increment columns in PostgreSQLSERIALorGENERATED AS IDENTITY:
Create a student table and associate classes with foreign keys
PostgreSQL writing is almost the same, just note that foreign key constraints are defined in the same way.
If you want to delete a table or database, useDROP TABLE student;orDROP DATABASE school;, operate with caution.
Common constraints
PRIMARY KEY: Uniquely identifies a row and automatically creates a unique indexFOREIGN KEY: Ensure referential integrityUNIQUE: Value cannot be repeatedNOT NULL: Empty is not allowedDEFAULT: Set default value
2. DML (data manipulation language) - adding, deleting and modifying data
Insert data
UPDATE DATA
Be careful: do not addWHEREThe condition will update the entire table.
Delete data
Again, noWHEREThe entire table will be cleared. Please think twice about this in production environments.
3. DQL (Data Query Language) - Flexible access to data
Basic Query
Fuzzy query, range and sorting
Multi-table association
Aggregation and Grouping
Subquery
4. DCL (Data Control Language) - Users and Permissions
The commands for PostgreSQL are slightly different, but the idea is the same: useCREATE ROLE、GRANTGrant permissions and useREVOKERecycle.
Principle of minimizing permissions: Only give the application the permissions it actually needs. Do not give them directly for convenience.ALL。
5. TCL (Transaction Control Language) - Ensure atomicity of operations
MySQL automatically commits by default, and production environments are often set to manual commit or explicit wrapping of transactions. PostgreSQL also recommends using explicitBEGIN…COMMIT/ROLLBACK。
4. Advanced tools: index optimization and execution plan
When the amount of data exceeds 100,000, index design directly affects performance. B+ tree is the most commonly used index structure. Understanding the "leftmost prefix principle" and "covering index" can avoid most slow queries.
Create index
Leftmost prefix principle: Union index(class_id, name)InquiryWHERE class_id = 1orWHERE class_id = 1 AND name = '张三'effective at the time, butWHERE name = '张三'The index cannot be exploited.
Use EXPLAIN to analyze the query
focus ontypefields (such asALLIndicates full table scan and needs to be optimized),keyThe index used and the number of scanned rows.
Common index failure scenarios:
- Perform functional operations on index columns, such as
WHERE YEAR(create_time) = 2021 - Leading fuzzy query
LIKE '%abc' - Implicit type conversion, such as comparing string fields to numbers
OROne side of the condition is not using an index
5. Views, functions and stored procedures (moderate use)
View can encapsulate complex queries and simplify application layer code:
Stored functions and procedures are suitable for sinking business logic into the database, but do not overuse it, otherwise it will reduce code maintainability and migration flexibility. It is generally recommended to only be used in scenarios such as data cleaning and batch statistics, and the business logic should be placed in the application layer as much as possible.
6. Python Practical Combat: From Native Driver to ORM
1. Native connection to MySQL
Connecting to PostgreSQL can be done usingpsycopg2, the interface is similar.
2. Parameterized query (prevent SQL injection)
Never use string concatenation to construct SQL, always use%s(MySQL) or%s(Postgres) placeholder and pass the parameters in as a tuple. This is the most effective way to prevent SQL injection.
3. Getting started with SQLAlchemy ORM
ORM can automatically handle the mapping between objects and tables, and supports advanced functions such as transactions and connection pools. SQLAlchemy 2.0 fully embraces asynchrony and is the first choice for modern Python projects.
7. Practical project suggestions
If you don’t just want to practice fake moves, I suggest you do it in the following order:
- Class Attendance Prototype: Implement three tables of classes, students, and attendance records, and practice multi-table related queries and aggregation statistics.
- E-commerce order table optimization: Import 100,000 pieces of simulated data, use EXPLAIN to analyze slow queries, try to build single-column indexes and joint indexes, and compare performance changes.
- Blog backend data layer: Use FastAPI + SQLAlchemy to build CRUD interfaces for users, articles, and comments, and add paging, transactions, and connection pools.
8. Quick Check on Frequently Asked Questions
**Q: Which one should I choose to get started with MySQL or PostgreSQL? ** A: MySQL is the first choice for pure beginners. It has rich community resources and it is easier to search for answers when reporting errors. If your business relies on JSON, geographical information, or has extremely high data integrity requirements, you will not regret it if you directly use PostgreSQL.
**Q: Is the more indexes the better? **
A: Absolutely not. Indexes can slow down write operations and occupy disk space. Only for high-frequency queriesWHERE、JOIN ON、ORDER BYColumns are indexed.
**Q: Should I use native driver or ORM for Python database programming? ** A: Use native drivers for small tools and simple scripts; for medium and large-scale team collaboration projects, SQLAlchemy ORM is recommended, which can significantly reduce duplicate code and improve maintainability.
What to learn next?
After mastering the content of this tutorial, you can continue to delve deeper:
- Cache Layer: Redis implements hotspot data caching
- Distributed extension: MySQL master-slave replication, PostgreSQL streaming replication
- Cloud practice: Alibaba Cloud RDS, AWS Aurora and other cloud database services
- NoSQL: MongoDB, ClickHouse cope with unstructured or analytical scenarios
I wish you can build a robust and efficient data layer as soon as possible and lay a solid foundation for more complex systems.

