Python Generator Tutorial
What is a generator
A generator is a special kind of lazy iterator in Python. Unlike a list, it does not compute all of its values at once and stuff them into memory. Instead, it waits until you actually ask for the next value and only then computes it. This on-demand evaluation can take memory usage from "disaster level" down to "friendly level" when you work with extremely large data sets, infinite sequences, and similar scenarios.
The core advantages of generators can be condensed into three points:
- ✅ Extreme memory savings: only the current execution context is kept; past values are not stored and future values are not precomputed.
- ✅ State is preserved: execution pauses automatically at `yield` and resumes from that point on the next request, with the values of local variables intact.
- ✅ Much more concise code: compared with writing a complete iterator class by hand (implementing `__iter__` and `__next__`), a generator can often cut the amount of code by more than half.
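To make the conciseness point concrete, here is a minimal sketch that writes the same counter twice: once as a hand-written iterator class and once as a generator function. The names `CountUpTo` and `count_up_to` are hypothetical and exist only for this comparison.

```python
class CountUpTo:
    """Hand-written iterator: tracks its own state and signals the end itself."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current


def count_up_to(limit):
    """Generator version: local variables hold the state between yields."""
    current = 0
    while current < limit:
        current += 1
        yield current


from_class = list(CountUpTo(3))
from_generator = list(count_up_to(3))
print(from_class, from_generator)  # [1, 2, 3] [1, 2, 3]
```

Both versions produce the same values; the generator simply lets Python handle the bookkeeping that the class spells out by hand.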
Two ways to create a generator
1. Generator expression: the lightest "lazy calculation" recipe
The syntax of a generator expression is almost identical to that of a list comprehension. The only difference is that the square brackets `[]` are replaced with parentheses `()`.
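The following sketch puts a list comprehension and a generator expression side by side. The variable names and the range of one million items are arbitrary choices for illustration, and `sys.getsizeof` measures only the container object itself, so treat the size comparison as a rough indication rather than a precise benchmark.

```python
import sys

# List comprehension: builds all one million squares immediately.
squares_list = [n * n for n in range(1_000_000)]

# Generator expression: same syntax with parentheses; nothing is computed yet.
squares_gen = (n * n for n in range(1_000_000))

# The list object grows with its length; the generator object stays small.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# Values are produced only when requested.
first_three = [next(squares_gen) for _ in range(3)]
print(first_three)  # [0, 1, 4]

# When a generator expression is the sole argument of a call,
# the extra pair of parentheses can be omitted.
total = sum(n * n for n in range(10))
print(total)  # 285
```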
💡 Tips: Generator expressions are suitable for data conversion and filtering scenarios that do not require reuse and have simple logic. If the logic exceeds one line, or requires multiple pauses/resumes, it is recommended to use generator functions.
2. Generator functions: the `yield` keyword is the soul
Once an ordinary function executes `return`, it exits completely and all of its local variables are destroyed. But as soon as the `yield` keyword appears anywhere in a function body, the Python interpreter treats the function as a generator function: calling it does not run the body immediately but returns a generator object instead.
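A minimal sketch of a generator function follows. The function name `countdown` is a hypothetical example; the point is only that calling it returns a generator object without running the body.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    current = start
    while current > 0:
        yield current   # pause here and hand `current` to the caller
        current -= 1    # execution resumes here on the next request


gen = countdown(3)      # the body has not run yet; this only builds the object
kind = type(gen).__name__
print(kind)             # generator

values = list(gen)      # consuming the generator runs the body step by step
print(values)           # [3, 2, 1]
```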
Basic usage of generators
Iterate automatically with a `for` loop (most recommended)
Python's `for` loop calls `next()` implicitly and handles the `StopIteration` exception automatically once the data is exhausted, which makes it the most hassle-free option.
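As a small illustration, the sketch below consumes a generator with a `for` loop. The function `first_n_evens` is a made-up example; the loop ends cleanly without any explicit exception handling.

```python
def first_n_evens(n):
    """Yield the first n even numbers, starting from 0."""
    for i in range(n):
        yield 2 * i


seen = []
for value in first_n_evens(4):   # the loop requests values until exhaustion
    seen.append(value)

print(seen)  # [0, 2, 4, 6]
```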
Call `next()` manually (suitable for fine-grained control)
If you need to control the pace yourself (for example when debugging or coordinating with other logic), you can use the built-in `next()` function to fetch values one at a time.
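The sketch below steps through a tiny generator by hand. It also shows what happens after the last value, and the optional default argument of `next()` that avoids the exception. The two-letter input is an arbitrary choice.

```python
gen = (ch.upper() for ch in "ab")

first = next(gen)    # 'A'
second = next(gen)   # 'B'

# A third call has nothing left to produce and raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

# next() accepts a default that is returned instead of raising.
fallback = next(gen, None)

print(first, second, exhausted, fallback)  # A B True None
```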
Advanced features of generators
1. A clear look at state retention
Adding a few print statements to a generator function makes the switching between running and pausing visible.
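One way to watch the pause-and-resume behaviour is sketched below. The function `stepper` and its messages are illustrative only; the comments describe which `print` call fires at each step.

```python
def stepper():
    print("started")
    total = 1
    yield total
    print("resumed after first yield, total is still", total)
    total += 10
    yield total
    print("resumed after second yield, finishing")


gen = stepper()            # nothing is printed here: the body has not started
print("generator created")

a = next(gen)              # prints "started", then pauses at the first yield
b = next(gen)              # prints the "resumed after first yield" line
c = next(gen, "done")      # runs to the end; the default replaces StopIteration

print(a, b, c)             # 1 11 done
```

Note that "generator created" is printed before "started", which shows that creating the generator does not execute any of its body.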
⚠️ Emphasis: a generator **must first be advanced with `next()`** (or implicitly by a `for` loop) before it starts executing; it does not run automatically when it is created.
2. New in Python 3.3+: Generator with return value
An ordinary generator simply raises `StopIteration` when it reaches the end, with no return value. Starting with Python 3.3, a generator can use `return` to provide a "final result"; that value is attached to the exception's `value` attribute, where the calling code can retrieve it.
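A minimal sketch of retrieving that final result is shown below. The generator `average_of` is a hypothetical example that passes numbers through and returns their average at the end.

```python
def average_of(numbers):
    """Yield each number unchanged, then return the average of all of them."""
    total = 0
    count = 0
    for n in numbers:
        total += n
        count += 1
        yield n
    # The returned value ends up in StopIteration.value.
    return total / count if count else None


gen = average_of([2, 4, 6])
passed_through = []
while True:
    try:
        passed_through.append(next(gen))
    except StopIteration as stop:
        result = stop.value
        break

print(passed_through, result)  # [2, 4, 6] 4.0
```

A plain `for` loop swallows the `StopIteration` exception and therefore discards the return value, which is why this sketch drives the generator with `next()` directly.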
3. New in Python 3.3+: delegating to a subgenerator with `yield from`
When you want to iterate over another iterable inside a generator and yield its items one by one, you do not have to write `for item in it: yield item` by hand. `yield from it` performs the "delegation" directly: it hands over every element of the inner iterable in turn and also takes care of chores such as exception propagation.
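The sketch below shows two uses of `yield from`: flattening nested iterables, and capturing the return value of a subgenerator. All function names here are hypothetical.

```python
def flatten_once(groups):
    """Yield every item of every group, one level deep."""
    for group in groups:
        yield from group   # same effect here as: for item in group: yield item


flat = list(flatten_once([[1, 2], (3,), "ab"]))
print(flat)  # [1, 2, 3, 'a', 'b']


def inner():
    yield "x"
    return "inner result"


def outer():
    # The yield from expression evaluates to the subgenerator's return value.
    result = yield from inner()
    yield result


outer_values = list(outer())
print(outer_values)  # ['x', 'inner result']
```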
Three common practical applications
1. Infinite Fibonacci Sequence
Storing an infinite sequence in a list is impossible, but a generator can easily express "infinitely long" logic.
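A minimal sketch of an endless Fibonacci generator follows. Because it never stops on its own, the example uses `itertools.islice` to take only the first ten values; the cut-off of ten is arbitrary.

```python
from itertools import islice


def fibonacci():
    """Yield Fibonacci numbers forever, starting from 0."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Calling `list(fibonacci())` directly would never finish, so an infinite generator should always be paired with something that limits consumption, such as `islice` or a `break` in a loop.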
2. Read very large files line by line
Faced with log files of hundreds of MB or even several GB, reading everything into memory at once with `file.readlines()` can exhaust the available memory. Reading line by line with a generator loads only one line at a time, which is far more memory-friendly.
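The sketch below wraps line-by-line reading in a generator. The helper name `read_lines`, the sample log content, and the "ERROR" prefix are all assumptions made for illustration; a tiny temporary file stands in for a real multi-gigabyte log so that the example can run anywhere.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Yield lines one at a time with the trailing newline stripped."""
    with open(path, encoding=encoding) as handle:
        for line in handle:            # the file object itself iterates lazily
            yield line.rstrip("\n")


# Small stand-in file so the sketch runs; a real log would be far larger.
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO done\n")
    log_path = tmp.name

# Only one line is held in memory at a time while filtering.
error_lines = [line for line in read_lines(log_path) if line.startswith("ERROR")]
print(error_lines)  # ['ERROR disk full']

os.remove(log_path)  # clean up the stand-in file
```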
3. Data processing pipeline (chain combination)
Generators work seamlessly with functional tools such as `filter()` and `map()` to build lazily evaluated data processing pipelines. None of the earlier steps actually runs until the last step (for example `list()` or a `for` loop) starts consuming data, at which point the whole pipeline runs, stage by stage.
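To illustrate the laziness, the sketch below adds a hypothetical `traced` stage that records every value it passes along. The recorded list stays empty until the final `list()` call pulls data through the pipeline.

```python
pulled = []


def traced(values):
    """Pass values through unchanged, recording each one as it is pulled."""
    for v in values:
        pulled.append(v)
        yield v


source = traced(range(1, 11))
evens = filter(lambda n: n % 2 == 0, source)   # lazy: nothing runs yet
squared = map(lambda n: n * n, evens)          # lazy: still nothing runs

before = list(pulled)      # snapshot taken before any consumption
result = list(squared)     # consumption starts here

print(before)              # []
print(result)              # [4, 16, 36, 64, 100]
print(len(pulled))         # 10
```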
A quick addendum: generators are single-use
A generator can be iterated only once. Once all of its values have been produced it is "exhausted", and you must create a new one to iterate again. If you really need repeated access to the data, you can convert the generator to a list up front, but that gives up the memory savings.
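A short sketch of the single-use behaviour follows; the doubled values are just placeholder data.

```python
gen = (n * 2 for n in range(3))

first_pass = list(gen)     # consumes every value
second_pass = list(gen)    # the generator is already exhausted

print(first_pass)          # [0, 2, 4]
print(second_pass)         # []

# Converting to a list up front allows repeated access,
# at the cost of holding every value in memory.
cached = list(n * 2 for n in range(3))
total_twice = sum(cached) + sum(cached)
print(total_twice)         # 12
```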
Small exercise: output Yang Hui's triangle with a generator
Try to implement a generator that yields the rows of Yang Hui's triangle (also known as Pascal's triangle) without end.
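One possible reference answer is sketched below; it is not the only way to solve the exercise. The function name `yanghui_triangle` is an arbitrary choice, and `itertools.islice` is used to take the first five rows from the endless generator.

```python
from itertools import islice


def yanghui_triangle():
    """Yield the rows of Yang Hui's triangle forever, starting with [1]."""
    row = [1]
    while True:
        yield row
        # Each inner entry is the sum of the two entries above it;
        # a new list is built so earlier rows are never modified.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]


rows = list(islice(yanghui_triangle(), 5))
for r in rows:
    print(r)
# [1]
# [1, 1]
# [1, 2, 1]
# [1, 3, 3, 1]
# [1, 4, 6, 4, 1]
```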
Summary
Generators are a very "Pythonic" tool. Prefer them in the following scenarios:
- Processing very large data sets or large files (avoiding memory blowups)
- Representing infinite sequences (such as Fibonacci or endless data streams)
- Building lazily evaluated data processing pipelines (improving performance and readability)
- Learning the basic concepts of coroutines (understanding what `yield` does lays the foundation for `async`/`await` later)
Although generators may be slightly slower than list comprehensions in terms of pure calculation speed, their huge advantages in memory usage usually far outweigh that slight speed penalty - especially when you need to process massive amounts of data, generators are your best friend.

