Python Multithreaded Programming Guide
Concurrent programming is a common way to improve the performance of Python applications. Multithreading is especially popular for I/O-bound workloads (such as crawlers, database operations, and network requests) because threads are cheap to create, share data easily, and are simple to use. This article walks through the core of Python multithreading: basic usage, common pitfalls, the limits imposed by the GIL, and best practices. It runs to roughly 3,000 words.
1. Thread basics
Python provides multithreading support through the standard library module threading, which wraps the operating system's native thread interface (Win32 threads on Windows, pthreads on Linux/macOS). Every process starts with one main thread (MainThread), from which we can create additional child threads to run tasks concurrently.
1.1 Two ways to create threads
The most common approach is to pass a function to threading.Thread, which is concise and intuitive. If you need finer-grained control (such as a reusable thread class), you can subclass threading.Thread instead.
Function-based creation (recommended for beginners)
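A minimal sketch of function-based creation. The download() task, its arguments, and the worker names are hypothetical, and time.sleep() stands in for real blocking I/O.

```python
import threading
import time

results = []  # appending to a list from several threads is safe in CPython

def download(name, delay):
    """Hypothetical task: pretend to fetch something, then record it."""
    time.sleep(delay)  # stands in for blocking I/O
    results.append(f"{name} done in {threading.current_thread().name}")

threads = [
    threading.Thread(target=download, args=(f"task-{i}", 0.1), name=f"worker-{i}")
    for i in range(3)
]
for t in threads:
    t.start()  # begin running download() in a new thread
for t in threads:
    t.join()   # block until that thread has finished

print(results)
```

The order of the entries in results depends on which thread finishes first, so it can change between runs.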
Subclass-based creation (suited to complex reuse)
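A sketch of the subclass approach, assuming a hypothetical DownloadThread class that keeps its own result. The key points are calling the base initializer and overriding run(), which is the method that start() executes in the new thread.

```python
import threading
import time

class DownloadThread(threading.Thread):
    """Hypothetical reusable worker that stores its own result."""

    def __init__(self, url, delay=0.1):
        super().__init__(name=f"download-{url}")  # always call the base initializer
        self.url = url
        self.delay = delay
        self.result = None

    def run(self):
        # run() is what start() executes in the new thread
        time.sleep(self.delay)  # stands in for a network request
        self.result = f"fetched {self.url}"

workers = [DownloadThread(u) for u in ("a.example", "b.example")]
for w in workers:
    w.start()
for w in workers:
    w.join()

print([w.result for w in workers])
```

Call start(), not run(): calling run() directly would execute the task in the calling thread.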
1.2 Commonly used thread attributes/methods
The threading module comes with several handy module-level functions for debugging and managing threads:
- threading.current_thread(): returns the Thread object for the currently executing thread
- threading.active_count(): returns the number of threads currently alive (including the main thread)
- threading.enumerate(): returns a list of all threads currently alive
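A short sketch of these three functions in use. The idle-worker thread is a hypothetical placeholder that simply waits on an Event, so that it is still alive when the calls are made.

```python
import threading

stop = threading.Event()

def idle():
    stop.wait()  # park here until the main thread says to exit

worker = threading.Thread(target=idle, name="idle-worker")
worker.start()

current_name = threading.current_thread().name   # name of the calling thread
names = [t.name for t in threading.enumerate()]  # every thread still alive
count = threading.active_count()                 # how many threads are alive
print(current_name, count, names)

stop.set()     # let the worker finish
worker.join()
```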
2. Thread synchronization and locking
Multithreading's biggest advantage is also its biggest pitfall: all threads share the memory space of the same process. If several threads modify the same global variable at the same time, a race condition occurs and the result becomes unpredictable.
2.1 Race condition example
For example, consider an accumulator in which 10 threads each add 1 to a shared counter 100,000 times. The expected result is 1,000,000, but the actual result can differ from run to run and is usually less than 1,000,000.
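A sketch of such an accumulator. The read and the write are deliberately split into two statements to make the unprotected read-modify-write visible. How many updates are actually lost depends on the interpreter version and the machine, and on some CPython releases the total may come out correct on a given run.

```python
import threading

counter = 0

def add_many(n):
    global counter
    for _ in range(n):
        current = counter      # read the shared value
        counter = current + 1  # write back: another thread may have updated it in between

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"expected 1000000, got {counter}")
```

A lost update can only lower the total, so the result never exceeds 1,000,000.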
2.2 Using Lock (a mutex) to fix the race
threading.Lock is the simplest synchronization primitive. Its rule is that only one thread can hold the lock at a time; any other thread that tries to acquire it must wait until it is released.
The recommended way to manage a lock is the with statement, which acquires and releases it automatically and so avoids the deadlock caused by forgetting to release the lock.
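A minimal sketch of a lock-protected counter, with 10 threads each adding 100,000 times. The names counter_lock and add_many are illustrative.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with counter_lock:  # acquired here, released on leaving the block, even on error
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

Only the increment itself is inside the with block. Wrapping the whole loop would also be correct, but each thread would then run its loop to completion while the others wait.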
2.3 Other commonly used synchronization primitives
Besides the basic Lock, several more advanced primitives cover more complex scenarios:
- threading.RLock: a reentrant lock that the same thread can acquire multiple times (suited to recursive calls)
- threading.Condition: a condition variable, used together with a lock, for wait/notify communication between threads
- threading.Semaphore: a semaphore that limits how many threads can access a resource at the same time (for example, capping the number of database connections)
- threading.Event: an event object for simple on/off signaling between threads
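A sketch combining two of these primitives: an Event holds all workers back until a start signal, and a Semaphore caps how many run the "query" at once. MAX_CONNECTIONS, query(), and the bookkeeping variables are hypothetical, and time.sleep() stands in for a database call.

```python
import threading
import time

MAX_CONNECTIONS = 2                    # hypothetical limit for this sketch
slots = threading.Semaphore(MAX_CONNECTIONS)
start_signal = threading.Event()

state_lock = threading.Lock()          # protects the two counters below
active = 0
peak = 0

def query(_):
    global active, peak
    start_signal.wait()                # every worker blocks until the event is set
    with slots:                        # at most MAX_CONNECTIONS threads get past here
        with state_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)               # stands in for a database call
        with state_lock:
            active -= 1

threads = [threading.Thread(target=query, args=(i,)) for i in range(6)]
for t in threads:
    t.start()
start_signal.set()                     # release all waiting workers at once
for t in threads:
    t.join()

print(f"peak concurrent queries: {peak}")
```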
3. The unavoidable GIL (global interpreter lock)
Many beginners wonder: "Why does Python multithreading make CPU-bound tasks (such as scientific computing and image processing) slower instead of faster?" The answer lies in the GIL (Global Interpreter Lock) of the CPython interpreter.
3.1 The essence of GIL
The GIL is an implementation detail of the CPython interpreter (other interpreters such as Jython and IronPython do not have one). Its rule is that at any moment only one thread executes Python bytecode, and every other thread must wait for the GIL to be released.
The GIL is released at two main points:
- When a thread blocks on I/O (such as time.sleep(), network requests, or file reads and writes)
- Since Python 3.2, when a running thread has held the GIL for the switch interval (5 milliseconds by default) and another thread is waiting, the interpreter forces it to release the GIL, whether or not any I/O occurs
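The switch interval can be inspected from Python with sys.getswitchinterval(), which returns the value in seconds. A small sketch:

```python
import sys

interval = sys.getswitchinterval()  # in seconds; CPython's default is 0.005
print(f"switch interval: {interval * 1000:.1f} ms")

# sys.setswitchinterval(0.001) would request more frequent switches,
# at the cost of extra switching overhead.
```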
3.2 Impact of GIL
- I/O-bound tasks: multithreading still helps. The GIL is released while a thread waits on I/O, so other threads can run instead of the CPU sitting idle.
- CPU-bound tasks: multithreading cannot use multiple cores effectively. Only one thread executes bytecode at a time, and the forced switching adds overhead.
- Computation in C extensions: if the computation lives in a C/C++ extension, the extension can release the GIL explicitly, and the threads can then run on multiple cores in parallel.
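The I/O-bound case can be sketched with time.sleep() standing in for a blocking request (fake_request is a hypothetical name). Exact timings vary by machine, so the code prints them rather than assuming specific values.

```python
import threading
import time

def fake_request():
    time.sleep(0.2)  # blocking wait; the GIL is released while sleeping

# One after another: the waits add up
start = time.perf_counter()
for _ in range(4):
    fake_request()
sequential = time.perf_counter() - start

# Four threads: the waits overlap
start = time.perf_counter()
threads = [threading.Thread(target=fake_request) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

Replacing the sleep with a pure-Python computation would remove most of this gain, because only one thread can execute bytecode at a time.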
3.3 Solutions for dealing with GIL restrictions
- Use multiple processes: the multiprocessing module gives each process its own interpreter and its own GIL, which sidesteps the limitation entirely (suited to CPU-bound work)
- Asynchronous programming: asyncio provides single-threaded cooperative concurrency with no thread-switching overhead (suited to high-concurrency I/O-bound work)
- Use another interpreter: Jython (Java-based) and IronPython (.NET-based) have no GIL, but their ecosystems are less complete than CPython's
- C extensions/Cython: implement the core computation in C or Cython and release the GIL there
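As an illustration of the asyncio option only, here is a minimal sketch in which three coroutines wait concurrently on a single thread. The fetch() coroutine and its delays are hypothetical, with asyncio.sleep() standing in for a non-blocking network call.

```python
import asyncio

async def fetch(name, delay):
    await asyncio.sleep(delay)  # hands control back to the event loop while waiting
    return f"{name} done"

async def main():
    # gather() runs the coroutines concurrently and returns results in argument order
    return await asyncio.gather(
        fetch("a", 0.2),
        fetch("b", 0.1),
        fetch("c", 0.15),
    )

results = asyncio.run(main())
print(results)
```

All three waits overlap on one thread, so no locks are needed between the await points.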
4. Thread pool and best practices
4.1 Using ThreadPoolExecutor (recommended)
Creating and managing a large number of threads by hand is tedious (controlling the number of threads, reclaiming resources, handling exceptions). Python 3.2+ provides concurrent.futures.ThreadPoolExecutor, which takes care of these things for us.
Basic usage
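A minimal sketch of the two basic entry points, submit() and map(). The fetch_length() task is hypothetical, and time.sleep() stands in for I/O.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_length(word):
    time.sleep(0.05)  # stands in for blocking I/O
    return len(word)

words = ["alpha", "be", "gam"]

# Leaving the with block waits for pending tasks and shuts the pool down
with ThreadPoolExecutor(max_workers=3) as executor:
    future = executor.submit(fetch_length, "threading")  # returns a Future immediately
    lengths = list(executor.map(fetch_length, words))    # results in input order

single = future.result()  # the task's return value
print(single, lengths)
```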
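Getting results as they complete (not in submission order)
executor.map() returns results in the order the tasks were submitted. If you don't need that order and want each result as soon as its task finishes, submit the tasks individually and iterate over concurrent.futures.as_completed().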
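A sketch of as_completed(), assuming a hypothetical slow_square() task whose delay differs per job so that completion order differs from submission order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def slow_square(n, delay):
    time.sleep(delay)  # stands in for I/O of varying duration
    return n * n

jobs = [(1, 0.3), (2, 0.1), (3, 0.2)]

with ThreadPoolExecutor(max_workers=3) as executor:
    # Map each Future back to its input so results can be identified
    futures = {executor.submit(slow_square, n, d): n for n, d in jobs}
    completed = []
    for fut in as_completed(futures):  # yields each Future as it finishes
        completed.append((futures[fut], fut.result()))

print(completed)
```

With these delays the job for 2 typically finishes first and the job for 1 last, but the exact order is up to the scheduler.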
4.2 Best practices for multi-threaded programming
- Avoid shared global variables where possible: when state must be shared, protect it with a lock, and keep the locked region as small as possible (locking an entire loop effectively serializes the threads and wastes the concurrency)
- Prefer thread-safe data structures: for example queue.Queue (a thread-safe queue, well suited to communication between threads)
- Use thread-local storage: each thread keeps its own independent copy of the data, which avoids sharing (see the next section)
- Handle thread exceptions properly: an exception in a child thread is not propagated to the main thread by default. Catch it inside the task function, or retrieve it through the Future's result() or exception() method
- Don't overuse multithreading: when there are only a few tasks, or the work is purely CPU-bound, multithreading may actually be slower
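The exception-handling point can be sketched as follows. The risky() task and the outcomes dictionary are hypothetical. Future.exception() waits for the task and returns the exception it raised, or None if it succeeded, while Future.result() would re-raise that exception in the calling thread.

```python
from concurrent.futures import ThreadPoolExecutor

def risky(n):
    if n == 2:
        raise ValueError(f"bad input: {n}")
    return n * 10

with ThreadPoolExecutor(max_workers=2) as executor:
    futures = {n: executor.submit(risky, n) for n in range(4)}

outcomes = {}
for n, fut in futures.items():
    err = fut.exception()  # None if the task finished without raising
    if err is not None:
        outcomes[n] = f"failed: {err}"
    else:
        outcomes[n] = fut.result()

print(outcomes)
```

Without one of these calls the ValueError would stay stored in the Future and never surface in the main thread.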
5. Thread local storage
If each thread needs to keep its own temporary data (such as a database connection or a user session), you can create a thread-local storage object with threading.local(). Each thread can then read and write attributes on it independently, without seeing the values set by other threads.
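A minimal sketch of thread-local storage. The handle() function and the user attribute are hypothetical. The Barrier is there only to make sure both threads have written their value before either reads it back.

```python
import threading

local_data = threading.local()
barrier = threading.Barrier(2)  # makes both threads write before either reads
seen = {}

def handle(user):
    local_data.user = user      # stored separately for each thread
    barrier.wait()              # by now the other thread has set its own value
    seen[threading.current_thread().name] = local_data.user

threads = [
    threading.Thread(target=handle, args=(u,), name=f"t-{u}")
    for u in ("alice", "bob")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_has_user = hasattr(local_data, "user")  # the main thread never set it
print(seen, main_has_user)
```

Each thread reads back its own value even though both assigned to the same attribute name, and the main thread sees no user attribute at all.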
6. Multi-threading vs. multi-process selection guide
7. Summary of modern Python concurrent programming
- I/O-bound, low concurrency: use ThreadPoolExecutor directly
- I/O-bound, high concurrency: prefer asyncio (single-threaded and cooperative, with no thread-switching overhead)
- CPU-bound: use multiprocessing or ProcessPoolExecutor
- Complex scenarios: for example, when distributed execution is required, use a task queue such as Celery
Newer Python versions also provide more powerful concurrency tools (such as asyncio.TaskGroup, added in Python 3.11, and more flexible timeout control), so developers can choose the model that fits their specific needs.

