Use dict and set
Among Python's built-in data structures, lists and tuples handle ordered sequences well. In real development, though, we often need to look up data quickly by a unique identifier, deduplicate in bulk, or efficiently check whether a member exists. In these scenarios, the dictionary (dict) and the set (set) are an indispensable pair. Their shared secret is that both are built on hash tables, which lets many operations complete in near-constant time on average.
Dictionary (dict) basics
A dictionary is a typical key-value mapping container. You can think of it as an index of a book: keywords (keys) correspond to page numbers (values). In other languages it is also called a hash table, map or associative array.
Basic addition, deletion, modification and query
The syntax of a dictionary is intuitive: it is wrapped in a pair of curly braces and contains `key: value` pairs separated by commas.
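As a minimal sketch of the four basic operations (the names and scores below are made-up sample data, not from the original text):

```python
# Hypothetical sample data: student names mapped to scores.
scores = {"Alice": 95, "Bob": 82}

# Query: look up a value by its key.
alice_score = scores["Alice"]

# Add: assigning to a new key inserts a new pair.
scores["Carol"] = 78

# Modify: assigning to an existing key overwrites the old value.
scores["Bob"] = 88

# Delete: remove a key together with its value.
del scores["Alice"]

# A membership test checks the keys, not the values.
has_bob = "Bob" in scores

print(alice_score)  # 95
print(scores)       # {'Bob': 88, 'Carol': 78}
print(has_bob)      # True
```

Note that `scores["Alice"]` raises `KeyError` once the key has been deleted, so check with `in` first when a key may be missing.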
The core of efficiency: Hash table principle
Why can dictionaries achieve near-constant-time lookup, insertion, and deletion? It can be understood in three simplified steps:
- Run the built-in `hash()` function on the input `key` to compute a fixed-length hash value (a kind of fingerprint for the key; distinct keys can occasionally share one, and the dictionary resolves such collisions internally).
- Map this hash value to a position index in the internal array (like looking up a keyword in the index and turning directly to the corresponding page).
- Use that index to go directly to the memory location where the `value` is stored.
Whether the dictionary holds 10 elements or 100,000, a lookup never has to walk through the entries one by one, so its speed is hardly affected by the amount of data.
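The three steps can be sketched with a toy table. This is only an illustration under simplifying assumptions: `NUM_SLOTS`, `slot_for`, `toy_put`, and `toy_get` are hypothetical names, collisions are ignored, and CPython's real `dict` is considerably more elaborate.

```python
# A toy illustration of the three steps, not how CPython implements dict.
NUM_SLOTS = 8  # hypothetical size of the internal array

slots = [None] * NUM_SLOTS


def slot_for(key):
    """Steps 1 and 2: hash the key, then map the hash to an array index."""
    return hash(key) % NUM_SLOTS


def toy_put(key, value):
    # Step 3: store the pair directly at the computed position.
    # Collisions are deliberately ignored to keep the sketch short.
    slots[slot_for(key)] = (key, value)


def toy_get(key):
    # Jump straight to the computed position; no scan over all slots.
    entry = slots[slot_for(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    raise KeyError(key)


toy_put(42, "answer")
print(toy_get(42))  # answer
```

The point of the sketch is that `toy_get` does the same small amount of work no matter how many slots are filled.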
Commonly used advanced operations
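A few operations that commonly go beyond the basics are sketched below; the selection and the `config` sample data are assumptions for illustration, not a list taken from the original text.

```python
config = {"host": "localhost", "port": 8080}  # hypothetical sample data

# get() returns a default instead of raising KeyError for a missing key.
timeout = config.get("timeout", 30)

# setdefault() inserts the default only if the key is absent,
# then returns the value stored under that key.
mode = config.setdefault("mode", "dev")

# update() merges another mapping; existing keys are overwritten.
config.update({"port": 9090, "debug": True})

# pop() removes a key and returns its value.
debug = config.pop("debug")

# items() yields (key, value) pairs for iteration.
pairs = [f"{key}={value}" for key, value in config.items()]

# A dict comprehension builds a new dictionary from an iterable.
squares = {n: n * n for n in range(1, 4)}

print(pairs)    # ['host=localhost', 'port=9090', 'mode=dev']
print(squares)  # {1: 1, 2: 4, 3: 9}
```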
Three core features that must be remembered
- Insertion order is preserved, but don't rely on it entirely: Starting with Python 3.7, the language specification guarantees that dictionaries preserve the insertion order of keys. When you need compatibility with older versions or need to adjust the order flexibly (such as moving an element to the beginning), use `collections.OrderedDict`.
- Keys must be unique: If the same key is assigned repeatedly, the later value overwrites the earlier one, so a dictionary never contains duplicate keys.
- Keys must be immutable objects: The hash table requires a key's hash value to stay stable, so mutable objects such as lists and ordinary sets cannot be used as dictionary keys. Numbers, strings, and tuples (provided every element inside the tuple is itself immutable) are all valid keys.
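The three features above can be checked with a short sketch (variable names are illustrative only):

```python
# 1. Insertion order is preserved (guaranteed since Python 3.7).
d = {}
d["b"] = 1
d["a"] = 2
order = list(d)
print(order)  # ['b', 'a']

# 2. Keys are unique: assigning to an existing key overwrites its value.
d["b"] = 99
print(d)  # {'b': 99, 'a': 2}

# 3. Keys must be hashable: a tuple of immutables works, a list does not.
points = {(0, 0): "origin"}
try:
    bad = {[0, 0]: "origin"}
except TypeError as exc:
    error_message = str(exc)  # the message mentions an unhashable list

# A tuple that contains a list is also rejected.
try:
    hash((1, [2, 3]))
    tuple_with_list_ok = True
except TypeError:
    tuple_with_list_ok = False
print(tuple_with_list_ok)  # False
```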
Set (set) basics
A set can be understood as a dictionary that stores only keys and no values. It is also built on a hash table, so it is naturally suited to deduplication and mathematical set operations.
Basic usage and operations
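A minimal sketch of creating a set, changing it, and running the usual set operations (the sample values are made up):

```python
# Duplicates collapse automatically when a set is created.
fruits = {"apple", "banana", "apple"}
print(len(fruits))  # 2

# An empty set must be written set(); {} creates an empty dict.
empty = set()

# add() inserts an element; discard() removes one without raising
# an error if it is absent (remove() would raise KeyError).
fruits.add("cherry")
fruits.discard("banana")
fruits.discard("not-there")

a = {1, 2, 3}
b = {3, 4}

union = a | b          # elements in either set
intersection = a & b   # elements in both sets
difference = a - b     # elements in a but not in b
symmetric = a ^ b      # elements in exactly one of the sets

print(sorted(union))         # [1, 2, 3, 4]
print(sorted(intersection))  # [3]
print(sorted(difference))    # [1, 2]
print(sorted(symmetric))     # [1, 2, 4]
```

`sorted()` is used for printing only because a set has no guaranteed display order.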
💡 Tip: Set elements must also be immutable objects. You can put strings, numbers, and tuples into a set, but not lists or other ordinary sets.
Three major characteristics of sets
- Unordered: Do not rely on the order of elements in a set; the iteration order is not guaranteed and can differ between runs.
- Unique: Duplicate elements are removed automatically, which is one of the most practical features of a set.
- Elements must be immutable: Like dictionary keys, set elements need a stable hash value.
Key prerequisite: mutable and immutable objects
Both dictionary keys and set elements are required to be immutable. To understand why, we first need to distinguish between two kinds of data in Python.
Mutable objects (Mutable)
After creation, the object's contents can be modified in place, while the object's identity (its address) stays the same:
- List (list), dictionary (dict), ordinary set (set)
Immutable objects (Immutable)
After creation, the content of the object cannot be modified directly. Any "modification" operation will generate a new object:
- String (str), number (int/float/bool), tuple (tuple), frozen set (frozenset)
🔑 **Why is this so important?** If a dictionary key could change, its hash value would no longer match its storage location, so the data could not be found and the structure itself could be corrupted. Therefore, Python refuses to accept its built-in mutable objects (such as lists, dicts, and sets) as dictionary keys or set elements.
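The distinction between the two kinds of objects can be observed directly with `id()` and `hash()`; the snippet below is a small sketch with illustrative variable names.

```python
# Mutable: the list changes in place and keeps the same identity.
nums = [1, 2]
identity_before = id(nums)
nums.append(3)
same_object = id(nums) == identity_before
print(same_object)  # True

# Immutable: a "modification" produces a new object; the original is untouched.
text = "abc"
longer = text + "d"
print(text, longer)    # abc abcd
print(text is longer)  # False

# Immutable built-ins are hashable; lists are not.
tuple_hash_stable = hash((1, 2)) == hash((1, 2))
try:
    hash([1, 2])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)  # False

# frozenset is the immutable counterpart of set, so it can be nested in a set.
groups = {frozenset({1, 2}), frozenset({3})}
print(frozenset({2, 1}) in groups)  # True
```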
Practical application scenario suggestions
When to use dict?
- Key-value lookup: Look up a name by student number, or fetch a parameter by the name of a configuration item.
- Grouping and counting: For example, to count how often each word occurs in a piece of text, `word_count[word] = word_count.get(word, 0) + 1` is both concise and efficient.
- Simple caching: Temporarily store computed results in a dictionary and retrieve them by key next time, taking full advantage of its near-constant-time lookup.
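The counting and caching scenarios might look like the following sketch. The sample sentence and the `cached_square` helper are hypothetical; the multiplication merely stands in for an expensive computation.

```python
# Counting: tally how often each word occurs.
text = "the quick fox jumps over the lazy dog the end"
word_count = {}
for word in text.split():
    word_count[word] = word_count.get(word, 0) + 1
print(word_count["the"])  # 3

# Caching: remember results so repeated calls skip the computation.
cache = {}
computations = 0


def cached_square(n):
    global computations
    if n in cache:           # fast dictionary lookup
        return cache[n]
    computations += 1        # count how often the real work is done
    result = n * n           # stand-in for an expensive computation
    cache[n] = result
    return result


print(cached_square(12))  # 144
print(cached_square(12))  # 144, served from the cache
print(computations)       # 1
```

For real projects, `collections.Counter` and `functools.lru_cache` in the standard library cover these two patterns with less code.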
When to use set?
- Batch deduplication: Extracting all unique IDs from a massive list of user IDs takes a single line, `set(id_list)`.
- Efficient membership tests: When checking whether an IP is on a blacklist, a set lookup takes roughly constant time on average, while a list must be scanned element by element, so the gap widens as the list grows.
- Set operations: Quickly find the friends two people have in common (intersection), the visitors who are new to a platform today (difference), and so on.
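A brief sketch of the three scenarios above, using invented IDs, IP addresses, and names:

```python
# Batch deduplication: one line turns a list with repeats into unique IDs.
id_list = [101, 102, 101, 103, 102]
unique_ids = set(id_list)
print(sorted(unique_ids))  # [101, 102, 103]

# Membership test: check whether an IP is on the blacklist.
blacklist = {"10.0.0.1", "192.168.1.9"}
is_blocked = "10.0.0.1" in blacklist
print(is_blocked)  # True

# Intersection: friends two people have in common.
alice_friends = {"Bob", "Carol", "Dave"}
bob_friends = {"Carol", "Dave", "Eve"}
common_friends = alice_friends & bob_friends
print(sorted(common_friends))  # ['Carol', 'Dave']

# Difference: visitors seen today but not yesterday.
today = {"u1", "u2", "u3"}
yesterday = {"u2"}
new_visitors = today - yesterday
print(sorted(new_visitors))  # ['u1', 'u3']
```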
Performance experience (analogy with examples in life)
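For a rough feel of the difference between a list scan and a set lookup, the following timing sketch can be run locally. The sizes and repeat counts are arbitrary assumptions, and the absolute numbers depend on the machine, so no specific timings are claimed here.

```python
import timeit

n = 100_000
id_list = list(range(n))
id_set = set(id_list)

# Worst case for the list: the value is absent, so every element is checked.
missing = -1

list_time = timeit.timeit(lambda: missing in id_list, number=50)
set_time = timeit.timeit(lambda: missing in id_set, number=50)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

Increasing `n` should make the list time grow roughly in proportion, while the set time stays about the same.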
Frequently asked questions
**Q1: Why can't a list be used as a dictionary key or a set element?**
Because lists are mutable objects. Suppose lists were allowed as keys: we first store `d = {[1, 2]: "A"}` and then accidentally modify the contents of that list. Its hash value would change, the stored value could no longer be found, and the internal structure would be left inconsistent. If you need to use a list-like sequence as a key, you can convert it to a tuple, since tuples are immutable.
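The tuple conversion can be sketched as follows (variable names are illustrative):

```python
coords = [1, 2]

# Using the list itself as a key is rejected with TypeError.
try:
    d = {coords: "A"}
    list_key_rejected = False
except TypeError:
    list_key_rejected = True
print(list_key_rejected)  # True

# Converting the list to a tuple gives a hashable, immutable key.
d = {tuple(coords): "A"}
print(d[(1, 2)])  # A

# Later changes to the original list do not affect the stored key.
coords.append(3)
print((1, 2) in d)  # True
```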
**Q2: Since dictionaries in Python 3.7+ already maintain insertion order, do you still need `OrderedDict`?**
This depends on the scenario:
- If you just want the traversal order to match the insertion order, an ordinary `dict` is enough.
- But if you need to move an existing key to the front or to the end, or the code must be compatible with Python 3.6 and earlier, you should use `collections.OrderedDict`.
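The reordering capability mentioned above is provided by `OrderedDict.move_to_end()` and `OrderedDict.popitem(last=...)`; a short sketch with made-up keys:

```python
from collections import OrderedDict

od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

# Move an existing key to the end.
od.move_to_end("a")
print(list(od))  # ['b', 'c', 'a']

# Move an existing key to the front with last=False.
od.move_to_end("c", last=False)
print(list(od))  # ['c', 'b', 'a']

# popitem(last=False) removes and returns the first pair.
first = od.popitem(last=False)
print(first)     # ('c', 3)
print(list(od))  # ['b', 'a']
```

An ordinary `dict` has no `move_to_end()` method, which is the main practical reason to reach for `OrderedDict` on modern Python versions.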
Once you have mastered dictionaries and sets, you have a powerful tool for most fast-lookup and deduplication scenarios in Python. As long as you remember that keys and elements must be immutable, you can avoid the common pitfalls and use them with ease.

