Python sorting guide: sorted() function full coverage from entry to advanced usage

0. Write in front

Sorting is an operation that almost all developers use every day - whether it is organizing data reports, filtering Top N results, or beautifying list output, efficient and flexible sorting tools can be of great help. Python does not let us hand-write these bubbling, fast, and merging algorithms. Instead, it directly builds in the extremely optimized stable sorting algorithm Timsort (default implementation of Python 2.3+), and encapsulates it into the protagonist we are going to talk about today:sorted()

Next, we will step by step from the most basic integer/string sorting to complex objects, multi-level sorting, and performance optimization scenarios, demonstrating the entire process with reproducible code examples.


1. Basic introduction: sort the sequence directly

sorted()The core advantage lies in its strong versatility - it can accept strings, lists, tuples, generators, etc. all iterable objects, and always returns a new sorted list (the original input will not be modified, very safe!).

1.1 Integer/floating point sorting

By default, the values ​​are sorted from small to large (ascending order):

>>> sorted([36, 5, -12, 9, -21])   # 列表
[-21, -12, 5, 9, 36]

>>> sorted((2.5, 0.3, -7.2, 9.9))  # 元组
[-7.2, 0.3, 2.5, 9.9]

>>> sorted(x for x in range(10, 0, -2))  # 生成器
[2, 4, 6, 8, 10]

2. Advanced core: usekeyCustom sorting rules

If "default sorting" is an entry-level operation, then**keyThe parameter** issorted()The soul - it allows us to pass in a function, which will act on each element of the input to generate a "sort by value", and finallysorted()Only the size of these basis values ​​will be compared.

2.1 Simple numerical deformation sorting

For example, sort by absolute value in ascending order:

>>> sorted([36, 5, -12, 9, -21], key=abs)  # key=abs 表示用绝对值做依据
[5, 9, -12, -21, 36]

2.2 Avoiding pitfalls and optimization of string sorting

⚠️ Pitfalls: Sort by ASCII/Unicode code value by default

The Unicode code value of English uppercase and lowercase letters is that uppercase (A-Z: 65-90) is smaller than lowercase (a-z: 97-122), so direct sorting will put all words starting with uppercase at the front:

>>> sorted(['bob', 'about', 'Zoo', 'Credit'])
['Credit', 'Zoo', 'about', 'bob']

✅ Optimization: Ignore case sorting

Givekeypassstr.lower()orstr.upper()That’s it (choosing lower is more general, such as handling the upper and lower case of some non-English characters):

>>> sorted(['bob', 'about', 'Zoo', 'Credit'], key=str.lower)
['about', 'bob', 'Credit', 'Zoo']

3. Commonly used additions:reverse=TrueSort descending

Want to sort from largest to smallest? Add directlyreverse=TrueParameters, simple and direct:

>>> sorted([36, 5, -12, 9, -21], reverse=True)
[36, 9, 5, -12, -21]

>>> sorted(['bob', 'about', 'Zoo', 'Credit'], key=str.lower, reverse=True)
['Zoo', 'Credit', 'bob', 'about']

4. Advanced application: complex object sorting

During development, we rarely only sort pure integers/pure strings, but more complex data such as tuple lists, dictionary lists, and custom class instance lists. At this timekeyThe advantages are even more obvious.

First prepare a list of student tuples as test data:

students = [('Bob', 75), ('Adam', 92), ('Bart', 66), ('Lisa', 88), ('Alice', 75)]

4.1 Sorting tuple lists

Option 1: Customize ordinary functions

For example, sort by name in ascending order (ignoring case):

def get_student_lower_name(student):
    return student[0].lower()  # 取元组的第0个元素(名字)转小写

sorted_students_name = sorted(students, key=get_student_lower_name)
print(sorted_students_name)
# 输出: [('Adam', 92), ('Alice', 75), ('Bart', 66), ('Bob', 75), ('Lisa', 88)]

Option 2: Simplify with lambda expression

ifkeyThe function logic is very simple and can be written in one line. There is no need to define it separately, just uselambdaAnonymous functions are cleaner:

# 按名字升序(忽略大小写)
sorted_students_name = sorted(students, key=lambda s: s[0].lower())

# 按成绩升序
sorted_students_score = sorted(students, key=lambda s: s[1])
print(sorted_students_score)
# 输出: [('Bart', 66), ('Bob', 75), ('Alice', 75), ('Lisa', 88), ('Adam', 92)]

Option 3: UseoperatorModules improve performance

operatorThe module providesitemgetter()attrgetter()This type of function is specially used to quickly obtain the index value and attribute value of an object. It is 10%-20% faster than lambda expressions (more obvious for large data sets):

from operator import itemgetter

# 按名字升序(默认区分大小写;要忽略大小写仍需配合 lower)
sorted_students_name = sorted(students, key=itemgetter(0))

# 按成绩升序
sorted_students_score = sorted(students, key=itemgetter(1))

4.2 Multi-level sorting

What should you do if you encounter the requirement that "the first priority is in descending order by grades, and the second priority is in ascending order by name"?

In fact, it's very simple - using the bit-by-bit comparison feature of Python tuples (compare the 0th bit first, compare the 1st bit if equal, and so on), givekeyJust return a tuple containing multi-level evidence:

# 第一级:成绩降序 → 可以用 -成绩(仅适用于数值型)
# 第二级:名字升序 → 忽略大小写
sorted_students_multi = sorted(students, key=lambda s: (-s[1], s[0].lower()))
print(sorted_students_multi)
# 输出: [('Adam', 92), ('Lisa', 88), ('Alice', 75), ('Bob', 75), ('Bart', 66)]

useitemgetter()Multi-level sorting can also be implemented (but it can only handle priorities in the same direction. To mix ascending and descending order, you still need to combine lambda or separate reverse):

# 先按成绩升序,再按名字升序
sorted_students_multi_itemgetter = sorted(students, key=itemgetter(1, 0))

5. Performance and stability tips

Although Python's Timsort is already very fast, there are still a few points to pay attention to when processing millions of data:

5.1 Try to useoperatormodule functions

As mentioned earlier,itemgetter()/attrgetter()Faster than lambdas because they are implemented in C and have less interpreter overhead.

5.2 Try to avoid double countingkey

If the same data set needs to be sorted multiple times, don't letsorted()Recalculate each element'skey——Can be precalculated firstkey, save it as a tuple list, and then extract the original data after sorting (this is called the "decoration-sorting-de-decoration" mode. In fact, Timsort has been optimized internally, but it may be more obvious to deal with extremely large lists manually).

5.3 Taking advantage of the stability of Timsort

Python's sorting is stable sorting - if two elementskeyCompletely equal, their relative positions after sorting are consistent with the original list. For example, in the previous test data, both Alice and Bob scored 75. Bob was first in the original list, but the second priority was by name when sorting, so Alice came to the front; if the second priority was removed, the relative position of the original list would be retained:

sorted_students_stable = sorted(students, key=itemgetter(1))
print(sorted_students_stable)
# 输出: [('Bart', 66), ('Bob', 75), ('Alice', 75), ('Lisa', 88), ('Adam', 92)]  # Bob还在Alice前面

6. Summary

Pythonsorted()It is a built-in sorting function with comprehensive functions, excellent performance and friendly interface. The core points are as follows:

  1. High versatility: accept all iterable objects and return a new list (without modifying the original input)
  2. Soul parameterskey: Pass in the function to customize the sorting basis, supportedlambdasimplify,operatorspeed up
  3. Additional parametersreverse=True: Quickly implement descending order
  4. Multi-level sorting: Using the bit-by-bit comparison feature of Python tuples
  5. Stability: equalkeyThe elements retain their original relative positions

Once you master these, 99% of the sorting requirements in daily development can be easily solved ~