title: filter description: filter() is Python's built-in higher-order function for filtering the elements of an iterable. It takes a function and an iterable as arguments and returns an iterator (in Python 3) that yields every element for which the function returns a truthy value.

Python filter() function: detailed explanation and practical cases

In Python's functional programming toolbox, filter() is one of the core tools for filtering sequences: it does not modify elements, it only keeps or drops them according to a rule. Together with map(), it forms a complementary pair of commonly used higher-order functions.

1. filter() function basics

Core definition

filter() is a built-in higher-order function in Python that takes two arguments:

  1. A predicate function: returns True or False (or any object that is implicitly converted to a Boolean; for example, the empty string, 0, and None all count as False)
  2. An iterable: a list, tuple, generator, string, and so on
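
The truthiness rule can be seen in a small sketch. The sample data here is made up for illustration. Passing None as the function is standard behavior: filter() then keeps the elements that are themselves truthy.

```python
# Hypothetical sample data mixing truthy and falsy values.
values = [0, 1, "", "a", None, [], [0]]

# With None as the function, filter() keeps elements that are themselves truthy.
truthy = list(filter(None, values))
print(truthy)  # [1, 'a', [0]]

# A predicate may return any object; only its truth value matters.
has_length = list(filter(lambda s: len(s), ["", "x", "yz"]))
print(has_length)  # ['x', 'yz']
```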

In Python 3, filter() does not return the filtered sequence directly. It returns a lazy iterator: results are computed only when the iterator is actually traversed (for example with next(), a for loop, or conversion to a list or tuple), which can save a significant amount of memory on large data sets.
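
One practical consequence of returning an iterator is that the result can be consumed only once. A minimal sketch (the variable names are illustrative):

```python
numbers = [1, 2, 3, 4]
evens = filter(lambda n: n % 2 == 0, numbers)

print(type(evens).__name__)  # filter

first_pass = list(evens)   # consumes the iterator
second_pass = list(evens)  # nothing left to yield
print(first_pass, second_pass)  # [2, 4] []
```

If the filtered values are needed more than once, convert them to a list once and reuse the list.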

Basic syntax

filter(function, iterable)

Core differences from map()

A table makes the comparison easier to see:

| Higher-order function | Core function | Return value (Python 3) | Example scenarios |
| --- | --- | --- | --- |
| map() | Transforms every element | Iterator of transformed elements | Squaring numbers, converting strings to uppercase |
| filter() | Keeps the elements that satisfy a condition | Iterator of the original elements | Keeping odd numbers, dropping invalid data |
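
The difference can be sketched side by side on the same input (sample data chosen for illustration): map() returns one output per input, while filter() returns a subset of the unchanged inputs.

```python
numbers = [1, 2, 3, 4, 5]

squared = list(map(lambda n: n * n, numbers))       # every element transformed
odds = list(filter(lambda n: n % 2 == 1, numbers))  # subset of the original elements

print(squared)  # [1, 4, 9, 16, 25]
print(odds)     # [1, 3, 5]

# The two compose naturally: filter first, then transform.
odd_squares = list(map(lambda n: n * n, filter(lambda n: n % 2 == 1, numbers)))
print(odd_squares)  # [1, 9, 25]
```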

2. Getting started: two simple scenarios

Scenario 1: Keep the odd numbers in a list

We start with an explicit named function so the predicate logic is easy to follow:

def is_odd(n):
    return n % 2 == 1  # remainder 1 → odd → keep

numbers = [1, 2, 4, 5, 6, 9, 10, 15]
# list() turns the lazy iterator into a list we can print
filtered_numbers = list(filter(is_odd, numbers))
print(filtered_numbers)  # Output: [1, 5, 9, 15]

Scenario 2: Drop invalid strings (None, empty, whitespace-only)

Scraped data and form input often contain empty values that need to be cleaned up. filter() handles this easily:

def is_valid_str(s):
    # `s and ...` short-circuits on None and ''.
    # s.strip() removes leading/trailing whitespace; an empty result is falsy → dropped.
    return bool(s and s.strip())

raw_data = ['A', '', 'B', None, 'C', '   ', 'D']
valid_data = list(filter(is_valid_str, raw_data))
print(valid_data)  # Output: ['A', 'B', 'C', 'D']

3. Advanced features: the appeal of lazy evaluation

Why was filter() changed to return an iterator in Python 3? The core reason is memory optimization: there is no need to load all matching elements into memory at once.

For example, we can observe the iterator's execution step by step:

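# Step 1: create the filter() iterator
# At this point the lambda has not been called yet!
numbers = [3, 6, 2, 8, 10]
# print() returns None, so `or` falls through to the real test x > 5
filtered = filter(lambda x: print(f"Checking: {x}") or x > 5, numbers)

# Step 2: pull elements one at a time with next()
print("First:", next(filtered))   # prints "Checking: 3" (skipped), then "Checking: 6" → returns 6
print("Second:", next(filtered))  # prints "Checking: 2" (skipped), then "Checking: 8" → returns 8
print("Third:", next(filtered))   # prints "Checking: 10" → returns 10

# Step 3: one more next() would raise StopIteration (the iterator is exhausted)
# print(next(filtered))  # uncommenting this line raises StopIteration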

Tip: This compute-on-demand lazy evaluation is particularly useful when processing large files, database cursors, or infinite sequences. You do not need to wait for all the data to be ready before you start processing.
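
As a minimal sketch of the infinite-sequence case, filter() can wrap itertools.count(), which never ends, as long as something downstream limits how many results are pulled. The divisor 7 and the limit of five are arbitrary choices for illustration.

```python
from itertools import count, islice

# count(1) yields 1, 2, 3, ... forever; filter() wraps it without consuming it.
multiples_of_7 = filter(lambda n: n % 7 == 0, count(1))

# islice() pulls only the first five matches, so the work stops there.
first_five = list(islice(multiples_of_7, 5))
print(first_five)  # [7, 14, 21, 28, 35]
```

Calling list() directly on multiples_of_7 would never return, because the underlying sequence has no end.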


4. Everyday shortcut: combining with lambda expressions

If the predicate logic is very simple and does not need to be reused, there is no need to write a separate named function. lambda + filter() is a natural pairing:

# Keep the even numbers
numbers = [1, 2, 3, 4, 5, 6]
evens = list(filter(lambda x: x % 2 == 0, numbers))
print(evens)  # Output: [2, 4, 6]

# Keep the words with length >= 5
words = ["apple", "banana", "cat", "dog", "elephant"]
long_words = list(filter(lambda w: len(w) >= 5, words))
print(long_words)  # Output: ['apple', 'banana', 'elephant']

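5. Classic practical case: Sieve of Eratosthenes (an infinite prime generator)

This is a classic example of combining filter() with an infinite generator. It produces primes lazily, on demand, without ever materializing the candidate sequence in memory. Note that each prime found adds one more filter() layer, so the cost per candidate grows as the sequence advances.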

def infinite_primes():
    """Generator that yields prime numbers indefinitely."""
    # Step 1: infinite sequence of odd numbers starting at 3
    # (no even number other than 2 is prime, so skip them up front)
    def _odd_iter():
        n = 1
        while True:
            n += 2
            yield n

    # Step 2: build a predicate that tests "not divisible by n"
    def _not_divisible(n):
        return lambda x: x % n != 0

    # Yield 2 first, the only even prime
    yield 2
    # Initialize the odd-number generator
    odd_seq = _odd_iter()
    while True:
        # The first number of the current sequence is always prime
        current_prime = next(odd_seq)
        yield current_prime
        # Use filter() to drop multiples of current_prime → a new infinite sequence
        odd_seq = filter(_not_divisible(current_prime), odd_seq)

# Test: print the primes below 100
print("Primes below 100:")
for p in infinite_primes():
    if p < 100:
        print(p, end=' ')
    else:
        break
# Output: 2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97

Key point: the process works like sifting through stacked sieves. The first layer removes the even numbers (except 2), the second removes multiples of 3, the third removes multiples of 5, and so on. Each call to filter() adds a new layer of filtering, and the infinite sequence is progressively sieved clean.
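
To take a fixed number of primes rather than stop at an upper bound, itertools.islice can be applied to the generator. The sketch below restates the layered-filter idea compactly so that it runs on its own; the name primes() and the default-argument binding are illustrative choices, not part of the code above.

```python
from itertools import count, islice

def primes():
    """Compact restatement of the layered-filter prime generator."""
    candidates = count(2)  # 2, 3, 4, ...
    while True:
        p = next(candidates)
        yield p
        # Bind p as a default argument so each layer keeps its own divisor.
        candidates = filter(lambda x, p=p: x % p != 0, candidates)

first_ten = list(islice(primes(), 10))
print(first_ten)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The `p=p` default plays the same role as the _not_divisible() factory: a plain `lambda x: x % p != 0` would look up p when called, so every layer would end up testing against the most recent prime.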


6. Modern Python alternative: list comprehensions and generator expressions

filter() is a classic, but for simple one-off filtering, a list comprehension or generator expression is usually more readable (in line with Python's "readability counts" principle):

numbers = [3, 6, 2, 8]

# Option 1: filter() + lambda
result1 = list(filter(lambda x: x > 5, numbers))

# Option 2: list comprehension (recommended for beginners and simple cases)
result2 = [x for x in numbers if x > 5]

# Option 3: generator expression (lazy like filter(); suited to large data sets)
result3 = (x for x in numbers if x > 5)

# All three options produce the same values
print(result1, result2, list(result3))  # Output: [6, 8] [6, 8] [6, 8]

Rule of thumb: if the filtering condition is a single expression, use a comprehension; if the predicate needs several steps or will be reused in multiple places, use filter() with a named function.
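
As a sketch of the second case, here is a multi-step predicate reused across two inputs. The function is_valid_age(), its range limits, and both sample lists are hypothetical, chosen only to illustrate the pattern.

```python
def is_valid_age(raw):
    """Multi-step check: a string of digits within a plausible range (assumed 1-129)."""
    if not isinstance(raw, str):
        return False
    text = raw.strip()
    if not text.isdigit():
        return False
    return 0 < int(text) < 130

# Two hypothetical data sources sharing one predicate.
form_ages = ["42", " 7 ", "abc", None, "-3", "250"]
api_ages = ["18", "", "99"]

valid_form = list(filter(is_valid_age, form_ages))
valid_api = list(filter(is_valid_age, api_ages))
print(valid_form)  # ['42', ' 7 ']
print(valid_api)   # ['18', '99']
```

Note that filter() returns the original elements untouched, so ' 7 ' keeps its surrounding spaces; cleaning the values would be a separate map() step.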


7. Hands-on exercise: filtering palindromic numbers

A palindromic number reads the same forward and backward (for example 121 or 1331). filter() gives a quick implementation:

def is_palindrome(n):
    """Return True if n is a palindromic number."""
    s = str(n)
    return s == s[::-1]  # the slice [::-1] reverses the string

# Exercise 1: find the palindromes from 1 to 999 (1000 itself is not one)
palindromes_1k = list(filter(is_palindrome, range(1, 1000)))
print("First 15 palindromes below 1000:", palindromes_1k[:15])

# Exercise 2: simple check (do the palindromes from 1 to 199 match the expected list?)
test_data = range(1, 200)
expected = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 22, 33, 44, 55, 66, 77, 88, 99,
            101, 111, 121, 131, 141, 151, 161, 171, 181, 191]
if list(filter(is_palindrome, test_data)) == expected:
    print("✅ Palindrome filter test passed!")
else:
    print("❌ Palindrome filter test failed!")

8. Performance and usage recommendations

Performance comparison

  • Memory efficiency: filter() and generator expressions beat list comprehensions (the first two do not need to store all results up front)
  • Speed: in simple cases the difference among the three is very small; a list comprehension may be slightly faster because it avoids the extra function-call overhead
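
The speed claim is easy to check on your own machine with timeit. The sketch below is one way to set up such a measurement; the data size, the predicate, and the repeat count are arbitrary assumptions, and the timings will vary by machine and Python version, so no expected numbers are shown.

```python
import timeit

data = list(range(10_000))

def with_filter():
    return list(filter(lambda x: x % 3 == 0, data))

def with_comprehension():
    return [x for x in data if x % 3 == 0]

def with_generator():
    return list(x for x in data if x % 3 == 0)

# All three produce the same values; only the timing differs.
assert with_filter() == with_comprehension() == with_generator()

for fn in (with_filter, with_comprehension, with_generator):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds:.4f}s")
```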

Choosing by scenario

| Scenario | Recommended solution |
| --- | --- |
| Simple filtering, results used multiple times | List comprehension |
| Complex or reusable predicate logic | filter() + custom function |
| Large data sets, results traversed only once | filter() or generator expression |

9. Summary

filter() is a staple filtering tool in Python functional programming, but there is no need to reach for it by default in modern Python development: readability comes first.

Key points to review:

  1. It takes a predicate function and an iterable
  2. In Python 3 it returns a lazy iterator, which is memory friendly
  3. It combines well with lambda, named functions, and infinite generators
  4. For simple cases, prefer list comprehensions or generator expressions

10. Further reading

  • Python official documentation: Built-in Functions, filter()
  • A complementary filtering tool: itertools.filterfalse() (returns the elements for which the predicate is False)
  • The "three musketeers" of functional programming: map(), filter(), and functools.reduce() (which must be imported in Python 3)
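
A short sketch tying these tools together, using made-up sample data: filterfalse() as the mirror image of filter(), followed by a small map/filter/reduce pipeline.

```python
from functools import reduce
from itertools import filterfalse

numbers = [1, 2, 3, 4, 5, 6]

def is_even(n):
    return n % 2 == 0

evens = list(filter(is_even, numbers))      # predicate is truthy
odds = list(filterfalse(is_even, numbers))  # predicate is falsy
print(evens, odds)  # [2, 4, 6] [1, 3, 5]

# map + filter + reduce: sum of the squares of the even numbers.
total = reduce(lambda acc, n: acc + n, map(lambda n: n * n, evens), 0)
print(total)  # 56
```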