title: Interview questions in depth: operating system description: In Python interviews, language features are a must. Understanding these underlying mechanisms will not only help you cope with interviews, but also avoid writing bugs that are difficult to troubleshoot in actual development.

Although this article is called Interview Questions Explained: Operating System, in interviews with Python or even full-stack backends, the underlying system internal skills other than language features are the key to big manufacturers' screening and actual avoidance of pitfalls - for example, when you write asynchronous web services, the upper limit of the number of connected cards, non-blocking reading and writing, data loss, and multi-process program resource grabbing, 90% of the root causes can be traced to this knowledge.

Without further ado, let’s break down 7 high-frequency interview test points👇


1. I/O multiplexing: select, poll and epoll

💡 Core test points directly: Why do all Linux high-performance network applications (Redis, Nginx, and early Tornado) choose epoll?

Simply put, I/O multiplexing is the mechanism of "a process manages a bunch of I/O events at the same time": when the kernel finds that the Socket you specified has data readable, TCP connection establishment is completed, etc., it will proactively send a notification.

However, the performance and applicable scenarios of these three mechanisms are very different. It is more intuitive to look at the comparison table directly:

Featuresselectpollepoll (exclusive for Linux, the ultimate first choice)
Underlying data structureArray (fixed sizefd_set)linked list (dynamicpollfdArray)Red-black tree (manage all registered fd) + ready list (directly retrieve the results)
Maximum number of connectionsHard limit! Usually the kernel default is 1024No hard limit (affected by system memory and process fd upper limit)No hard limit (same as poll)
Traversal efficiencyEvery call must fully scan all registered fd, regardless of whether there is an eventLike select, full pollingThe kernel directly inserts the fd that triggered the event into the ready list, and the returned result is a valid result, no need to find it yourself
Memory copy overheadEach call must transfer the user spacefd_setCopy the full amount to the kernel, and then copy it backLike select, full two-way copyUse mmap shared memory to avoid repeated copies between user mode and kernel mode

Brief summary: select and poll are both "stupid methods". They poll all fds and cannot handle multiple CPU connections. However, epoll uses red-black trees for efficient management, only returns ready fds, and uses mmap to reduce copies, which is perfect for high-concurrency scenarios.


2. Process scheduling algorithm

💡 Core test points directly: How does the operating system "accurately assign troops" to the CPU?

Process scheduling is the core module of "Multiple processes compete for the CPU, and the OS acts as the referee to allocate time slices". Common classic/practical algorithms are sorted out according to timeline + complexity:

  1. First come, first served (FCFS) The most primitive algorithm queues in the order of arrival in the CPU ready queue. The shortcoming is fatal: Long mission exclusivity will hurt short missions - Just like when you are picking up tickets at a movie theater, and there is a team building group in front of you who buys 100 tickets, you have to wait for 10 minutes to buy 1 ticket. This is called the "convoy effect".

  2. Short Jobs First (SJF) Theoretically, the average waiting time is the shortest, but there is a fatal problem - the task duration needs to be known in advance, and it is difficult to completely reproduce it in the actual system.

  3. Time Slice Rotation (RR) Basic scheduling ideas for modern desktop/server OS! Allocate a fixed small time slice (such as 10 to 100ms) to each ready process, and force it to be queued to the end of the queue after it is used up, even if the task has not been completed. This ensures that each short task can receive a quick response and will not be "starved" by long tasks.

  4. Multi-Level Feedback Queue (MLFQ) The ultimate solution for modern OS that actually works! Combining the advantages of all previous algorithms:

  • Set up multiple queues with different priorities. The higher the priority, the smaller the time slice;
  • New processes enter the highest priority queue by default;
  • If the time slot is used up and the task is not completed, it will be downgraded to one level;
  • Actively give up the CPU (such as waiting for I/O) and retain the current priority;
  • Periodically "raise" processes in the low-priority queue to high-priority to prevent starvation.

3. Deadlock

💡 Core test points directly: Four "one without one" conditions for deadlock + how to break the situation?

Deadlock is a deadlock in which "multiple processes occupy the resources that each other needs, and no one can move". For example, when two people are eating, A takes chopsticks A and waits for spoon B, and B takes spoon B and waits for chopsticks A. They are just frozen.

Four necessary conditions

Note that this is a necessary condition - it does not necessarily lead to a deadlock, but all deadlocks must be met:

  1. Mutually exclusive condition: Resources can only be used by one process at a time (for example, two people cannot hold chopsticks at the same time).
  2. Request and retention conditions: Hold on to the resources in your hand and request new resources.
  3. No deprivation condition: You cannot forcibly take away resources from others, you can only wait for the other party to release them voluntarily.
  4. Loop waiting condition: A "circular resource request chain" is formed between processes (A waits for B, B waits for C, C waits for A).

Breaking Strategy

The game can be broken by breaking any one of the four conditions:

  • Deadlock prevention: The most direct but relatively "violent", such as stipulating that all resources must be requested in order (destroying loop waiting), or requesting all resources at once when the process starts (destroying requests and holds).
  • Avoid Deadlock: A more "mild" solution, such as the classic Banker's Algorithm - before each allocation of resources, simulate "whether the system still has enough resources after allocation to allow all processes to complete successfully", and if not, reject the request.

4. Program construction: from C source code to executable binary

💡 Core test points directly: What is done in each step of preprocessing, compilation, assembly, and linking?

Although we usually directly interpret and execute Python when writing it, the Python interpreter itself is a binary file compiled from C/C++. Understanding this process can also help you understand many underlying optimizations (such as header file protection and the difference between static/dynamic libraries).

Let's take a simple C codemain.cFor example:

// 这是注释,会被删掉
#include <stdio.h>  // 包含标准输入输出头文件
#define PI 3.14     // 定义宏

int main() {
    printf("PI is %.2f\n", PI);
    return 0;
}

The build process is divided into four steps:

  1. Pre-processing: Usegcc -E main.c -o main.iGenerate preprocessed files:
  • Delete all comments;
  • Expand all macros (e.g. putPIReplace with3.14);
  • Insert the contents of the included header file intact (stdio.hOnce expanded, there will be tens of thousands more lines of code).
  1. Compilation: Usegcc -S main.i -o main.sGenerate assembly language file:
  • Do lexical analysis first (split the code into "words", for exampleintmain();
  • Do grammatical analysis again (check if there are any grammatical errors, such as missing parentheses);
  • Finally, convert the preprocessed C code into human-readable assembly language.
  1. Assembly: Usegcc -c main.s -o main.oGenerate machine code target file:
  • Convert each assembly language instruction directly into binary machine code that the CPU can recognize;
  • Generate a symbol table (record the addresses corresponding to function names and variable names, but the addresses before linking are all false).
  1. Linking: usegcc main.o -o mainGenerate executable binary file:
  • Put multiple target files (such asmain.o+Written by myselfutils.o)merge;
  • Put the required libraries (such as the standard C librarylibc) merge in (static linking), or only record the path and function name of the library (dynamic linking);
  • Resolve all symbol references (replace false addresses with real memory addresses).

5. Virtual memory: paging and segmentation

💡 Core test points directly: Why cannot memory be "continuously allocated" to each process?

Early operating systems allocate memory continuously. For example, process A occupies 0100MB and process B occupies 100200MB. However, two big problems soon appeared:

  • External fragmentation: There are many small empty fragments in the middle, which add up to 100MB, but the continuous space is not enough and the new process cannot use it;
  • Security issue: Process A can directly access the memory of process B, which is easy to cause damage.

Virtual memory perfectly solves these two problems. It "draws a big pie" for each process - each process thinks that it occupies the entire memory space, but in fact the OS maps the virtual address and the physical address.

There are two common mapping methods:

  1. Paging: Physical memory management method.
  • Cut physical memory into small blocks of fixed size, called "page frames";
  • Cut the virtual address space of the process into small pieces of the same size, called "pages";
  • Use "page table" to record the correspondence between virtual pages and physical page frames;
  • Completely eliminated external fragmentation (internal fragmentation up to the size of a page, such as 4KB).
  1. Segmentation: Logical memory management method.
  • Divide the virtual address space according to the logical structure of the program, such as code segment, data segment, and stack segment;
  • Convenient for different processes to share the same piece of code (for example, both processes uselibcofprintf, no need to make two copies);
  • Convenient permission protection (for example, code segments can only be read but not written).
  1. Page Fault: When the virtual page accessed by the process is not in the physical memory, the OS will trigger an interrupt and "swap" the corresponding page on the disk into the physical memory (if the physical memory is full, an old page must be "kicked away", see the next section).

6. Page replacement algorithm

💡 Core test point directly: The physical memory is full, which old page is the most "cost-effective" to kick out?

When a page fault interrupt is triggered but the physical memory is full, a page replacement algorithm is needed to make a decision - the goal is to minimize the number of future page faults (a page fault requires access to the disk, which is hundreds of thousands of times slower than accessing the memory!).

Common algorithms:

  1. FIFO (First In, First Out) The simplest is to kick out the page that first entered the physical memory, but the performance is very poor - it may kick out frequently used pages (for example, your commonly used browser page was kicked out because it was opened early, and it will have to be reloaded next time you open it).

  2. LRU (Least Recently Used) The most commonly used algorithm in the industry, the core idea is "pages that have not been used in the past period of time will most likely not be used in the future." To implement, you need to record the last access time of each page, or use a stack/linked list to maintain it.

  3. LFU (Least Used) Count the number of visits to each page and kick out the one with the lowest number of visits. The disadvantage is "heavy historical baggage" - for example, a page was used 1,000 times yesterday and not once today, but it will still be retained.

  4. Clock (clock algorithm) The low-cost approximate implementation of LRU adds an "access bit" to each page and scans it cyclically with a pointer: if the access bit is 1, change it to 0 and continue; if the access bit is 0, kick it out directly.


7. Trigger mode: horizontal trigger (LT) vs edge trigger (ET)

💡 Core test points directly: Why is the default mode of epoll is LT, but Nginx uses ET?

These two modes are unique to epoll (select/poll only has LT). The difference between them is that the timing of the kernel sending you notifications is different:

Level Triggered

  • As long as the buffer has data to read/space to write, the kernel will keep urging you to process until the buffer is empty/full;
  • The advantage is that it is simple to develop, even if you miss a little bit of data, it will be called next timeepoll_waitYou will also receive notifications;
  • The disadvantage is that the number of repeated triggers is high, and if the processing speed is slow, CPU resources may be wasted.

Edge Triggered

  • Only notify once when the status changes (for example, the buffer changes from empty to data, from full to space);
  • The advantage is very few triggers, higher performance, and suitable for high concurrency scenarios;
  • But the requirements are extremely high:
  1. Must cooperate with Non-blocking I/O - otherwise if the buffer is not read/full, the process will be stuck;
  2. You must read in a loop until EAGAIN is returned, and write in a loop until EAGAIN is returned or the writing is completed - otherwise the missed data will not be notified next time and will be lost directly!

Therefore, the default LT is to allow novices to write correctly, and projects like Nginx that pursue ultimate performance will use a combination of ET + non-blocking I/O + loop reading and writing.