title: Python office automation: Excel file processing (Part 1)
description: Master the core skills of operating traditional Excel (.xls) files in Python, and learn to use xlrd, xlwt and xlutils to automate data reading, writing, style customization and formula calculation.
Reading and writing Excel files with Python (Basics)
1. Excel introduction and technology selection
Excel is the "universal collaboration language" of business data processing, and almost every job that involves reports or statistics has to deal with it. The repetitive manual work, however, is a real headache: manually merging 20 subsidiaries' exported .xls weekly reports every Friday, or filtering and summing thousands of rows of old financial data at month end. These scenarios come up again and again in Daoman Python AI's practical coaching. The reason is simple: the internal systems of many state-owned enterprises and traditional small and medium-sized businesses, as well as reports exported by old equipment, are still locked to the Excel 97-2003 standard (the .xls format). The newer openpyxl library cannot handle such files, so mastering the library combination built for legacy Excel files has become a must-learn skill for office automation.
A quick tour of the toolchain
The Python ecosystem has a clear division of labor for .xls files:
- Read legacy files (.xls): xlrd opens workbooks, locates cells, and handles special formats such as dates and merged cells.
- Write legacy files (.xls): xlwt creates a new workbook, writes data, sets fonts/colors/borders, and can even write Excel formulas.
- Modify legacy files (.xls): xlutils uses its copy function to convert the read-only xlrd object into a writable xlwt object, which makes it possible to "modify the original file".
One-step environment setup
💡 Pay attention!
Starting from version 2.0.0, xlrd removed its support for the .xlsx format and reads only .xls files. To make sure the examples in this series behave as described, pin the classic version 1.2.0 at install time, for example with pip install xlrd==1.2.0 xlwt xlutils.
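To confirm that the environment matches the versions assumed in this series, a small check like the one below may help. It is only a sketch: the helper names installed_versions and is_legacy_xlrd are made up for illustration, and the code relies solely on the standard library's importlib.metadata (Python 3.8 or newer).

```python
from importlib import metadata


def installed_versions(packages=("xlrd", "xlwt", "xlutils")):
    """Return {package name: installed version string, or None if missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return versions


def is_legacy_xlrd(version):
    """True for xlrd 1.x, i.e. releases before 2.0.0 dropped .xlsx support."""
    major = int(version.split(".")[0])
    return major < 2


# Report what is installed; a missing package shows up as None.
for package, version in installed_versions().items():
    print(package, version)
```

If xlrd is reported as 2.x, reinstalling it pinned to 1.2.0 brings the environment in line with the rest of the tutorial.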
After the installation is complete, we can officially start our automation journey.
2. Read Excel file
The logic for reading a .xls file is almost identical to what we do when opening Excel by hand: open the file (Workbook) → find and switch to the target worksheet (Sheet) → locate the cell (Cell) → handle any special format in that cell. Follow this sequence and the code stays very clear.
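The Workbook → Sheet → Cell sequence can be sketched roughly as follows. The helper names open_book and locate_cell are hypothetical; the xlrd calls used (open_workbook, sheet_by_name, nrows, ncols, cell_value) are part of its public API, and xlrd is imported lazily so that the navigation logic can be read on its own.

```python
def open_book(path):
    """Step 1: open the .xls file and return an xlrd workbook."""
    import xlrd  # assumes xlrd 1.2.0 is installed

    return xlrd.open_workbook(path)


def locate_cell(book, sheet_name, row_idx, col_idx):
    """Steps 2 and 3: switch to a worksheet, then read one cell.

    Row and column indexes are 0-based, as in xlrd.
    """
    sheet = book.sheet_by_name(sheet_name)  # Workbook -> Sheet
    if not (0 <= row_idx < sheet.nrows and 0 <= col_idx < sheet.ncols):
        raise IndexError(f"cell ({row_idx}, {col_idx}) is outside the sheet")
    return sheet.cell_value(row_idx, col_idx)  # Sheet -> Cell


# Example usage (the sheet name "Sheet1" is an assumption):
# book = open_book("阿里巴巴2020年股票数据.xls")
# print(locate_cell(book, "Sheet1", 0, 0))
```

The bounds check is a design choice for clearer error messages; xlrd itself would also raise an IndexError for an out-of-range cell.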
Hands-on: batch reading Alibaba's 2020 stock data (simplified version)
Let's take a simplified "Alibaba stock trading day" dataset as an example. The file is named 阿里巴巴2020年股票数据.xls and contains fields such as date, opening price, and closing price. The core exercise here is date format conversion, which is also the pitfall xlrd beginners fall into most often: internally, Excel does not store a date as a string like "2020-01-02" but as a plain number, with 1900-01-01 recorded as 1 and each following day adding 1.
The code goes directly below:
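A minimal sketch of such a reader is shown here, under a few stated assumptions: xlrd 1.2.0 is installed, the data sits on the first worksheet, and the helper names date_tuple_to_text and read_stock_rows are illustrative rather than taken from any library.

```python
def date_tuple_to_text(date_tuple):
    """Format a (year, month, day, hour, minute, second) tuple as YYYY-MM-DD."""
    year, month, day = date_tuple[:3]
    return f"{year:04d}-{month:02d}-{day:02d}"


def read_stock_rows(path):
    """Return all rows of the first sheet, with date cells rendered as text."""
    import xlrd  # imported lazily; assumes xlrd 1.2.0 is installed

    wb = xlrd.open_workbook(path, formatting_info=True)
    sheet = wb.sheet_by_index(0)  # assumption: data is on the first sheet

    rows = []
    for row_idx in range(sheet.nrows):
        row = []
        for col_idx in range(sheet.ncols):
            value = sheet.cell_value(row_idx, col_idx)
            # Date cells hold a serial number; convert it to a date tuple.
            if sheet.cell_type(row_idx, col_idx) == xlrd.XL_CELL_DATE:
                value = date_tuple_to_text(
                    xlrd.xldate_as_tuple(value, wb.datemode)
                )
            row.append(value)
        rows.append(row)
    return rows


# Example usage:
# for row in read_stock_rows("阿里巴巴2020年股票数据.xls"):
#     print(row)
```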
Key points of the code:
- formatting_info=True: This parameter matters because it tells xlrd to keep as much of the original formatting information as possible when opening the file. Include it whenever you later need to inspect cell formatting (such as number formats or borders) or modify the file with xlutils.
- cell_type(row_idx, col_idx): Returns the data type constant of the cell. XL_CELL_DATE, for example, means the cell holds a date.
- xldate_as_tuple(value, datemode): Converts Excel's internal serial number into a real date tuple. The datemode argument is supplied by the workbook and distinguishes the 1900 date system (the common one) from the 1904 date system (used by some old Mac files). Passing wb.datemode directly is the safe choice.
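To make the serial-number arithmetic and the role of datemode concrete, here is a pure-Python illustration that uses only the standard library. It is not xlrd's implementation, and the function name excel_serial_to_datetime is hypothetical; the epochs follow the 1900 and 1904 date systems described above.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(serial, datemode=0):
    """Convert an Excel date serial to a datetime.

    datemode 0 is the 1900 date system, datemode 1 the 1904 date system.
    """
    if datemode == 1:
        # 1904 system: serial 0 is 1904-01-01.
        epoch = datetime(1904, 1, 1)
    elif int(serial) == 60:
        # The 1900 system treats 1900 as a leap year, so serial 60 is the
        # non-existent 1900-02-29 and has no real calendar date.
        raise ValueError("serial 60 is ambiguous in the 1900 date system")
    elif serial < 60:
        # Before the phantom leap day: serial 1 is 1900-01-01.
        epoch = datetime(1899, 12, 31)
    else:
        # After the phantom leap day the offset shifts by one day.
        epoch = datetime(1899, 12, 30)
    return epoch + timedelta(days=serial)


print(excel_serial_to_datetime(1))                    # 1900-01-01 00:00:00
print(excel_serial_to_datetime(43832))                # 2020-01-02 00:00:00
print(excel_serial_to_datetime(42370, datemode=1))    # 2020-01-02 00:00:00
```

The same calendar day maps to different serial numbers in the two systems (43832 versus 42370 here), which is why the workbook's own datemode should always be passed along.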
Run this code and you will see neatly formatted dates alongside the corresponding stock prices. At this point the whole workflow for reading .xls files is in place: no matter how many dates or rows of data the table holds, the same approach processes them automatically.
3. Summary of this article
Through this article, you have mastered:
- why the .xls format still matters, and the roles of the xlrd/xlwt/xlutils toolchain;
- using xlrd to open a workbook, locate a worksheet, and traverse all cells;
- handling the trickiest part of Excel, date serial numbers, and converting them into a human-readable format.
These reading skills already cover a large share of report automation tasks, such as batch summaries, conditional filtering, and cross-file integration. In the next tutorial, we will use xlwt to create new .xls files, write data, and customize attractive table styles, closing the "read-write-modify" loop.

