💡 Do you often need to save temporary configurations, crawler results or API response cache during development? It’s too heavy to use a database, and it’s troublesome to parse it manually with ordinary text? Today’s protagonist—Python standard libraryjson, add some third-party gadgets to easily handle lightweight JSON text storage and interaction~
JSON processing tutorial in Python
JSON (JavaScript Object Notation) is a lightweight data exchange format that is easy to read and parse for both humans and machines. It is now a "universal language" used in almost all scenarios. This tutorial takes you from basics to advanced to quickly master JSON processing in Python.
📌 JSON Basics
JSON data structure
JSON only supports 6 core data structures, which is very concise:
- Object 🗂️: curly braces
{}Wrapped key-value pair, the key must be a double-quoted string - Array 📊: square brackets
[]Wrapped ordered collection of values, mixed types possible (but specification recommended) - Value: can be a string (double quotes), number, Boolean value (
true/false)、null, object or array
🔄 Correspondence between JSON and Python types
PythonjsonThe module will automatically do bidirectional mapping, just remember this table clearly:
📖 Read JSON data
Loading from string (json.loads)
loads= load string, suitable for processing strings returned by API, multi-line configuration text, etc.
Load from file (json.load)
load= load file object, remember to usewithStatement ** automatically closes the file.
✍️ Write JSON data
Convert to JSON string (json.dumps)
dumps= dump string, three must-use practical parameters should be remembered:
ensure_ascii=False: Reserve Chinese and other non-ASCII characters (if not added, it will become\uXXXXGarbled characters)indent=2or4:Set the indentation level to make it easier for people to read.sort_keys=True: Sort by dictionary key, the output result is more stable (easy to compare)
Write to JSON file (json.dump)
dump= dump file object, parameters anddumpsconsistent.
🚀 Advanced usage
Processing custom classes/datetime objects
standardjsonModules cannot directly serialize custom classes (such asUser)ordatetime, two methods can be used:
Method 1: CustomizedefaultFunction (simple and flexible)
existdumps/dumpwhen passed indefault, tells the module how to handle unknown objects:
Method 2: Inheritancejson.JSONEncoder(Suitable for packaging)
If the same serialization logic is used in multiple places, it can be encapsulated into a subclass and passed inclsparameter:
Automatically restore special types when parsing (object_hook)
When reading JSON, each time a JSON object (that is, Python'sdict), will be calledobject_hookfunction, here we can restore the time in ISO format todatetime, or put ordinarydictRestore toUser:
Streaming large JSON files (usingijson)
If the JSON file exceeds the memory limit (such as several G of log or crawler data), you can use a third-party libraryijsonRead chunk by chunk instead of loading the entire file at once:
✅ Best Practices
- Encoding issues must be paid attention to: read and write files
encoding='utf-8', serialization plusensure_ascii=False - Required for file operations
with: Automatically manage file handles to prevent leaks - Error handling is required: at least capture
FileNotFoundError(File does not exist),json.JSONDecodeError(Parse error, Python 3.5+),TypeError(serialization error) - Streaming must be used for large data:
ijson、pandas.read_json(chunksize=...)All are good choices - Security must be considered: Do not parse JSON from untrusted sources to prevent malicious code injection (although Python
jsonmodule ratioevalSafe, but still cautious)
🎉 Summary
Python's standard libraryjsonIt can already cover 90% of daily JSON processing needs, withijsonGadgets such as lightweight text storage, API interaction, and temporary data caching can be easily handled~ Next time you encounter such needs, don’t rush to open a database!

