Elasticsearch practical tutorial
Have slow SQL LIKE queries ever given you a headache? Do you need to analyze large volumes of unstructured data quickly? Elasticsearch is a powerful tool for these problems: a distributed, RESTful search and analytics engine built on Apache Lucene. It suits full-text search, log analysis, real-time data insights, and similar scenarios.
1. Get started quickly: installation and startup
We use Docker for deployment, which is the fastest way to start a local development environment without having to deal with system dependencies.
1.1 Docker single node deployment
First pull the official stable version image (this article uses 8.12.0):
Then run the container (Note: Security features are disabled for local development and must be enabled for production environments):
Wait about 30 seconds and check whether the startup is successful:
A JSON response containing cluster_name and version indicates success.
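The same check can be scripted. The sketch below uses only the Python standard library and assumes the node is reachable at http://localhost:9200 (the default HTTP port) with security disabled as described above. The helper names fetch_root and looks_healthy are illustrative and not part of the original tutorial.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address; adjust to your port mapping


def looks_healthy(info: dict) -> bool:
    """Return True if the root response carries the fields the tutorial checks for."""
    return "cluster_name" in info and "number" in info.get("version", {})


def fetch_root(url: str = ES_URL, timeout: float = 5.0) -> dict:
    """GET the root endpoint of the node and parse the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Once the container is up, looks_healthy(fetch_root()) should return True. While the node is still starting, fetch_root raises a connection error, so retry after a few seconds.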
1.2 Docker Compose deployment
To make later management easier, use a docker-compose.yml file:
Start command:
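2. Core concepts primer
A relational database makes a useful analogy for the core concepts of Elasticsearch: an index plays roughly the role of a database (or, now that types are gone, a table), a document corresponds to a row, a field to a column, and a mapping to the table schema.
Mapping types were deprecated in Elasticsearch 7.x and removed in 8.x, so you can ignore them.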
3. Index management: define data structure
An index is a container for documents. Before creating an index, define its mapping (similar to a table schema in a database).
3.1 Create blog post index
We create an index for storing blog posts, containing common fields:
- text: used for full-text search; the analyzer splits the value into terms;
- keyword: used for exact matching, sorting, and aggregations; the value is not analyzed;
- date / integer: date and numeric types, used for range queries and aggregations.
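A mapping along these lines could be expressed as follows. This is a minimal sketch: the field names publish_date and views, and the shard and replica settings, are assumptions chosen for illustration, since the original request body is not shown. The title, content, tags, and author fields are the ones the later search examples refer to.

```python
import json

INDEX = "blog_posts"

# Request body for: PUT /blog_posts
create_index_body = {
    # Assumed settings for a single-node development setup
    "settings": {"number_of_shards": 1, "number_of_replicas": 0},
    "mappings": {
        "properties": {
            "title": {"type": "text"},         # full-text searchable
            "content": {"type": "text"},       # full-text searchable
            "author": {"type": "keyword"},     # exact match, sorting, aggregations
            "tags": {"type": "keyword"},       # exact match, aggregations
            "publish_date": {"type": "date"},  # assumed field name
            "views": {"type": "integer"},      # assumed field name
        }
    },
}

print(f"PUT /{INDEX}")
print(json.dumps(create_index_body, indent=2))
```

The printed request line and body can be pasted into the Kibana Dev Tools console or sent with any HTTP client.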
Check whether the index is created successfully:
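One way to verify the result is to request GET /blog_posts/_mapping and inspect the returned fields. The helper below is a hypothetical sketch that works on the parsed JSON response; the sample response is hand-written for illustration and is not captured from a real cluster.

```python
def mapped_fields(mapping_response: dict, index: str = "blog_posts") -> list:
    """List the field names found in a GET /<index>/_mapping response."""
    properties = mapping_response[index]["mappings"].get("properties", {})
    return sorted(properties)


# Hand-written sample in the shape returned by GET /blog_posts/_mapping
sample = {
    "blog_posts": {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "author": {"type": "keyword"},
            }
        }
    }
}
print(mapped_fields(sample))  # → ['author', 'title']
```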
4. Document operation: CRUD basics
Now we add, query, update, and delete documents in the blog_posts index.
4.1 Add documents
You can specify the ID or let Elasticsearch automatically generate it:
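Both variants can be sketched as below. The function index_request, the send helper, the sample document, and the base URL are assumptions for illustration. The REST paths themselves are the standard document endpoints: PUT /<index>/_doc/<id> for an explicit ID and POST /<index>/_doc for an auto-generated one.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed local address


def index_request(document: dict, doc_id=None, index: str = "blog_posts"):
    """Return (method, path, body) for indexing one document."""
    if doc_id is None:
        # Elasticsearch generates the ID
        return ("POST", f"/{index}/_doc", document)
    # Caller chooses the ID
    return ("PUT", f"/{index}/_doc/{doc_id}", document)


def send(method: str, path: str, body=None, base_url: str = ES_URL) -> dict:
    """Send one JSON request and return the parsed JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


post = {"title": "Hello Elasticsearch", "author": "Zhang San", "tags": ["tutorial"]}
print(index_request(post, doc_id=1)[:2])
print(index_request(post)[:2])
```

With a running node, send(*index_request(post, doc_id=1)) would store the document under ID 1.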
4.2 Query documents
Get a single document based on ID:
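The request is GET /blog_posts/_doc/<id>. The response wraps the stored document in a _source field next to metadata such as found. The helper below is a small sketch (its name and the sample responses are illustrative) that unwraps the parsed response.

```python
def extract_source(response: dict):
    """Return the stored document from a GET /<index>/_doc/<id> response, or None."""
    if not response.get("found"):
        return None
    return response["_source"]


# Hand-written samples in the shape of a get-document response
hit = {"_index": "blog_posts", "_id": "1", "found": True, "_source": {"title": "Hello"}}
miss = {"_index": "blog_posts", "_id": "2", "found": False}

print(extract_source(hit))
print(extract_source(miss))
```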
4.3 Update documents
Partially update the document (without rewriting the entire document):
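A partial update goes to POST /<index>/_update/<id>, with the changed fields placed under a doc key; fields that are not listed keep their current values. The function name and the views field below are assumptions for illustration.

```python
def partial_update_request(doc_id, changes: dict, index: str = "blog_posts"):
    """Return (method, path, body) for a partial document update."""
    # Only the fields under "doc" are merged into the stored document
    return ("POST", f"/{index}/_update/{doc_id}", {"doc": changes})


method, path, body = partial_update_request(1, {"views": 100})
print(method, path, body)
```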
4.4 Delete documents
Delete documents based on ID:
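The request is DELETE /blog_posts/_doc/<id>, and the response reports the outcome in its result field. A minimal sketch, with hypothetical helper names:

```python
def delete_request(doc_id, index: str = "blog_posts"):
    """Return (method, path, body) for deleting one document by ID."""
    return ("DELETE", f"/{index}/_doc/{doc_id}", None)


def was_deleted(response: dict) -> bool:
    """True if the delete response reports that a document was removed."""
    return response.get("result") == "deleted"


print(delete_request(1))
```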
5. Search DSL: Core Features
Elasticsearch's Query DSL is its most powerful feature, using JSON to build complex search requests.
5.1 Basic search
1. Search all documents
2. Full text search (match)
Search the content field for documents containing "elasticsearch":
3. Multi-field search (multi_match)
Search across the title, content, and tags fields, giving title the highest weight (^3):
4. Exact match (term)
Find documents whose author is "Zhang San" (the field must be of type keyword):
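The four searches above could look like this as request bodies for GET /blog_posts/_search. The search text used in the multi_match example is an assumption; the field names, the ^3 boost, and the author value come from the descriptions above.

```python
import json

# 1. Match every document in the index
match_all_query = {"query": {"match_all": {}}}

# 2. Full-text search on one analyzed field
match_query = {"query": {"match": {"content": "elasticsearch"}}}

# 3. Same text across several fields; "^3" triples the weight of title
multi_match_query = {
    "query": {
        "multi_match": {
            "query": "elasticsearch",
            "fields": ["title^3", "content", "tags"],
        }
    }
}

# 4. Exact, unanalyzed match; author must be a keyword field
term_query = {"query": {"term": {"author": "Zhang San"}}}

for name, body in [
    ("match_all", match_all_query),
    ("match", match_query),
    ("multi_match", multi_match_query),
    ("term", term_query),
]:
    print(f"# {name}: GET /blog_posts/_search")
    print(json.dumps(body, ensure_ascii=False))
```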
5.2 Boolean query (bool)
Combine multiple conditions using must (AND), filter (filtering, does not affect scoring), and should (OR):
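A bool query combining the three clause types might look like the sketch below. The specific conditions (the date cutoff, the tutorial tag, and the publish_date field name) are assumptions for illustration.

```python
import json

bool_query = {
    "query": {
        "bool": {
            # Must match, and contributes to the relevance score
            "must": [{"match": {"content": "elasticsearch"}}],
            # Must match, but is not scored
            "filter": [
                {"term": {"author": "Zhang San"}},
                {"range": {"publish_date": {"gte": "2024-01-01"}}},
            ],
            # Optional here: because must/filter clauses are present,
            # a should match only raises the score of a document
            "should": [{"term": {"tags": "tutorial"}}],
        }
    }
}

print("GET /blog_posts/_search")
print(json.dumps(bool_query, indent=2, ensure_ascii=False))
```

If a bool query contains only should clauses, at least one of them has to match, which is where the OR reading comes from.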
:::tip Prefer filter
Clauses in filter context skip relevance scoring, and their results can be cached, so they are usually faster than equivalent must clauses.
:::
5.3 Sorting results
Sort by publication time in descending order, then by relevance score in descending order:
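As a sketch, the sort is expressed as an ordered list in the search body, where later entries break ties left by earlier ones. The date field name publish_date and the query itself are assumptions.

```python
import json

sorted_query = {
    "query": {"match": {"content": "elasticsearch"}},
    "sort": [
        {"publish_date": {"order": "desc"}},  # newest first
        {"_score": {"order": "desc"}},        # then most relevant first
    ],
}

print("GET /blog_posts/_search")
print(json.dumps(sorted_query, indent=2))
```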
6. Python integration: practical development
Use the official elasticsearch Python client to interact with the cluster.
6.1 Install client
6.2 Basic connections and operations
First connect to the local cluster:
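A minimal connection sketch, assuming the 8.x elasticsearch package, a node at http://localhost:9200, and security disabled as in the Docker setup above. The function name connect is illustrative.

```python
def connect(url="http://localhost:9200"):
    """Create a client and fail fast if the node is unreachable."""
    # Imported inside the function so this file loads even where
    # the elasticsearch package is not installed.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(url)
    if not es.ping():
        raise ConnectionError(f"Cannot reach Elasticsearch at {url}")
    return es
```

Calling es = connect() returns a client that the later helpers can reuse. With security enabled, credentials and TLS settings would have to be passed as well.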
Then write several commonly used auxiliary functions:
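The helpers could be sketched as below. The function names are hypothetical, and each takes the client as its first argument so it stays easy to test. The calls assume the 8.x client, where index, search, and delete accept keyword arguments as shown.

```python
INDEX = "blog_posts"


def add_post(es, post: dict, post_id=None) -> str:
    """Index one post; Elasticsearch generates the ID when post_id is None."""
    resp = es.index(index=INDEX, id=post_id, document=post)
    return resp["_id"]


def search_posts(es, keyword: str, size: int = 10) -> list:
    """Full-text search over title, content, and tags; returns the stored documents."""
    resp = es.search(
        index=INDEX,
        query={
            "multi_match": {
                "query": keyword,
                "fields": ["title^3", "content", "tags"],
            }
        },
        size=size,
    )
    return [hit["_source"] for hit in resp["hits"]["hits"]]


def delete_post(es, post_id) -> bool:
    """Delete one post by ID; the client raises an error if the ID does not exist."""
    resp = es.delete(index=INDEX, id=post_id)
    return resp["result"] == "deleted"
```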
6.3 Test code
Add an article and perform a search:
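A self-contained sketch of such a test is shown below. It assumes an 8.x client object is passed in; the article contents and the ID demo-1 are made up for illustration. The refresh="wait_for" argument makes the call wait until the document is searchable, which avoids an empty result when searching immediately after indexing.

```python
def run_demo(es, index="blog_posts"):
    """Index one sample article, then search for it and return matching titles."""
    article = {
        "title": "Elasticsearch quick start",
        "content": "A short introduction to elasticsearch and its Query DSL.",
        "author": "Zhang San",
        "tags": ["elasticsearch", "tutorial"],
    }
    # Wait for the next refresh so the search below can see the document
    es.index(index=index, id="demo-1", document=article, refresh="wait_for")

    resp = es.search(index=index, query={"match": {"content": "elasticsearch"}})
    return [hit["_source"]["title"] for hit in resp["hits"]["hits"]]
```

With a running node, run_demo(Elasticsearch("http://localhost:9200")) should return a list that includes "Elasticsearch quick start".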
7. Quick optimizations and best practices
Here are some optimization tips every beginner should know:
- Batch indexing: use the bulk API to add multiple documents in one request and reduce network overhead;
- Restrict returned fields: use _source to fetch only the fields you need;
- Use index aliases: access the index through an alias to avoid downtime during re-indexing;
- Avoid over-sharding: use 1-3 shards for small indexes;
- Monitor cluster status: use GET /_cluster/health to check the status (green = all shards allocated, yellow = all primary shards allocated but some replicas unassigned, red = at least one primary shard unassigned).
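The first two tips can be sketched with the standard library alone. The bulk API expects newline-delimited JSON sent to POST /_bulk with the Content-Type application/x-ndjson: one action line followed by one document line per item, and a trailing newline at the end. The helper name bulk_body and the sample documents are assumptions for illustration.

```python
import json


def bulk_body(index: str, docs) -> str:
    """Build an NDJSON payload for POST /_bulk from (doc_id, document) pairs."""
    lines = []
    for doc_id, doc in docs:
        # Action line: which index and ID the next line should be written to
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk API requires the payload to end with a newline
    return "\n".join(lines) + "\n"


# Return only two fields per hit instead of the whole document
source_filtered_query = {
    "_source": ["title", "author"],
    "query": {"match_all": {}},
}

payload = bulk_body("blog_posts", [("1", {"title": "a"}), ("2", {"title": "b"})])
print(payload, end="")
```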
Summary
This article covers the core of Elasticsearch: quick installation, core concepts, index and document management, Query DSL search, and Python integration. With this material, you can add basic search functionality to your project.
Advanced topics such as aggregations, cluster deployment, and security configuration will be covered in future updates, so stay tuned!

