Embedding Wrapper Explained
LangChain provides an efficient framework for building custom applications on top of LLMs, so developers can quickly put an LLM's capabilities to use. LangChain also supports embeddings from a variety of large models, with built-in interfaces for embedding models such as OpenAI and LLAMA. However, LangChain does not have every large model built in. Instead, it offers extensibility by letting users define their own Embeddings types.
In this section, we take Zhipu AI as an example to describe how to customize Embeddings based on LangChain.
This part goes deeper into the technical details of LangChain and large-model API calls. Readers with time to spare can work through the implementation; others can use the code that follows as-is.
To implement custom Embeddings, define a class that inherits from LangChain's Embeddings base class and implement two methods: ① embed_query, which embeds a single string (the query); ② embed_documents, which embeds a list of strings (the documents).
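To make the two-method contract concrete, here is a minimal sketch that needs no network access. ToyEmbeddings and its character-counting "embedding" are hypothetical and exist only for illustration. The fallback base class is a stand-in we define ourselves so the sketch also runs where LangChain is not installed.

```python
from typing import List

try:
    # Real base class, if LangChain is installed.
    from langchain_core.embeddings import Embeddings
except ImportError:
    # Stand-in with the same two abstract methods (an assumption for portability).
    from abc import ABC, abstractmethod

    class Embeddings(ABC):
        @abstractmethod
        def embed_documents(self, texts: List[str]) -> List[List[float]]: ...

        @abstractmethod
        def embed_query(self, text: str) -> List[float]: ...


class ToyEmbeddings(Embeddings):
    """Hypothetical embedding: [character count, vowel count] for each text."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One vector per input string, in the same order.
        return [self.embed_query(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        vowels = sum(ch in "aeiou" for ch in text.lower())
        return [float(len(text)), float(vowels)]


emb = ToyEmbeddings()
query_vector = emb.embed_query("hello")                    # a single vector
doc_vectors = emb.embed_documents(["hello", "LangChain"])  # a list of vectors
```

Any object that provides these two methods can be passed wherever LangChain expects an Embeddings instance.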
First we import the required third-party libraries:
Here we define a custom Embeddings class that inherits from the Embeddings class:
embed_documents is the method that computes embeddings for a list of strings (List[str]). Here we override it to call the remote API through the ZhipuAI client that was instantiated during environment validation, and return the embedding results.
embed_query is the method that computes the embedding for a single text (str). Here we call the embed_documents method just defined and return the first sublist.
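The description above can be sketched roughly as follows. This is not necessarily the exact code from the original article. It assumes the zhipuai SDK exposes a ZhipuAI client with client.embeddings.create(model=..., input=...), that the client reads the API key from the ZHIPUAI_API_KEY environment variable, and that "embedding-2" is a valid model name. The optional client argument and the fake client are our own additions so the sketch can be checked offline.

```python
from types import SimpleNamespace
from typing import Any, List, Optional

try:
    from langchain_core.embeddings import Embeddings
except ImportError:
    class Embeddings:  # minimal stand-in, not the real base class
        pass


class ZhipuAIEmbeddings(Embeddings):
    """Sketch of a LangChain-compatible wrapper around Zhipu AI embeddings."""

    def __init__(self, client: Optional[Any] = None, model: str = "embedding-2"):
        if client is None:
            # Assumption: zhipuai is installed and ZHIPUAI_API_KEY is set.
            from zhipuai import ZhipuAI
            client = ZhipuAI()
        self.client = client
        self.model = model

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            # One request per text keeps the sketch simple.
            response = self.client.embeddings.create(model=self.model, input=text)
            vectors.append(response.data[0].embedding)
        return vectors

    def embed_query(self, text: str) -> List[float]:
        # Reuse embed_documents and return the first (only) sublist.
        return self.embed_documents([text])[0]


# Offline check with a hypothetical fake client that mimics the response shape.
class _FakeEmbeddingsAPI:
    def create(self, model: str, input: str) -> SimpleNamespace:
        vector = [float(len(input)), 1.0]
        return SimpleNamespace(data=[SimpleNamespace(embedding=vector)])


fake_client = SimpleNamespace(embeddings=_FakeEmbeddingsAPI())
embedder = ZhipuAIEmbeddings(client=fake_client)
doc_vectors = embedder.embed_documents(["ab", "abc"])
query_vector = embedder.embed_query("abcd")
```

With a real API key configured, calling ZhipuAIEmbeddings() without arguments would create the remote client instead of the fake one.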
You can also add preprocessing to these methods before requesting the embedding. For example, if a text is very long, consider splitting it into segments so that it does not exceed the model's maximum token limit. Such improvements are left to the reader; what is shown here is only a simple demo.
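As one possible form of such preprocessing, the sketch below splits long texts before embedding them. It uses character count as a crude proxy for tokens, which is an assumption: a real limit is measured in tokens and depends on the model. The helper names split_text and embed_long_texts and the default of 500 characters are hypothetical.

```python
from typing import Callable, List


def split_text(text: str, max_chars: int) -> List[str]:
    """Cut text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def embed_long_texts(
    texts: List[str],
    embed_documents: Callable[[List[str]], List[List[float]]],
    max_chars: int = 500,  # hypothetical default, not a documented limit
) -> List[List[List[float]]]:
    """Embed each text chunk by chunk; returns one list of chunk vectors per text."""
    return [embed_documents(split_text(text, max_chars)) for text in texts]


chunks = split_text("abcdefghij", max_chars=4)
```

How the chunk vectors are combined afterwards (kept separate, averaged, and so on) is a design choice that depends on the application.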
With the steps above, we have defined an embedding interface based on LangChain and Zhipu AI. We encapsulate this code in the zhipuai_embedding.py file.
The source code for this article is available here. To reproduce the results, you can download and run it.

