Chapter 4 Building RAG Applications
4.1 Connect LLM to LangChain
LangChain provides an efficient framework for building custom applications on top of LLMs, letting developers quickly put the capabilities of an LLM to work. It supports a variety of large models and has built-in calling interfaces for models such as OpenAI's and LLaMA. Not every large model is built in, however; for those, LangChain stays extensible by allowing users to define custom LLM types.
4.1.1 Call ChatGPT based on LangChain
LangChain wraps a variety of large models. Through its interface you can easily call ChatGPT and integrate it into your own applications built on LangChain. Here we briefly describe how to call ChatGPT through the LangChain interface.
Note that calling ChatGPT based on the LangChain interface also requires configuring your personal key. The configuration method is the same as above.
From langchain.chat_models, import ChatOpenAI, OpenAI's dialogue model. Besides OpenAI, langchain.chat_models also integrates other dialogue models; see the official LangChain documentation for details.
If langchain-openai is not installed, please run the following code first!
Next, instantiate the ChatOpenAI class. You can pass hyperparameters at instantiation to control the answers, for example the temperature parameter.
The cell above assumes that your OpenAI API key is set in an environment variable. If you wish to specify the API key manually, use the following code:
As you can see, the ChatGPT-3.5 model is called by default. In addition, several commonly used hyperparameter settings include:
· model_name: The model to be used, the default is ‘gpt-3.5-turbo’, and the parameter settings are consistent with the OpenAI native interface parameter settings.
· temperature: temperature coefficient, the value is the same as the native interface.
· openai_api_key: OpenAI API key. If you do not use environment variables to set the API Key, you can also set it during instantiation.
· openai_proxy: Set the proxy. If you do not use environment variables to set the proxy, you can also set it at instantiation time.
· streaming: Whether to use streaming, that is, to output the model's answer piece by piece as it is generated. The default is False; streaming is not covered further here.
· max_tokens: The maximum number of tokens output by the model. The meaning and value are the same as above.
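The parameters listed above can be gathered in one place before the model is instantiated. The following is a minimal sketch using only the standard library. `ChatSettings` and `resolve_api_key` are hypothetical helper names that are not part of LangChain, the default values are illustrative, and the environment variable name `OPENAI_API_KEY` is an assumption based on the usual OpenAI convention.

```python
import os
from dataclasses import asdict, dataclass
from typing import Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env_var: str = "OPENAI_API_KEY") -> Optional[str]:
    """Return the explicitly passed key if given, else the environment value."""
    if explicit:
        return explicit
    return os.environ.get(env_var)


@dataclass
class ChatSettings:
    """Hypothetical container mirroring the hyperparameters listed above."""
    model_name: str = "gpt-3.5-turbo"
    temperature: float = 0.0
    max_tokens: Optional[int] = None
    streaming: bool = False
    openai_api_key: Optional[str] = None

    def as_kwargs(self) -> dict:
        """Resolve the key, then drop unset values so library defaults apply."""
        kwargs = asdict(self)
        kwargs["openai_api_key"] = resolve_api_key(self.openai_api_key)
        return {k: v for k, v in kwargs.items() if v is not None}


settings = ChatSettings(temperature=0.0, max_tokens=256)
print(sorted(settings.as_kwargs()))
```

The resulting dictionary could then be unpacked into the model constructor, for example `ChatOpenAI(**settings.as_kwargs())`, assuming the installed version accepts these keyword names.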
Once we have initialized the LLM, we can try it out! Let’s ask “Please introduce yourself!”
AIMessage(content='Hello, I am an intelligent assistant dedicated to providing you with various services and help. I can answer your questions, provide information and suggestions, and help you solve problems. If you have any needs, please feel free to tell me and I will try my best to help you. Thank you for choosing me as your assistant! If you have any questions, please feel free to ask me.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 109, 'prompt_tokens': 20, 'total_tokens': 129, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-c611b32a-4adf-47af-9b97-6dda68a117e1-0', usage_metadata={'input_tokens': 20, 'output_tokens': 109, 'total_tokens': 129})
When we develop large model applications, in most cases the user's input is not passed directly to the LLM. Typically, the input is added to a larger piece of text called a prompt template, which provides additional context about the specific task at hand.
PromptTemplates help solve this problem! They bundle all the logic for going from user input to a fully formatted prompt. This can start out very simply; for example, a prompt for producing the string above is:
We need to construct a personalized Template first:
Next, let's take a look at the completed prompt template:
'Please translate the text separated by three backticks into English! text:我带着比身体重的行李,游入尼罗河底,经过几道闪电 看到一堆光圈,不确定是不是这里。\n'
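Conceptually, a prompt template is a string with named placeholders that are filled in at call time. The sketch below mimics that idea with the standard library only. `SimplePromptTemplate` is a hypothetical stand-in, not LangChain's own class, and the template wording is illustrative.

```python
from string import Formatter


class SimplePromptTemplate:
    """Hypothetical stand-in: a string with named placeholders."""

    def __init__(self, template: str):
        self.template = template
        # Collect placeholder names such as {text} from the template.
        self.input_variables = [
            name for _, name, _, _ in Formatter().parse(template) if name
        ]

    def format(self, **values: str) -> str:
        """Fill every placeholder, failing loudly if one is missing."""
        missing = [v for v in self.input_variables if v not in values]
        if missing:
            raise KeyError(f"missing variables: {missing}")
        return self.template.format(**values)


template = SimplePromptTemplate(
    "Please translate the following text into English.\ntext: {text}"
)
print(template.input_variables)
print(template.format(text="我带着比身体重的行李"))
```

Keeping the variable names alongside the template makes it possible to validate inputs before any model call is made.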
We know that the interface of a chat model is based on messages, not raw text. PromptTemplates can also be used to generate message lists. In this case, the prompt contains not only the input content but also information about each message (its role, its position in the list, and so on). Typically, a ChatPromptTemplate is a list of ChatMessageTemplate objects. Each ChatMessageTemplate contains instructions for formatting that chat message: its role as well as its content.
Let's look at an example together:
[SystemMessage(content='You are a translation assistant who can help me translate Chinese into English.', additional_kwargs={}, response_metadata={}), HumanMessage(content='I swam into the bottom of the Nile River with luggage that was heavier than my body. After several lightning bolts, I saw a bunch of circles of light, not sure if they were here.', additional_kwargs={}, response_metadata={})]
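The message list shown above can be thought of as a list of (role, template) pairs filled with the same set of variables. The following standard-library sketch illustrates that structure. `Message` and `format_messages` are hypothetical names, and the placeholder names `input_language` and `output_language` are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    role: str      # "system", "human" or "ai"
    content: str


def format_messages(templates: List[Tuple[str, str]], **values: str) -> List[Message]:
    """Fill each (role, template) pair and keep the original order."""
    return [Message(role, text.format(**values)) for role, text in templates]


chat_template = [
    ("system", "You are a translation assistant who can help me translate "
               "{input_language} into {output_language}."),
    ("human", "{text}"),
]
messages = format_messages(
    chat_template,
    input_language="Chinese",
    output_language="English",
    text="我带着比身体重的行李,游入尼罗河底",
)
for message in messages:
    print(f"{message.role}: {message.content}")
```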
Next, let us call the llm defined earlier with these messages to output the answer:
OutputParsers convert the raw output of the language model into a format that can be used downstream. There are several main types of OutputParsers, including:
- Convert LLM text to structured information (e.g. JSON)
- Convert ChatMessage to string
- Convert extra information returned by calls other than messages (such as OpenAI function calls) to strings
Finally, we pass the model output to output_parser, which is a BaseOutputParser, meaning it accepts either a string or a BaseMessage as input. StrOutputParser is particularly simple: it converts any input into a string.
As the result above shows, we successfully used the output parser to convert the ChatMessage-type output into a string.
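To make the role of an output parser concrete, here is a small standard-library sketch of the first two parser types listed earlier: message to string, and text to structured JSON. `FakeMessage`, `parse_to_str` and `parse_to_json` are hypothetical names and do not reproduce LangChain's implementation.

```python
import json
from typing import Any, Union


class FakeMessage:
    """Hypothetical stand-in for a chat message that carries a .content string."""

    def __init__(self, content: str):
        self.content = content


def parse_to_str(output: Union[str, FakeMessage]) -> str:
    """Accept a raw string or a message object and return plain text."""
    return output if isinstance(output, str) else output.content


def parse_to_json(output: Union[str, FakeMessage]) -> Any:
    """Turn model text into structured data; raises ValueError on invalid JSON."""
    return json.loads(parse_to_str(output))


print(parse_to_str(FakeMessage("I swam to the bottom of the Nile.")))
print(parse_to_json(FakeMessage('{"language": "English"}')))
```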
We can now combine all of this into a chain. This chain will take the input variables, pass those variables to the prompt template to create the prompt, pass the prompt to the language model, and then pass the output through the (optional) output parser. Next we will use LCEL syntax to quickly implement a chain. Let’s see it in action!
Let’s test another example:
'I dived to the bottom of the Nile carrying luggage heavier than my body. After passing through a few bolts of lightning, I saw a bunch of rings and wasn't sure if this was the destination. '
What is LCEL? LCEL (LangChain Expression Language) is a newer syntax and an important addition to the LangChain toolkit. It has several advantages that make working with LangChain chains and agents easier and more convenient.
- LCEL provides asynchronous, batch and stream processing support so that code can be quickly ported across different servers.
- LCEL provides fallback mechanisms for handling problems with the format of LLM output.
- LCEL can run independent LLM steps in parallel, which improves efficiency.
- LCEL has built-in logging, which helps in understanding how chains and agents run, even as they become complex.
Usage examples:
chain = prompt | model | output_parser
In the code above we use LCEL to piece together the different components into a chain where user input is passed to the prompt template, then the prompt template output is passed to the model, and then the model output is passed to the output parser. The notation of | is similar to the Unix pipe operator, which links different components together, using the output of one component as the input of the next component.
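The pipe behaviour described above can be reproduced in a few lines of plain Python by overloading the `|` operator. This is a sketch of the idea only, not LangChain's implementation. The `Step` class is hypothetical and the "model" is a placeholder function rather than a real LLM.

```python
from typing import Any, Callable, List


class Step:
    """Hypothetical wrapper that lets plain functions be chained with |."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # The output of self becomes the input of other.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def batch(self, values: List[Any]) -> List[Any]:
        """Run the same chain over several inputs, one after another."""
        return [self.invoke(v) for v in values]


prompt = Step(lambda d: f"Translate into English: {d['text']}")
model = Step(lambda p: {"content": p.upper()})   # placeholder, not a real LLM
output_parser = Step(lambda message: message["content"])

chain = prompt | model | output_parser
print(chain.invoke({"text": "hello"}))
```

Because every step exposes the same `invoke` method, any step can be swapped out without touching the rest of the chain, which is the property the pipe syntax relies on.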
4.1.2 Use LangChain to call Baidu Wenxinyiyan
We can also call the Baidu Wenxin large model through the LangChain framework to integrate the Wenxin model into our application framework.
res = llm("Hello, please introduce yourself!")
[ERROR][2025-03-05 19:41:13.835] base.py:134 [t:8258539328]: http request url https://qianfan.baidubce.com/wenxinworkshop/service/list failed with http status code 403
error code from baidu: IamSignatureInvalid
error message from baidu: IamSignatureInvalid, cause: Could not find credential.
request headers: {'User-Agent': 'python-requests/2.32.3', 'Accept-Encoding': 'gzip, deflate, zstd', 'Accept': '/', 'Connection': 'keep-alive', 'Content-Type': 'application/json', 'Host': 'qianfan.baidubce.com', 'request-source': 'qianfan_py_sdk_v0.4.12.3', 'x-bce-date': '2025-03-05T11:41:13Z', 'Authorization': 'bce-auth-v1//2025-03-05T11:41:13Z/300/x-bce-date;host;request-source;content-type/cc383f75803c577d6486841dc228aea994102a4b70bd5ff76f27d12bdb7af133', 'Content-Length': '2'}
request body: '{}'
response headers: {'Content-Length': '0', 'Date': 'Wed, 05 Mar 2025 11:41:13 GMT', 'X-Bce-Error-Code': 'IamSignatureInvalid', 'X-Bce-Error-Message': 'IamSignatureInvalid, cause: Could not find credential.', 'X-Bce-Exception-Point': 'Gateway', 'X-Bce-Gateway-Region': 'BJ', 'X-Bce-Request-Id': '17059251-201a-4a1f-8fbc-220df83ff184', 'Content-Type': 'text/plain; charset=utf-8'}
response body: b''
[WARNING][2025-03-05 19:41:13.835] base.py:1083 [t:8258539328]: fetch_supported_models failed: http request url https://qianfan.baidubce.com/wenxinworkshop/service/list failed with http status code 403
error code from baidu: IamSignatureInvalid
error message from baidu: IamSignatureInvalid, cause: Could not find credential.
request headers: {'User-Agent': 'python-requests/2.32.3', 'Accept-Encoding': 'gzip, deflate, zstd', 'Accept': '/', 'Connection': 'keep-alive', 'Content-Type': 'application/json', 'Host': 'qianfan.baidubce.com', 'request-source': 'qianfan_py_sdk_v0.4.12.3', 'x-bce-date': '2025-03-05T11:41:13Z', 'Authorization': 'bce-auth-v1//2025-03-05T11:41:13Z/300/x-bce-date;host;request-source;content-type/cc383f75803c577d6486841dc228aea994102a4b70bd5ff76f27d12bdb7af133', 'Content-Length': '2'}
request body: '{}'
response headers: {'Content-Length': '0', 'Date': 'Wed, 05 Mar 2025 11:41:13 GMT', 'X-Bce-Error-Code': 'IamSignatureInvalid', 'X-Bce-Error-Message': 'IamSignatureInvalid, cause: Could not find credential.', 'X-Bce-Exception-Point': 'Gateway', 'X-Bce-Gateway-Region': 'BJ', 'X-Bce-Request-Id': '17059251-201a-4a1f-8fbc-220df83ff184', 'Content-Type': 'text/plain; charset=utf-8'}
response body: b''
[INFO][2025-03-05 19:41:14.219] oauth.py:277 [t:8258539328]: trying to refresh token for ak 6hM0ZG***
[INFO][2025-03-05 19:41:14.340] oauth.py:304 [t:8258539328]: successfully refresh token
Hello! I am an artificial intelligence language model, and my name is Wen Xinyiyan. I am able to interact with people in natural language and provide a variety of information and services. If you have any questions or need help, please feel free to let me know and I will try my best to help you.
4.1.3 iFlytek Spark
We can also call the iFlytek Spark model through the LangChain framework. For more information, please refer to SparkLLM
As when calling ChatGPT, we want to store the secret keys in the .env file and load them into environment variables, which hides the key details and keeps them secure. We therefore need to configure IFLYTEK_SPARK_APP_ID, IFLYTEK_SPARK_API_KEY and IFLYTEK_SPARK_API_SECRET in the .env file and load them using the following code:
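As an illustration of this step, the sketch below reads KEY=VALUE lines from a .env file into environment variables and reports which of the three Spark variables are still missing. It uses a small hand-written parser as a stand-in for a library such as python-dotenv; `load_env_file` and `missing_keys` are hypothetical helper names, and the file location `.env` in the current directory is an assumption.

```python
import os
from pathlib import Path

REQUIRED = (
    "IFLYTEK_SPARK_APP_ID",
    "IFLYTEK_SPARK_API_KEY",
    "IFLYTEK_SPARK_API_SECRET",
)


def load_env_file(path: str) -> dict:
    """Read KEY=VALUE lines into os.environ without overriding existing values."""
    loaded = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and malformed lines
        key, value = line.split("=", 1)
        key, value = key.strip(), value.strip().strip('"').strip("'")
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


def missing_keys() -> list:
    """Names from REQUIRED that are absent or empty in the environment."""
    return [name for name in REQUIRED if not os.environ.get(name)]


if Path(".env").exists():
    load_env_file(".env")
print("missing:", missing_keys())
```

Checking for missing variables up front gives a clearer error than a failed authentication request later on.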
In addition, the spark_api_url and spark_llm_domain differ for each Spark model; refer to the API documentation to select the values for the model you want to call.
Hello, I am a cognitive intelligence model developed by iFlytek. My name is iFlytek Spark Cognitive Model. I can communicate naturally with humans, answer questions, and efficiently complete cognitive intelligence needs in various fields.
In this way, we can add the Spark large model to the LangChain architecture and call it from our application.
4.1.4 Use LangChain to call GLM
We can also call Zhipu AI's large models through the LangChain framework to connect them to our application framework. Since the ChatGLM integration provided in langchain is no longer available, we need to define a custom LLM.
If you are using the Zhipu AI GLM API, you need to download our wrapper code [zhipuai_llm.py] into the same directory as this Notebook before you can run the following code to use GLM in LangChain.
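The contents of zhipuai_llm.py are not reproduced here. To show the general shape of a user-defined LLM type, the following is a dependency-free sketch: a wrapper object that stores its parameters and forwards the prompt to a backend function. In LangChain this is typically done by subclassing the framework's LLM base class and implementing its call method. Everything below (`CustomLLM`, `echo_backend`, the model name `glm-example`) is hypothetical and does not call the Zhipu AI API.

```python
from typing import Callable, List, Optional


class CustomLLM:
    """Hypothetical wrapper showing the shape of a user-defined LLM type."""

    def __init__(self, backend: Callable[[str], str],
                 model: str = "glm-example", temperature: float = 0.1):
        self.backend = backend          # function that actually reaches the model
        self.model = model              # placeholder name, not a real model id
        self.temperature = temperature

    @property
    def _llm_type(self) -> str:
        return "custom-sketch"

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        text = self.backend(prompt)
        # Cut the answer at the earliest stop sequence that is present.
        for token in stop or []:
            index = text.find(token)
            if index != -1:
                text = text[:index]
        return text

    def invoke(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        return self._call(prompt, stop=stop)


def echo_backend(prompt: str) -> str:
    """Stand-in for the HTTP request a real wrapper would make."""
    return f"(stub answer to: {prompt})"


llm = CustomLLM(backend=echo_backend)
print(llm.invoke("Hello, please introduce yourself!"))
```

Injecting the backend as a function keeps the wrapper testable without network access; a real wrapper would replace `echo_backend` with a call to the provider's SDK.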
According to Zhipu's official announcement, the following models will be deprecated. After deprecation, calls to these models will be automatically routed to newer models. Please update your model code to the latest version before the deprecation date to ensure a smooth transition. For more model-related information, please see Zhipu's model documentation.
AIMessage(content='Hello! I am the artificial intelligence assistant ChatGLM, which is developed based on the language model trained by ChatGLM in 2024. My task is to provide appropriate responses and support to users' questions and requests.', additional_kwargs={}, response_metadata={'time_in_seconds': 1.87}, id='run-4e509a7e-9859-4acb-9418-23245fa5b7a7-0', usage_metadata={'input_tokens': 11, 'output_tokens': 42, 'total_tokens': 53})
4.2 Build a search question and answer chain
In Chapter 3, Building the Database, we introduced how to build a vector knowledge base from our own local knowledge documents. In what follows, we use that vector database to retrieve documents relevant to a query, combine the retrieved results with the query to build a prompt, and feed it to the large model for question answering.
4.2.1 Load vector database
First, we load the vector database we built in the previous chapter. Note that you must use the same Embedding model here as when the database was built.
Load your API_KEY from environment variable
Load the vector database, which contains the Embedding of multiple documents under ../../data_base/knowledge_db
Number stored in vector library: 1004
We can test the loaded vector database. The as_retriever method turns the vector database into a retriever. We then use a question as the query for vector retrieval. The following code searches the vector database by similarity and returns the top k most similar documents.
Number of items retrieved: 3
Print the retrieved content
The 0th content retrieved: Specifically, a first version of Prompt is written first, and then gradually improved through multiple rounds of adjustments until satisfactory results are produced. For more complex applications, iterative training can be performed on multiple samples to evaluate the average performance of Prompt. After the application becomes more mature, it is necessary to conduct detailed optimization by evaluating Prompt performance on multiple sample sets. Because this requires higher computing resources.
In short, the core of Prompt engineers is to master the iterative development and optimization skills of Prompt, rather than requiring 100% perfection from the beginning. The correct way to design Prompt is to finally find a reliable and applicable Prompt form through constant adjustment and trial and error.
Readers can practice the examples given in this chapter on Jupyter Notebook, modify Prompt and observe different outputs to gain a deeper understanding of the iterative optimization process of Prompt. This will provide good practical preparation for further development of complex language model applications.
- English version
Product manual
-----------------------------------------------------
The 1st content retrieved: Chapter 1 Introduction
Welcome to the Prompt Engineering for Developers section. The content of this section is based on the "Prompt Engineering for Developer" course taught by Andrew Ng. The "Prompt Engineering for Developer" course is taught by Andrew Ng in cooperation with Isa Fulford, a member of the OpenAI technical team. Isa has developed the popular ChatGPT search plug-in and has made great contributions in teaching the application of LLM (Large Language Model) technology in products. She also co-wrote the OpenAI cookbook that teaches people to use Prompt. We hope that through studying this module, we can share with you the best practices and techniques for developing LLM applications using prompt words.
-----------------------------------------------------
The 2nd content retrieved: Chapter 2 Prompt Principles
How to use Prompt to give full play to the performance of LLM? First of all, we need to know the principles of designing Prompt. They are the basic concepts that every developer must know when designing Prompt. This chapter discusses two key principles for designing effective prompts: writing clear, specific instructions and giving the model enough time to think. Mastering these two points is particularly important for creating reliable language model interactions.
First, Prompt needs to clearly express the requirements and provide sufficient context so that the language model accurately understands our intentions, just like explaining the human world to an alien in detail. Too simple Prompt often makes it difficult for the model to grasp the specific tasks to be completed.
Secondly, it is also critical to allow the language model enough time to reason. Just like when humans solve problems, hasty conclusions often lead to mistakes. Therefore, Prompt should add the requirement of step-by-step reasoning and allow sufficient thinking time for the model, so that the generated results will be more accurate and reliable.
If Prompt is optimized on both points, the language model can maximize its potential and complete complex reasoning and generation tasks. Mastering these Prompt design principles is an important step for developers to successfully apply language models.
-
Principle 1: Write clear and specific instructions
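The similarity search performed by the retriever can be illustrated with a tiny in-memory example. This is a sketch under stated assumptions: it uses cosine similarity, whereas the vector database used in this chapter may apply a different distance metric, and the two-dimensional vectors are toy values rather than real embeddings. `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float],
          store: List[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the k texts whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy 2-dimensional "embeddings"; real ones come from an embedding model.
store = [
    ("prompt engineering", [0.9, 0.1]),
    ("pumpkin book", [0.1, 0.9]),
    ("prompt principles", [0.8, 0.3]),
    ("unrelated", [-1.0, 0.0]),
]
print(top_k([1.0, 0.0], store, k=3))
```

This also shows why the same Embedding model must be used for building and querying: similarity scores are only meaningful when both vectors live in the same space.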
4.2.2 Create retrieval chain
We can use LangChain's LCEL (LangChain Expression Language) to build the workflow. LCEL supports asynchronous (ainvoke), streaming (stream), batch (batch) and other execution modes, and integrates with LangSmith for tracing.
Next we define a simple retrieval chain using the retriever just defined.
'Preface\n "Machine Learning" (Xigua Book) by Teacher Zhou Zhihua is one of the classic introductory textbooks in the field of machine learning. In order to enable as many readers as possible\n to understand machine learning through Xigua Book, Teacher Zhou Therefore, the derivation details of some formulas are not detailed in the book, but this may be "unfriendly" to readers who want to delve into the details of formula derivation\n. This book aims to analyze the more difficult to understand formulas in the Xigua book and add specific derivation details to some formulas. "\nAfter reading this, you may wonder why the previous paragraph is here. We added quotation marks because this was just our initial reverie. Later we learned that the real reason why Teacher Zhou omitted these derivation details was that he believed that "sophomore students with a solid foundation in science and engineering mathematics should have no difficulty with the derivation details in Xigua Shu\n. The key points are all in the book, and the omitted details should be able to make up for them in their heads or do exercises." So... this pumpkin book can only be regarded as the notes that I\nother math bastards took down when they were studying on their own. I hope it can help everyone become a qualified "sophomore student with a solid foundation in mathematics in science and engineering." \nInstructions for use\n• All contents of the Pumpkin Book are expressed based on the content of the Watermelon Book as pre-knowledge, so the best way to use the Pumpkin Book is to use the Watermelon Book\n as the main line. When you encounter formulas that you cannot derive or understand, please refer to the Pumpkin Book;\n• For beginners who are new to machine learning, it is strongly not recommended to go into the formulas in Chapters 1 and 2 of the Watermelon Book. 
Just go through it briefly and wait until you learn the latest version of PDF\n\n Access address: https://github.com/datawhalechina/pumpkin-book/releases\n编委会\n主编:Sm1les、archwalker、jbb0523\n编委:juxiao、Majingmin、MrBigFan、shanry、Ye980226\n封面设计:构思-Sm1les、创作-林王茂盛\n致谢\n特别感谢awyd234、feijuan、Ggmatch、Heitao5200、huaqing89、LongJH、LilRachel、LeoLRH、Nono17、\nspareribs、sunchaothu、StevenLzq Contributions to the Pumpkin Book from its earliest days. \nScan the QR code below and reply with the keyword "Pumpkin Book" to join the "Pumpkin Book Readers Exchange Group"\nCopyright Statement\nThis work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. \n\n• For newcomers to machine learning, it is strongly not recommended to delve into the formulas in Chapters 1 and 2 of Xigua Book. Just go through it briefly. You can come back to it when you have learned it\n;\n• We will strive (zhi) for the analysis and derivation of each formula (neng) It is explained from the perspective of the basics of undergraduate mathematics, so super-curricular mathematical knowledge\nWe usually give it in the form of appendices and references. Interested students can continue to study in depth along the information we provide;\n• If there is no formula you want to check in the Pumpkin Book, or you find an error somewhere in the Pumpkin Book, please do not hesitate to go to our GitHub\nIssues (Address: https://github.com/datawhalechina/pumpkin-book/issues)进行反馈,在对应版块\n提交你希望补充的公式编号或者勘误信息,我们通常会在24 We will reply to you within 24 hours, more than 24 hours) If there is no reply within an hour\nYou can contact us via WeChat (WeChat ID: at-Sm1les);\nSupporting video tutorial: https://www.bilibili.com/video/BV1Mh411e7VU\n在线阅读地址:https://datawhalechina.github.io/pumpkin-book(仅供第1 version)'
LCEL requires every component to be of type Runnable; the ChatModel, PromptTemplate and similar classes we saw earlier all inherit from the Runnable class. The retrieval_chain above is composed of the retriever retriever and the combiner combiner, joined with the | symbol. Data is passed from left to right: the question first goes through retriever to obtain the search results, which combiner then processes further and outputs.
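The retriever-then-combiner flow can be sketched without any dependencies. Below, a placeholder retriever returns a fixed list of documents and the combiner joins their text. `Doc` and `fake_retriever` are hypothetical stand-ins, and the blank-line separator is an assumption consistent with the output shown above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def combine_docs(docs: List[Doc]) -> str:
    """Join the text of all retrieved documents, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


def fake_retriever(question: str) -> List[Doc]:
    """Placeholder retriever that ignores the question."""
    return [Doc("Preface ..."), Doc("Instructions for use ...")]


def retrieval_chain(question: str) -> str:
    # Same left-to-right flow as retriever | combiner.
    return combine_docs(fake_retriever(question))


print(retrieval_chain("What is the Pumpkin Book?"))
```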
4.2.3 Create LLM
Here, we call OpenAI’s API to create an LLM. Of course, you can also use other LLM’s APIs to create it.
'sure! I am an artificial intelligence assistant developed by OpenAI called ChatGPT. I specialize in processing and generating natural language text to help answer questions, provide information, assist with problem solving, and conduct a variety of conversations. I have no personal experiences or emotions, but I will try to provide accurate and helpful answers. If you have any questions or need help, feel free to ask me! '
4.2.4 Constructing a search question and answer chain
To build this chain, we use the retrieval chain just defined as a sub-chain that supplies the context of the prompt, and use RunnablePassthrough to pass the user's question through as the input of the prompt. Because these two operations are independent of each other, we use RunnableParallel to run them in parallel.
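The fan-out step can be illustrated with the standard library: the same question is sent to two branches, one retrieving context and one passing the question through unchanged, and the results are collected into a dictionary that fills the prompt. This is a sketch of the pattern, not LangChain's implementation. `run_parallel`, `retrieve_context` and the template text are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_parallel(branches: Dict[str, Callable[[str], str]],
                 question: str) -> Dict[str, str]:
    """Feed the same input to every branch and collect the results by key."""
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(func, question) for key, func in branches.items()}
        return {key: future.result() for key, future in futures.items()}


def retrieve_context(question: str) -> str:
    """Stub for the retrieval sub-chain."""
    return "Pumpkin Book explains formulas from the Watermelon Book."


def passthrough(question: str) -> str:
    """Return the question unchanged."""
    return question


TEMPLATE = "Context:\n{context}\n\nQuestion: {input}"

inputs = run_parallel(
    {"context": retrieve_context, "input": passthrough},
    "What is the Pumpkin Book?",
)
print(TEMPLATE.format(**inputs))
```

The keys of the branch dictionary must match the placeholder names in the prompt, which is why the two branches are named "context" and "input".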
Search question and answer chain effect test
The result of answering question_1 after large model + knowledge base: Pumpkin Book is a book that analyzes the more difficult to understand formulas in "Machine Learning" (Watermelon Book) and supplements the derivation details. It aims to help readers better understand the mathematical derivation in machine learning. It uses the content of the Xigua Book as pre-knowledge and is suitable for reference when encountering derivation difficulties. The goal of the Pumpkin Book is to help readers become "sophomore students with a solid foundation in science, engineering, and mathematics." Thank you for your question!
The result of answering question_2 after large model + knowledge base: The "Prompt Engineering for Developer" course is taught by Mr. Andrew Ng in collaboration with Isa Fulford, a member of the OpenAI technical team. Thank you for your question!
The effect of the large model answering by itself
'The Pumpkin Book usually refers to the book "Deep Learning: Algorithms and Implementation" because the cover of the book is orange and looks like a pumpkin. This book, written by Li Mu, Aston Zhang, Zachary C. Lipton, and Alexander J. Smola, mainly introduces the basic knowledge and practical methods of deep learning. The book covers the basic concepts, commonly used models and algorithms of deep learning, and provides a large number of code examples to help readers understand and implement deep learning technology. Due to its detailed and practical content, the Pumpkin Book is widely popular among learners in the field of deep learning. '
'Prompt Engineering for Developers' is a book co-authored by Isa Fulford and Andrew Ng. This book aims to help developers better understand and apply prompt engineering technology to improve the efficiency and effectiveness of interacting with large language models (such as GPT-3). '
⭐ The two questions above show that the LLM does not answer well when asked about recent knowledge or about specialized questions outside common knowledge. Supplying our local knowledge helps the LLM give better answers. It also helps alleviate the "hallucination" problem of large models.
4.2.5 Add chat records to the search chain
So far, we have uploaded local knowledge documents, saved them to the vector knowledge base, and combined the query with the results recalled from the knowledge base as input to the LLM, obtaining much better results than letting the LLM answer directly. When interacting with language models, however, you may have noticed a key problem: they do not remember your previous exchanges. This is a big challenge when building applications such as chatbots, because the conversation seems to lack real continuity. How can we solve this problem?
Transfer chat history
In this section we use LangChain's ChatPromptTemplate to embed previous conversation turns into the prompt, giving the language model the ability to carry on a continuous conversation. ChatPromptTemplate can receive the chat message history, which is passed to the chatbot along with the question and added to the context.
You are an assistant on a question and answer task. Please answer this question using the retrieved context fragment. If you don't know the answer just say no. Please use concise words to answer users.
What is a pumpkin book?
You are an assistant on a question and answer task. Please answer this question using the retrieved context fragment. If you don't know the answer just say no. Please use concise words to answer users.
What is the Watermelon Book? The Xigua Book refers to the book "Machine Learning" by teacher Zhou Zhihua, which is one of the classic introductory textbooks in the field of machine learning. Can you introduce him?
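The two printed prompts above differ only in how much history precedes the latest question. A minimal standard-library sketch of that assembly step follows. `build_messages` is a hypothetical helper, and the (role, content) tuple format is an assumption used for illustration.

```python
from typing import List, Tuple

SYSTEM_PROMPT = (
    "You are an assistant on a question and answer task. "
    "Please answer this question using the retrieved context fragment. "
    "If you don't know the answer just say no. "
    "Please use concise words to answer users."
)


def build_messages(history: List[Tuple[str, str]],
                   question: str) -> List[Tuple[str, str]]:
    """System prompt first, then earlier turns in order, then the new question."""
    return [("system", SYSTEM_PROMPT), *history, ("human", question)]


history = [
    ("human", "What is the Watermelon Book?"),
    ("ai", "The Xigua Book refers to the book \"Machine Learning\" by teacher Zhou Zhihua."),
]
for role, content in build_messages(history, "Can you introduce him?"):
    print(f"{role}: {content}")
```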
4.2.6 Retrieval chain with information compression
Because the Q&A chain we are building supports multi-turn dialogue, it faces a problem that a single-turn chain does not, as the output above shows: the user's latest message may be semantically incomplete, so querying the vector database with it fails to retrieve relevant information. For example, "Can you introduce him?" above actually means "Can you introduce Mr. Zhou Zhihua?" To solve this problem, we use information compression: we let the LLM rewrite the user's question based on the chat history.
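The compression step can be sketched as a function that rewrites the latest question only when there is history to resolve it against. The model call is replaced by a stub here, so the rewritten question is hard-coded; `condense_question`, `stub_llm` and the instruction wording are hypothetical and only illustrate the control flow.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]

CONDENSE_INSTRUCTION = (
    "Given the chat history, rewrite the latest question so that it can be "
    "understood without the history. Do not answer it."
)


def condense_question(llm: Callable[[str], str],
                      history: History, question: str) -> str:
    """Return the question unchanged on the first turn, else a rewritten one."""
    if not history:
        return question  # nothing to resolve, so skip the extra LLM call
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    return llm(
        f"{CONDENSE_INSTRUCTION}\n\n{transcript}\n\nLatest question: {question}"
    )


def stub_llm(prompt: str) -> str:
    """Placeholder for a real model call; returns a fixed rewrite."""
    return "Can you introduce Mr. Zhou Zhihua?"


history = [
    ("human", "What is the Watermelon Book?"),
    ("ai", "It is \"Machine Learning\" by teacher Zhou Zhihua."),
]
print(condense_question(stub_llm, history, "Can you introduce him?"))
print(condense_question(stub_llm, [], "What is the Watermelon Book?"))
```

The rewritten question, not the original one, is then used to query the vector database.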
A retrieval Q&A chain that supports chat history
Here we use the Q&A template qa_prompt defined earlier to build the Q&A chain. We use RunnablePassthrough.assign to save the intermediate retrieval results as "context" and the final result as "answer". Because the retrieval results are stored under "context", the function that merges them, combine_docs, must be changed accordingly.
Test the retrieval Q&A chain
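The data flow described here can be imitated with plain dictionaries. The sketch below is not LangChain code: assign is a hypothetical helper that mimics the idea of adding computed keys to the running state, Doc is a stand-in for a retrieved document, and fake_retriever and fake_answer replace the retriever and the LLM call.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for a retrieved document with a page_content field."""
    page_content: str


def assign(state, **steps):
    """Return a copy of state with one new key per step, computed from state."""
    new_state = dict(state)
    for key, step in steps.items():
        new_state[key] = step(state)
    return new_state


def combine_docs(state):
    """Join the documents stored under "context" into one string."""
    return "\n\n".join(doc.page_content for doc in state["context"])


def fake_retriever(state):
    return [
        Doc("Watermelon Book: Machine Learning."),
        Doc("Pumpkin Book: formula notes."),
    ]


def fake_answer(state):
    # A real chain would format the prompt and call the LLM here.
    return f"Answer based on {len(combine_docs(state))} characters of context."


state = {"input": "What is the Watermelon Book?", "chat_history": []}
state = assign(state, context=fake_retriever)   # intermediate result
state = assign(state, answer=fake_answer)       # final result
print(sorted(state))  # → ['answer', 'chat_history', 'context', 'input']
```

The key point is that combine_docs now reads the documents from state["context"] instead of receiving the document list directly, which is why it has to change once the retrieval results are stored under that key.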
{'input': 'What is the Watermelon Book? ', 'chat_history': [], 'context': [Document(metadata={'author': '', 'creationDate': "D:20230303170709-00'00'", 'creator': 'LaTeX with hyperref', 'file_path': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'format': 'PDF 1.5', 'keywords': '', 'modDate': '', 'page': 1, 'producer': 'xdvipdfmx (20200315)', 'source': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'subject': '', 'title': '', 'total_pages': 196, 'trapped': ''}, page_content='Foreword\n"Teacher Zhou Zhihua's "Machine Learning" (Xigua Book) is one of the classic introductory textbooks in the field of machine learning. In order to enable as many readers as possible\n to understand machine learning through Xigua Book, Teacher Zhou Therefore, the derivation details of some formulas are not detailed in the book, but this may be "unfriendly" to readers who want to delve into the details of formula derivation\n. This book aims to analyze the more difficult to understand formulas in the Xigua book and add specific derivation details to some formulas. "\nAfter reading this, you may wonder why the quotation marks are added to the previous paragraph. , because this was just our initial reverie, but later we learned that the real reason why Teacher Zhou omitted these derivation details was that he believed that "sophomore students with a solid foundation in science and engineering mathematics should have no difficulty with the derivation details in the Xigua book. The key points are all in the book, and the omitted details should be able to make up for it in their heads or practice." So... This Pumpkin Book can only be regarded as the notes that I, a math bastard, took down during my self-study. I hope it can help everyone become a qualified "sophomore student with a solid foundation in science and engineering mathematics." 
\nInstructions for use\n• All contents of the Pumpkin Book are expressed with the content of the Watermelon Book as prerequisite knowledge, so the best way to use the Pumpkin Book is to use the Watermelon Book as the main line, and refer to the Pumpkin Book when you encounter formulas that you cannot derive or understand;\n• For beginners who are new to machine learning, it is strongly not recommended to study the formulas in Chapters 1 and 2 of Xigua Book. You can simply go through it and wait until you learn it'), Document(metadata={'author': '', 'creationDate': "D:20230303170709-00'00'", 'creator': 'LaTeX with hyperref', 'file_path': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'format': 'PDF 1.5', 'keywords': '', 'modDate': '', 'page': 161, 'producer': 'xdvipdfmx (20200315)', 'source': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'subject': '', 'title': '', 'total_pages': 196, 'trapped': ''}, page_content='For the analysis of concepts such as "error", "loss" and "risk", please refer to the notes in Chapter 2, Section 2.1 of "Xigua Book"\n→→\nSupporting video tutorial: https://www.bilibili.com/video/BV1Mh411e7VU\n←←'), Document(metadata={'author': '', 'creationDate': "D:20230303170709-00'00'", 'creator': 'LaTeX with hyperref', 'file_path': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'format': 'PDF 1.5', 'keywords': '', 'modDate': '', 'page': 1, 'producer': 'xdvipdfmx (20200315)', 'source': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'subject': '', 'title': '', 'total_pages': 196, 'trapped': ''}, page_content='• For beginners who are new to machine learning, it is strongly not recommended to study the formulas in Chapters 1 and 2 of Xigua Book. Just go through them briefly. 
You can come back to them when you have learned\n a little bit;\n• We will strive (zhi) for the analysis and derivation of each formula (neng) It is explained from the perspective of the basics of undergraduate mathematics, so super-curricular mathematical knowledge\nWe usually give it in the form of appendices and references. Interested students can continue to study in depth along the information we provide;\n• If there is no formula you want to check in the Pumpkin Book, or you find an error somewhere in the Pumpkin Book, please do not hesitate to go to our GitHub\nIssues (Address: https://github.com/datawhalechina/pumpkin-book/issues)进行反馈,在对应版块\n提交你希望补充的公式编号或者勘误信息,我们通常会在24 We will reply to you within 24 hours, more than 24 hours) If there is no reply within an hour\nYou can contact us via WeChat (WeChat ID: at-Sm1les);\nSupporting video tutorial: https://www.bilibili.com/video/BV1Mh411e7VU\n在线阅读地址:https://datawhalechina.github.io/pumpkin-book(仅供第1 version)')], 'answer': 'Xigua Book refers to the book "Machine Learning" by teacher Zhou Zhihua. It is one of the classic introductory textbooks in the field of machine learning. '}
{'input': 'What does the pumpkin book have to do with it? ', 'chat_history': [('human', 'What is the Watermelon Book?'), ('ai', 'Xigua Book refers to the book "Machine Learning" by Teacher Zhou Zhihua, which is one of the classic introductory textbooks in the field of machine learning.')], 'context': [Document(metadata={'author': '', 'creationDate': "D:20230303170709-00'00'", 'creator': 'LaTeX with hyperref', 'file_path': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'format': 'PDF 1.5', 'keywords': '', 'modDate': '', 'page': 1, 'producer': 'xdvipdfmx (20200315)', 'source': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'subject': '', 'title': '', 'total_pages': 196, 'trapped': ''}, page_content='Foreword\n"Teacher Zhou Zhihua's "Machine Learning" (Xigua Book) is one of the classic introductory textbooks in the field of machine learning. In order to enable as many readers as possible\n to understand machine learning through Xigua Book, Teacher Zhou Therefore, the derivation details of some formulas are not detailed in the book, but this may be "unfriendly" to readers who want to delve into the details of formula derivation\n. This book aims to analyze the more difficult to understand formulas in the Xigua book and add specific derivation details to some formulas. "\nAfter reading this, you may wonder why the quotation marks are added to the previous paragraph. , because this was just our initial reverie, but later we learned that the real reason why Teacher Zhou omitted these derivation details was that he believed that "sophomore students with a solid foundation in science and engineering mathematics should have no difficulty with the derivation details in the Xigua book. The key points are all in the book, and the omitted details should be able to make up for it in their heads or practice." So... This Pumpkin Book can only be regarded as the notes that I, a math bastard, took down during my self-study. 
I hope it can help everyone become a qualified "sophomore student with a solid foundation in science and engineering mathematics." \nInstructions for use\n• All contents of the Pumpkin Book are expressed with the content of the Watermelon Book as prerequisite knowledge, so the best way to use the Pumpkin Book is to use the Watermelon Book as the main line, and refer to the Pumpkin Book when you encounter formulas that you cannot derive or understand;\n• For beginners who are new to machine learning, it is strongly not recommended to study the formulas in Chapters 1 and 2 of Xigua Book. You can simply go through it and wait until you learn it'), Document(metadata={'author': '', 'creationDate': "D:20230303170709-00'00'", 'creator': 'LaTeX with hyperref', 'file_path': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'format': 'PDF 1.5', 'keywords': '', 'modDate': '', 'page': 13, 'producer': 'xdvipdfmx (20200315)', 'source': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'subject': '', 'title': '', 'total_pages': 196, 'trapped': ''}, page_content='→→\nWelcome to purchase the paper version of the pumpkin book "Detailed Explanation of Machine Learning Formulas" on major e-commerce platforms\n←←\nNo. 1 Chapter\nIntroduction\nThis chapter is the beginning of the "Xigua Book". It mainly explains what machine learning is and the related mathematical symbols of machine learning, which will pave the way for the subsequent content. It does not involve complex algorithm theory, so when reading this chapter, you only need to patiently sort out all the concepts and mathematical symbols. In addition, it is recommended to read the West before reading this chapter. The "Main Symbol Table" on the front page of the "Water Melon Book" can answer most of the doubts about mathematical symbols that arise during the reading of "The Water Melon Book". \n This chapter is also the beginning of this book. 
The author will elaborate on the original intention of writing this book. This book aims to accompany readers to read "Water Melon Book" from the perspective of an "experienced person" and try to help readers eliminate reading problems. As long as the reader has studied "Advanced Mathematics", "Linear Algebra\n" and "Probability Theory and Mathematical Statistics", which are three compulsory mathematics courses in universities, they can understand the explanation and derivation of the formulas in Xigua's book. At the same time, they can also appreciate the "beauty of mathematics" produced by the collision of these three mathematics courses in machine learning. .\n1.1\nIntroduction\nThis section focuses on conceptual understanding. Here is a supplementary explanation of "algorithm" and "model". "Algorithm" refers to the specific method of learning "model" from data, such as linear regression, logarithmic probability regression, decision tree, etc. that will be described in subsequent chapters. '), Document(metadata={'author': '', 'creationDate': "D:20230303170709-00'00'", 'creator': 'LaTeX with hyperref', 'file_path': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'format': 'PDF 1.5', 'keywords': '', 'modDate': '', 'page': 1, 'producer': 'xdvipdfmx (20200315)', 'source': '../../data_base/knowledge_db/pumkin_book/pumpkin_book.pdf', 'subject': '', 'title': '', 'total_pages': 196, 'trapped': ''}, page_content='• For beginners who are new to machine learning, it is strongly not recommended to study the formulas in Chapters 1 and 2 of Xigua Book. Just go through them briefly. You can come back to them when you have learned\n a little bit;\n• We will strive (zhi) for the analysis and derivation of each formula (neng) It is explained from the perspective of the basics of undergraduate mathematics, so super-curricular mathematical knowledge\nWe usually give it in the form of appendices and references. 
Interested students can continue to study in depth along the information we provide;\n• If there is no formula you want to check in the Pumpkin Book, or you find an error somewhere in the Pumpkin Book, please do not hesitate to go to our GitHub\nIssues (Address: https://github.com/datawhalechina/pumpkin-book/issues)进行反馈,在对应版块\n提交你希望补充的公式编号或者勘误信息,我们通常会在24 We will reply to you within 24 hours, more than 24 hours) If there is no reply within an hour\nYou can contact us via WeChat (WeChat ID: at-Sm1les);\nSupporting video tutorial: https://www.bilibili.com/video/BV1Mh411e7VU\n在线阅读地址:https://datawhalechina.github.io/pumpkin-book(仅供第1 version)')], 'answer': 'The Pumpkin Book is a book that provides detailed analysis and derivation of formulas that are difficult to understand in the Watermelon Book. It uses the content of Xigua Book as pre-knowledge to help readers better understand and learn the content of Xigua Book. '}
As we can see, the LLM correctly works out what "it" refers to, which means the chat history was successfully passed to it. The recalled content also contains the answer to the question, showing that the information compression strategy works as well. This ability to relate the current question to earlier ones, and to compress the query before retrieval, can greatly improve the continuity and intelligence of the Q&A system.
4.3 Deploy Knowledge Base Assistant
Now that we have a basic understanding of knowledge bases and LLMs, it is time to combine them and build a visual interface. Such an interface is not only easier to operate, but also easier to share with others.
Streamlit is a fast and convenient way to demonstrate machine learning models directly in Python through a friendly web interface. In this course, we will learn how to use it to build user interfaces for generative AI applications. After building a machine learning model, you may want a demo to show to others, whether to collect feedback and drive improvements or simply because you think the system is cool. Streamlit lets you build such a demo quickly in Python, without writing any front-end, web, or JavaScript code.
4.3.1 Introduction to Streamlit
Streamlit is an open source Python library for quickly creating data applications. It is designed to let data scientists turn data analysis and machine learning models into interactive web applications without in-depth knowledge of web development. Unlike regular web frameworks such as Flask or Django, it does not require you to write any client code (HTML/CSS/JS). You only need to write ordinary Python modules to create an attractive, highly interactive interface in a short time and present data analysis or machine learning results quickly. On the other hand, unlike drag-and-drop tools, you still have complete control over the code.
Streamlit provides a simple yet powerful set of basic modules for building data applications:
- st.write(): one of the most basic modules, used to render text, images, tables, and more in the application.
- st.title(), st.header(), st.subheader(): used to add titles, headers, and subheaders that organize the layout of the application.
- st.text(), st.markdown(): used to add text content; st.markdown() supports Markdown syntax.
- st.image(): used to add images to the application.
- st.dataframe(): used to render a Pandas DataFrame.
- st.table(): used to render simple data tables.
- st.pyplot(), st.altair_chart(), st.plotly_chart(): used to render charts drawn with Matplotlib, Altair, or Plotly.
- st.selectbox(), st.multiselect(), st.slider(), st.text_input(): used to add interactive widgets that let users select options, drag a slider, or enter text.
- st.button(), st.checkbox(), st.radio(): used to add buttons, checkboxes, and radio buttons that trigger specific actions.
These basic modules make it easy to build interactive data applications with Streamlit, and they can be combined and customized as needed. For more information, see the official documentation.
4.3.2 Building the application
First, create a new Python file and save it as streamlit_app.py in the root of your working directory.
- Import the necessary Python libraries.
- Define the get_retriever function, which returns a retriever.
- Define the combine_docs function, which processes the text returned by the retriever.
- Define the get_qa_history_chain function, which returns a retrieval Q&A chain.
- Define the gen_response function, which accepts the retrieval Q&A chain, the user input, and the chat history, and returns the chain output as a stream.
- Define the main function, which sets up the display and the application logic.
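The streaming step in the list above can be sketched without LangChain or Streamlit. This is only an illustration under assumptions: FakeChain is a hypothetical stand-in whose stream() method yields partial result dictionaries, and the sketch assumes the answer arrives in pieces under an "answer" key while other keys such as "input" and "context" arrive in separate chunks.

```python
class FakeChain:
    """Stand-in for a chain whose stream() yields partial result dicts."""

    def stream(self, inputs):
        yield {"input": inputs["input"]}
        yield {"context": ["retrieved document"]}
        for piece in ["The Pumpkin Book ", "explains the formulas."]:
            yield {"answer": piece}


def gen_response(chain, user_input, chat_history):
    """Yield only the answer fragments from the chain's streamed output."""
    stream = chain.stream({"input": user_input, "chat_history": chat_history})
    for chunk in stream:
        if "answer" in chunk:
            yield chunk["answer"]


print("".join(gen_response(FakeChain(), "What is the Pumpkin Book?", [])))
```

Because gen_response is a generator, the interface can display each fragment as soon as it arrives instead of waiting for the whole answer; in Streamlit, a generator like this could be handed to st.write_stream.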
4.3.3 Deploy the application
Run locally: streamlit run "notebook/C4 构建 RAG 应用/streamlit_app.py"
Remote deployment: To deploy your application to Streamlit Community Cloud, follow these steps:
- Create a GitHub repository for the application. Your repository should contain two files:
  your-repository/
  ├── streamlit_app.py
  └── requirements.txt
- Go to Streamlit Community Cloud, click the New app button in your workspace, and specify the repository, branch, and main file path. Optionally, you can customize your application's URL by choosing a custom subdomain.
- Click the Deploy! button.
Your application will now be deployed to the Streamlit Community Cloud and accessible from anywhere in the world! 🌎
Our project deployment is basically complete at this point. It has been simplified for the convenience of demonstration, and there are still many places that can be further optimized. We look forward to learners making all kinds of creative modifications!
Optimization directions:
- Add the ability to upload local documents and build a vector database from the interface
- Add buttons for choosing among multiple LLMs and embedding methods
- Add a button for modifying parameters
- And more...

