From the course: Hands-On AI: Building LLM-Powered Apps
Retrieval augmented generation
- In the previous chapter, we built a simplified ChatGPT application using LangChain and Chainlit. In this chapter, we will try to bring knowledge into our chat with PDF application via PDF documents. We mentioned in the previous video that a large language model tends to hallucinate, and that we can fix that by putting information in the input context, but the context is not infinite and cannot fit all of the information out there. The solution to this problem is to augment the large language model with knowledge relevant to the question. This architecture pattern is called Retrieval Augmented Generation, or RAG. What Retrieval Augmented Generation does is separate our application into two portions. On one hand, we have the large language model, and on the other, we have a search engine. The large language model is responsible for reasoning and generating the answers. On the other side, we rely on the search engine to surface the most relevant documents to send into the context for the large language model. So when a user asks a question to our chat with PDF application, our application will first pass the question to the search engine. The search engine then retrieves the most relevant documents and sends them back to the application. Our application includes those relevant documents inside the prompt to the large language model, and the large language model responds to the user's question with the relevant information, supported by sources from our search engine. This completes a retrieval augmented generation process. In summary, the RAG architecture uses the large language model to conduct reasoning and generation, and gets the factual context from the search engine. The RAG architecture is a very good concept, but it also has some limitations: there is no guarantee that the generated sentences will be supported by the citations, nor is there a guarantee that all retrieved citations can and will be used in the generation process. In summary, we enhance the capabilities of our application by using a Retrieval Augmented Generation architecture, or RAG architecture. It first retrieves the relevant documents and provides those documents to the large language model. When we ask the model to answer the question using the context, and the context only, this grounds the model's answers. Now, since we brought up a search engine, we will go into a brief introduction on what a search engine is.
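To make the retrieve-then-generate flow concrete, here is a minimal sketch of the RAG loop described above. The `SearchEngine` class, `call_llm` function, and the keyword-overlap scoring are hypothetical placeholders, not the course's implementation (the course builds the real components with LangChain, Chainlit, and embeddings in later videos); only the overall structure of passing the question to a search engine, putting the retrieved documents into the prompt, and asking the model to answer from that context follows the transcript.

```python
from dataclasses import dataclass


@dataclass
class Document:
    source: str
    text: str


class SearchEngine:
    """Hypothetical search engine that surfaces the most relevant documents."""

    def __init__(self, documents: list[Document]):
        self.documents = documents

    def retrieve(self, question: str, k: int = 3) -> list[Document]:
        # Naive keyword-overlap ranking as a stand-in; a real app would use
        # embedding search (covered later in this chapter).
        def score(doc: Document) -> int:
            return len(set(question.lower().split()) & set(doc.text.lower().split()))

        return sorted(self.documents, key=score, reverse=True)[:k]


def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in your model client of choice."""
    raise NotImplementedError


def answer_with_rag(question: str, engine: SearchEngine) -> str:
    # 1. Pass the user's question to the search engine.
    docs = engine.retrieve(question)
    # 2. Include the retrieved documents inside the prompt as context,
    #    keeping the sources so the answer can cite them.
    context = "\n\n".join(f"[{d.source}]\n{d.text}" for d in docs)
    prompt = (
        "Answer the question using the context below, and the context only.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    # 3. The large language model reasons over the context and generates
    #    a grounded answer.
    return call_llm(prompt)
```

Note that, as the transcript points out, this pattern does not by itself guarantee the generated answer is supported by the retrieved citations; the grounding depends on the model actually following the "context only" instruction in the prompt.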
Contents
- Retrieval augmented generation (3m 30s)
- Search engine basics (2m 32s)
- Embedding search (3m)
- Embedding model limitations (3m 15s)
- Challenge: Enabling load PDF to Chainlit app (48s)
- Solution: Enabling load PDF to Chainlit app (5m 4s)
- Challenge: Indexing documents into a vector database (1m 50s)
- Solution: Indexing documents into a vector database (1m 43s)
- Challenge: Putting it all together (1m 10s)
- Solution: Putting it all together (3m 17s)
- Trying out your chat with the PDF app (2m 15s)