From the course: Hands-On AI: Building LLM-Powered Apps
Solution: Fixing hallucination via prompting - Python Tutorial
- [Instructor] Welcome back. I trust you had a very fun lab session iterating on the prompts with all the possibilities and all the knobs that we can tune. There are many, many ways to resolve this problem, and here, I will demonstrate my solution. Your solution might be different, but as long as it produces the intended results, it's awesome. So let's start by uploading the document again and asking the question we want to fix: What is the operating margin? And let's navigate down to the prompt playground. Click on this little bug icon, and we are in the prompt playground.
So the first thing we will do is investigate what the current prompt is. The current prompt first states the action it wants performed by saying, "Given the extracted parts of a long document and a question, create a final answer with references. If you don't know the answer, just say that you don't know. Don't try to make up an answer." So this is the action, the A, in the RACEF Framework. And it says, "Always return a SOURCES part in your answer." So this is the format, the F, of RACEF. Then it goes on to ask a question with some context. The first question is: Which state or country's law governs the interpretation of the contract? And it goes on to provide the right information. And then the second question is: What did the president say about Michael Jackson? And it uses a certain script from somewhere else and then provides an example answer. Because it provides two examples of how to answer questions, this is called two-shot or few-shot learning. And then we can see our actual question: What is the operating margin? And then the extracted, relevant parts of the document that we provided to the prompt.
So now, with that, we can start editing, or prompt engineering. First, I feel that these two examples are not very relevant to the financial documents that we are looking at. Specifically, they are closer to prose than to semi-structured financial documents.
So let's remove these two examples. Let's go back here, select, and remove the examples. Then, since we are providing a sources field, let's make it even more explicit: "Always return a SOURCES part in your answer with the format SOURCES: source1, source2, et cetera." And while we are at it, let's make sure to give the model a role to play. The role we want to assign is a financial analyst, since we're dealing with a financial document or financial press release. So let's do this: "Please act as an expert financial analyst when you answer the questions." And because we are dealing with a financial document, let's add, "And pay attention to the financial statements." One of the reasons the model could not work out the operating margin might be that it doesn't know the definition of operating margin. So let's spell out the steps it needs to take to calculate it: operating margin is also known as op margin and is calculated by dividing operating income by revenue.
So now, we have changed the prompt by assigning the model a role, providing the steps it needs to calculate the operating margin, removing irrelevant examples, and making the sources format more explicit, all following our RACEF Framework. Let's give this a spin and see how the model behaves now. Now the answer is clear: the operating margin is a measure of a company's profitability, et cetera, et cetera. And it goes on to get the operating income, then the revenue, and calculates the operating margin to be 50.3%. So this is a success.
Now, let's take our newly formatted prompt. Right-click, Copy, and put it in our prompt template. So we will put it in the template here. And then we also remove the examples. Then we can save the prompt, and the application is going to hot reload. We can go back to our chatbot, browse for the file, and upload the same file again. It's going to take a while to load.
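As a rough written companion to the edits described above, here is a minimal sketch of what the revised prompt could look like as a plain Python string template. The exact wording and ordering of the instructor's final prompt are not shown in the transcript, so the layout, the placeholder names `question` and `summaries`, and the helper functions `build_prompt` and `operating_margin` are all assumptions made for illustration.

```python
# Sketch only: the placeholder names and section layout are assumptions,
# not the course's actual template file.
PROMPT_TEMPLATE = """Please act as an expert financial analyst when you answer the questions and pay attention to the financial statements.
Operating margin is also known as op margin and is calculated by dividing operating income by revenue.

Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES").
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer with the format "SOURCES: source1, source2, ...".

QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER:"""


def build_prompt(question: str, summaries: str) -> str:
    """Fill the template with the user's question and the retrieved text."""
    return PROMPT_TEMPLATE.format(question=question, summaries=summaries)


def operating_margin(operating_income: float, revenue: float) -> float:
    """Operating margin as a fraction: operating income divided by revenue."""
    if revenue == 0:
        raise ValueError("revenue must be non-zero")
    return operating_income / revenue


if __name__ == "__main__":
    print(build_prompt("What is the operating margin?", "<retrieved chunks go here>"))
    # Illustrative numbers only, not taken from the document in the video.
    print(f"{operating_margin(50.0, 200.0):.1%}")
```

The small `operating_margin` helper is only there to make the definition in the prompt concrete; in the app itself, the model performs this division from the figures it finds in the retrieved text.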
And ask the question that we had a problem with before: What is the operating margin? And fingers crossed. 50.3%. So with this, we just used prompt engineering, applied our RACEF Framework, and fixed a problem the model couldn't solve earlier. In the future, we might run into more problems. So we'll repeat the same process and collect a set of questions and expected outputs for our testing dataset. So now, our Chat with PDF is robust enough to answer our questions.
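The idea of collecting questions and expected outputs into a testing dataset could be sketched as follows. This is a hypothetical illustration, not the course's code: `TEST_CASES`, `run_regression`, and the `answer_fn` callable (a stand-in for whatever function sends a question to the chatbot and returns its answer) are assumed names, and the check is a simple substring match.

```python
from typing import Callable

# Each case pairs a question with a string the answer is expected to contain.
# The first entry comes from the walkthrough; add more as problems are found.
TEST_CASES = [
    {"question": "What is the operating margin?", "expected": "50.3%"},
]


def run_regression(answer_fn: Callable[[str], str], cases: list[dict]) -> list[dict]:
    """Run every case through answer_fn and return the cases that failed."""
    failures = []
    for case in cases:
        answer = answer_fn(case["question"])
        if case["expected"] not in answer:
            # Keep the actual answer alongside the case for later inspection.
            failures.append({**case, "answer": answer})
    return failures


if __name__ == "__main__":
    # Stand-in for the real chatbot call, used here only to show the flow.
    def fake_chatbot(question: str) -> str:
        return "The operating margin is 50.3%. SOURCES: source1"

    print(run_regression(fake_chatbot, TEST_CASES))
```

Rerunning a check like this after each prompt change gives a quick signal on whether a fix for one question has broken an earlier one.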