From the course: Hands-On AI: RAG using LlamaIndex
Ingestion pipeline
- [Instructor] In this video, we are going to talk about the ingestion pipeline abstraction. So, let's go ahead and connect to our kernel, do our standard imports, set our API keys, and start to build out an ingestion pipeline. I want you to notice that we're doing something a little bit different here: we're using Settings from LlamaIndex. So, what are Settings? Settings are essentially global configurations, a way to set things like the embedding model once and use them everywhere. We're going to make use of this a lot later, and if you're interested, you can look at the LlamaIndex documentation on Settings. Next, we are going to build the ingestion pipeline. So, what is the ingestion pipeline? It applies a series of transformations to some input data: we take the input data, turn it into documents, parse those into nodes, and then insert the nodes into a vector database. An ingestion pipeline can also use a caching mechanism, where each node-plus-transformation pair is cached, which makes subsequent runs more efficient. If you want to learn more about the ingestion pipeline, go to docs.llamaindex.ai, look at Component Guides, and then under Loading, go to Ingestion Pipeline. That gives you a breakdown of the usage pattern for the ingestion pipeline, with a lot more detail if you're interested. But I'm going to cover it here anyway, so let's see it in action. What we're going to do now is instantiate an ingestion pipeline, build it out, and push data into Qdrant. We're going to use the same data that we used in the previous module: the book of poems, It Can Be Done. First, though, we're going to go back into Qdrant and delete the collection that we've already created. So, we log in, click on the hamburger menu, go to Clusters, open the cluster dashboard, and hit delete. We're deleting this collection so we can rebuild it, this time using the ingestion pipeline. So, again, using the SimpleDirectoryReader, which we've touched on before, we can now build out the ingestion pipeline. You can build an ingestion pipeline with document management, which is nice because you can cache your embeddings, and that speeds up processing. You can cache your documents locally, you can attach a docstore, you can handle duplicate data, and things like that. I encourage you to pause and read more about this; the LlamaIndex ingestion pipeline documentation covers remote and local caching, document management, and all of that. So, I'm going to go ahead now and instantiate our Qdrant client and our ingestion cache, and then build out the ingestion pipeline. The ingestion pipeline takes transformations, and the transformations I'm going to use in this case are a token text splitter and the embedding model. So, what we're saying to the ingestion pipeline is: split our text into chunks of 256 tokens, and then embed those chunks. The docstore piece is there mainly to illustrate how document management works.
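To make that concrete, here's a minimal sketch of the setup described above. The Qdrant URL, API key, collection name, embedding model, and chunk overlap are placeholder assumptions (they aren't specified in the video), and it assumes the current llama-index package layout with the OpenAI embedding and Qdrant vector store integrations installed:

```python
import qdrant_client
from llama_index.core import Settings
from llama_index.core.ingestion import IngestionCache, IngestionPipeline
from llama_index.core.node_parser import TokenTextSplitter
from llama_index.core.storage.docstore import SimpleDocumentStore
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Global configuration: set the embedding model once via Settings
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Placeholder cluster credentials and collection name
client = qdrant_client.QdrantClient(
    url="https://YOUR-CLUSTER.qdrant.io",
    api_key="YOUR-QDRANT-API-KEY",
)
vector_store = QdrantVectorStore(client=client, collection_name="it-can-be-done")

pipeline = IngestionPipeline(
    transformations=[
        # Split text into 256-token chunks, then embed each chunk
        TokenTextSplitter(chunk_size=256, chunk_overlap=16),
        Settings.embed_model,
    ],
    vector_store=vector_store,       # nodes are upserted into Qdrant
    docstore=SimpleDocumentStore(),  # document management: dedupe on re-runs
    cache=IngestionCache(),          # caches node + transformation pairs
)
```

With the docstore attached, documents that haven't changed are skipped on re-runs, and the cache avoids recomputing transformations for inputs it has already seen.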
We're not actually going to make use of document management too much. Once we've created the pipeline, we run it against our documents to create our nodes. This will take a little bit of time to run. Once it's done, go back to the Qdrant UI, hit refresh, and you'll see we've got our collection back. So, it's following the same pattern we've seen before, just at a higher level of abstraction. We have our nodes, and they've been ingested into Qdrant. And now, if we're curious to see what one of the nodes looks like, we can inspect it: each node has essentially all the metadata we've seen before. We can now instantiate an index, create a retriever, and retrieve nodes. You can see that we've got a set of retrieved nodes; let's look at one. We can get some text: "Start Where You Stand" is a poem by Berton Braley. So, we've got the text, and we can look at the score. And of course, if we want to, we can persist the ingestion pipeline. If I run this, you'll see that it creates a directory containing just a docstore and a cache. And that's it. That's the ingestion pipeline. We're going to make use of this over and over throughout the course. It's meant to make ingestion easier, but as you can see, we'll be doing a lot of the same steps repeatedly: instantiating a client, instantiating a vector store, creating a pipeline, and then doing some type of ingestion. So, it might be useful to create some helper functions to make this easier, so that we have less code in our notebook. I'll show you how we're going to do that after the next lesson. So, I'll see you in the next one, where we'll talk about the query pipeline.
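As a rough end-to-end sketch of those last steps, continuing from the setup above: the data directory, top-k value, and query string below are illustrative assumptions, not values from the video.

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load the book of poems (placeholder path) and run the pipeline:
# split, embed, and upsert the resulting nodes into Qdrant
documents = SimpleDirectoryReader("./data").load_data()
nodes = pipeline.run(documents=documents)
print(nodes[0].metadata)  # inspect one node's metadata

# Build an index over the existing vector store, then retrieve
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
retriever = index.as_retriever(similarity_top_k=2)
retrieved_nodes = retriever.retrieve("Start where you stand")
for node_with_score in retrieved_nodes:
    print(node_with_score.score, node_with_score.node.get_content()[:100])

# Persist the docstore and cache to a local directory
pipeline.persist("./pipeline_storage")
```

Persisting writes the docstore and ingestion cache to disk, so a later run can reload them and skip documents that have already been processed.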