From the course: Hands-On AI: RAG using LlamaIndex

Storing and retrieving - LlamaIndex Tutorial


- [Instructor] Now that you know how to load and index data in LlamaIndex, I'm going to show you how to store that data. In this lesson, we'll store data in Qdrant, which is the vector database we're going to use in this course. So let's go ahead and do our standard imports. We'll set our API keys for Cohere and OpenAI, and also set the Qdrant URL and the Qdrant API key. This function here just creates a directory called data; I've already done that. Essentially, we're downloading a text file, a book of poems called "It Can Be Done," and saving it to disk. I've already done that, and now I'm going to load the text file as a LlamaIndex document object using the SimpleDirectoryReader. Remember, if you just pass SimpleDirectoryReader a path, it will read all of the files in that path; or you can specify an individual file and pass it as the input_files argument, which must be a list. That's all we're doing here: pointing to this particular file on disk and saying, hey, SimpleDirectoryReader, this is the input file I want read into a document object. Great. So we'll go ahead and do that, and I'll instantiate a node parser and an embedding model. Up to this point, we have loaded our document. We haven't parsed the nodes yet, but we have instantiated a node parser. Now what we need to do is create an index. We're going to create a vector store index using Qdrant. To use Qdrant to store embeddings, we need to initialize the client, create a collection to store the data in Qdrant, assign Qdrant as the vector store in the storage context, and then initialize the vector store index using the storage context. 
So I'll talk about what a storage context is in just a moment, but let's go ahead and initialize the Qdrant client. We need to pass in the Qdrant URL and the API key. Once we have that, we'll instantiate the vector store using the QdrantVectorStore, passing in the client and the collection name; in this case, I'll just call the collection "it_can_be_done". We also need to pass in the embedding model we're using; in this case, we're going to make use of OpenAI's text-embedding-3-small. So we instantiate those, and now we can talk about the storage context. A storage context in LlamaIndex is just an abstraction that says, these are the nodes I want to store, and it makes it easy to store and retrieve data. In a storage context, you can indicate different storage backends: for example, a simple document store, an index store, a vector store, or a graph store. I link to the documentation here where you can read more about each one of those. What we're going to do is instantiate a storage context from defaults and just specify the vector store. All this is saying to LlamaIndex is: hey, this is the vector store I'm using, this is where I want to store all of my nodes and all of my embeddings. It's essentially just a pointer to where I'm saving things. Now I can create an index. I'm going to create the vector store index from documents, and I'm switching the pattern up on you a little bit, with a bit of malice aforethought, because you're going to learn about the ingestion pipeline in a later module. What I'm doing here is passing the transformations argument into the vector store index class and saying: hey, split my text into smaller chunks, embed it with this embedding model, and store it in this place. So now we've constructed the index where our vectors and our text are going to live. Let's go ahead and run this code here. 
And we can see progress here. Great. Now that this is done, you can go into Qdrant, go to Overview, and look at your clusters. We see we have a cluster here. Click that caret, hit Open Dashboard, and see that we now have a collection. This collection has all the content of our nodes, so all the metadata and a bunch of other information here, and it also has the vectors. Our vector length is 1536, which is the dimensionality of the text-embedding-3-small model. You can copy this default vector if you want; just copy and paste it here, and you can see the vector representation of the text we've just indexed. What's interesting is you can also do a little bit of visualization. If we go to the Visualization tab, go down here, and hit Run, you'll see all of the vectors that we've embedded. You can find vectors here that have a notion of similarity, which is quite interesting. You can click around here, and essentially this is just a way for you to see, visually: oh, these two vectors are close to each other; what are they? What could these two vectors be? Then you can grab the node ID or the document ID and take a look at what's in there. Another way you could do that is going back here, scrolling down, and hitting Find Similar, and you'll get the most similar vectors to that one. So those are a few interesting things you can do with the Qdrant UI. Now let's go back into the code and talk about retrieval. We've indexed our data, and it's living in Qdrant Cloud, so we can now do some retrieval. A retriever is an interface exposed by the index; an index with its retriever is what we use to store and fetch data. There are a bunch of different retrievers you can use in LlamaIndex. 
Look at the source code and you can see some here; we'll talk about these later in the course when we cover advanced RAG techniques. But here are some examples: there's a vector retriever, a fusion retriever, a recursive retriever. I'm going to be respectful of your time and let you read about those on your own, but just know that they exist and that we're going to touch on some of them a bit later in the course. For now, we're going to use a vector retriever. When we're searching, our query is converted to a vector embedding using the exact same embedding model we used to embed our vectors; in this case, that's text-embedding-3-small from OpenAI. In Qdrant parlance, the vectors we've ingested are known as points. What I'm telling the index to do here is retrieve the top five points most similar to the user's query, and make sure that those retrieved vectors have a similarity of at least 0.75. That's what I'm saying to the retriever: these are the conditions for the points I want to retrieve from Qdrant. So I instantiate my retriever. Now that my retriever has been instantiated, I can pass a query. Here I'll just pass a generic query: what lessons can be learned from the poems about success? You'll see that we end up with five nodes, and these are nodes with scores. If you scroll all the way to the far end here, you'll see the text for all the nodes, and we also have similarity scores. You'll notice that none of these scores meet the similarity threshold; we'll talk about the order of operations for this in a later module. But here, we've retrieved our documents. Chances are you don't just want to return documents; you want those documents to be synthesized into a response. 
So we're going to build on the pattern we've seen here in the next lesson, where I'll show you how to get a generated response based on those retrieved documents. So you can go ahead and close out the client. I'll see you in the next video.
