From the course: Introduction to AI-Native Vector Databases
CRUD operations in vector DBs
In the last video, we spoke about metrics that you can use to assess the performance of a vector database. In this video, we'll talk about how vector databases keep their data fresh and up to date. Our data is always changing. Whether it's due to new customers signing on to our platform or new products being listed on our store, we need a way to keep our database up to date and loaded with this fresh data. To keep a database fresh, we need to be able to create data objects, read already stored data objects, update old data objects, and even delete them if needed. These make up the CRUD operations that every database needs to support. Because a database needs to be a real-time account of all the data our business is dealing with, it needs to support real-time CRUD operations.
Let's have a look at how we can perform CRUD operations with the Weaviate vector database. So, in this notebook, you'll see some boilerplate code that we've used before. We're going to import our dataset again, and then we've got our function to help us visualize our data. We're going to simply run this and visualize one data point, which tells us that everything is good to go. We've loaded in our data. We're going to go in and spin up Weaviate. This is all the same as before. We're going to delete the Question class if it exists and create it from scratch so that we know exactly what we have in there.
So, now, instead of loading in the data that we had before, we're going to perform CRUD operations. The first thing we're going to do is create a specific object where we specify the question, answer, and category ourselves. That's going to teach us how we can create particular objects in a vector database, and what we need to specify.
So, the way to do this is to tell the client to create a data object by calling client.data_object.create. Inside the parentheses, we specify exactly what this data object needs to have. So, here, we're going to have a data object, and this is going to be another Python dictionary. Again, this has to be in line with the information that we want to pass in. The question itself, for example, can be "Leonardo da Vinci was born in this country." So, this is our Jeopardy-like question. We've also got an answer for this question, and this will be "Italy." And we've got the category that this question is supposed to be from; let's say we use "culture" here. So, this is the bulk of the content that we pass in. It's just one object, for you to get the idea. The other thing that we need to pass in is the class into which this object needs to be inserted. So, we specify the class name, and the class name comes from here, remember. We'll take that, and we'll let it know exactly where to put that new data point. Now that this is set up, it's going to return to us the unique identifier for the object that it creates. And just to make sure that the indentation is correct, we're going to clean things up. So, now that that's run, we can look at what's stored in the UUID here, and we get the unique identifier for the object that's just been inserted into the database. How do we know that it's been inserted into the database? Well, that's where the next step comes in.
So, this is the create portion, the C in CRUD. The next thing is reading an object. What we're going to do is read this object back out to make sure that it's correctly stored. So, here, we're going to store the object's unique identifier in a variable. I'll say object_uuid, and I'll store the UUID in there so that we can refer back to it over and over again.
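The create step described here can be sketched roughly as follows. This is a minimal sketch, assuming the v3 Weaviate Python client (where the calls live under client.data_object) and assuming the class is named "Question" with properties "question", "answer", and "category". The function name create_question is hypothetical and does not come from the notebook.

```python
def create_question(client, class_name="Question"):
    """Insert one Jeopardy-style question and return its UUID.

    `client` is assumed to be a connected v3 weaviate.Client.
    The property names below are assumptions about the notebook's schema.
    """
    data_object = {
        "question": "Leonardo da Vinci was born in this country.",
        "answer": "Italy",
        "category": "Culture",
    }
    # create() returns the unique identifier (UUID) of the new object.
    return client.data_object.create(
        data_object=data_object,
        class_name=class_name,
    )
```

With a running Weaviate instance, calling object_uuid = create_question(client) would give the identifier that the later read, update, and delete steps rely on.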
And the reason why we need to do this is that we're going to read this object using its unique identifier. So we're going to say data_object equals, and we're going to tell the client to fetch our data object by ID. So, we can say get_by_id. Here, we pass in the object ID, and we tell it which class this object is stored in. This, again, has to be the same class that we specified when we put the object in in the first place. This seems good. We can take the data object and print it out with json.dumps, adding a little bit of indentation. So, this is what comes back. Now that we've read the object using its ID, we can verify that it is, in fact, the same object that we passed in. Right: question Leonardo da Vinci, answer Italy, category culture. It seems good. So, we've now created an object and we've read that object from the vector database.
We can also go in and ask, what is the vector associated with this object? Because the vector database has a vector associated with each object, we can ask it to return the vector as well. That's just a slight difference: we say get_by_id, specify the ID, which is object_uuid, say that it comes from our Question class, and say that we want the vector to be returned as well. So, we set that to true. And now, when we run this slightly different query, notice how we get the question, answer, and category as before, but we also now have this vector. This is the vector that captures the meaning of the new question that we just passed in.
So, now, the next step is the U operation, the update operation. Let's say, for some reason, we had inserted the wrong data point here. Or let's say we wanted to add more detail, right? We wanted to answer the question, but not just with the country; we wanted to add a region within the country as well.
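The two read variants described above (with and without the vector) might look like this. It is a sketch under the assumption that the notebook uses the v3 Weaviate Python client, whose get_by_id accepts class_name and with_vector arguments. The helper names read_question and to_pretty_json are hypothetical.

```python
import json


def read_question(client, object_uuid, class_name="Question", with_vector=False):
    """Fetch one object by its UUID, optionally including its vector.

    `client` is assumed to be a connected v3 weaviate.Client.
    """
    return client.data_object.get_by_id(
        object_uuid,
        class_name=class_name,
        with_vector=with_vector,
    )


def to_pretty_json(obj):
    """Format the returned dictionary with indentation for printing."""
    return json.dumps(obj, indent=2)
```

Printing to_pretty_json(read_question(client, object_uuid, with_vector=True)) would show the stored properties together with the vector, provided the server is reachable.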
We can do that using an update operation. When we want to update an object, we need to have the UUID for that object. As long as we have the unique ID for that object, we can go ahead and say client.data_object. This time, we want to update the object, so we're going to use the update function. We're going to say that the UUID here is the object ID that we stored earlier. Then we're going to specify which class this is stored in, so that's Question. And then we're going to pass in the new data object, which is the update. Here, we're only going to update the answer. The question and the category are going to remain as is. So, here we can be a little bit more specific. We'll say "Florence, Italy." We go ahead and run that. Then we can see whether or not the update has been reflected in the database. So, we can say client.data_object.get_by_id. This is just a read operation. We specify the ID here, which is just the object ID, and we print it out. When we run this, we get the same category and the same question, but notice how the answer is now updated to be "Florence, Italy," as opposed to just "Italy."
So, we've got the C, the R, and the U. The last thing that we want to talk about is how we delete an object. Let's say we added this object and it was erroneous, or we wanted to get rid of it. How do we delete that object from the vector database? Here we're going to talk about the delete operation, the D in CRUD. Once again, in order to delete an object, we need its unique ID. We can just go ahead and say client.data_object. All these CRUD operations are accessed through the data_object attribute here. First of all, let's confirm that the object exists.
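The update-then-verify step can be sketched as below, again assuming the v3 Weaviate Python client, where update merges the supplied properties into the stored object and leaves the others unchanged. The function name update_answer and the "answer" property name are assumptions for illustration.

```python
def update_answer(client, object_uuid, new_answer, class_name="Question"):
    """Change only the answer of an existing object, then read it back.

    `client` is assumed to be a connected v3 weaviate.Client.
    """
    # Only the properties passed here are changed; the question and
    # category stay as they were.
    client.data_object.update(
        uuid=object_uuid,
        class_name=class_name,
        data_object={"answer": new_answer},
    )
    # Read the object back to confirm the update was applied.
    return client.data_object.get_by_id(object_uuid, class_name=class_name)
```

For the example in the video, the call would be update_answer(client, object_uuid, "Florence, Italy").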
So, here we'll pass in the object ID and the class name, Question, and then we'll print it out. We want to delete this particular instance. To delete it, let's go ahead and do this: client.data_object. This time, we want to delete, so we use the delete function. We again use the UUID to specify which object we want to delete, and we tell it where this object lives, which is in the Question class. And that should be good. So, you run that, and now the object is deleted. How do we know the object is deleted? Let's have a look. We can use our with_meta_count function to see how many objects are in the database. In this whole notebook, we've only inserted one object. We've read it, modified it, and then deleted it. So, what should happen when I run this line is that I should see zero objects. The count of objects in my database should be zero. So, let's go ahead and run that. And we've verified that we've got a count of zero, which means that there's nothing in the database. That completes the deletion operation.
So, we've covered the C, the R, the U, and the D: the CRUD operations that help us keep the database fresh. Every time you have a data point, you can insert it into the database, read it, update it, and delete it if necessary. Now that we've seen how to perform basic CRUD operations with vector databases, in the next video, we'll see how we can compare one vector database to another, and answer what metrics matter when it comes to assessing vector database performance.
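The delete step and the count check from this video can be sketched as follows. This assumes the v3 Weaviate Python client and assumes that the aggregate query returns the count under data, Aggregate, the class name, then meta. The function name delete_and_count is hypothetical.

```python
def delete_and_count(client, object_uuid, class_name="Question"):
    """Delete one object by UUID and return how many objects remain.

    `client` is assumed to be a connected v3 weaviate.Client.
    """
    client.data_object.delete(uuid=object_uuid, class_name=class_name)

    # Ask the class for its object count via an aggregate query.
    result = client.query.aggregate(class_name).with_meta_count().do()

    # Assumed response shape:
    # {"data": {"Aggregate": {"Question": [{"meta": {"count": N}}]}}}
    return result["data"]["Aggregate"][class_name][0]["meta"]["count"]
```

In the notebook's scenario, where only one object was ever inserted, the returned count should be zero after the delete.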