From the course: Hands-On AI: Building LLM-Powered Apps
Embedding search
- [Instructor] In the last video, we went through the retrieval part of the RAG architecture. We also briefly discussed indexing and searching. Now let's dive deeper into search, specifically embedding search. Embedding search is currently the most popular way to search in an LLM application. So what is an embedding? An embedding is a low-dimensional representation of data. It is calculated using an encoder model: we input a chunk of text, and the model outputs an embedding, which is an array of floating-point numbers. Embeddings contain semantic information, including the meaning and structure of the sentence. One of the key features is that sentences that are similar will be numerically close to each other, and this allows us to search by similarity. The way we measure how close two sentences are is with something called cosine similarity, which is the angular distance between two embeddings. The embedding model learned to place similar sentences closer together and different sentences farther apart. This is an extremely powerful concept. In the old days, we would need to search for exact matches. As an example, searching for the word USA or United States of America used to require multiple different searches because they are spelled differently. But now, with embedding search, USA, United States, and America are all close to each other semantically, so they will show up together. Now that we capture the meanings and structures of the underlying sentences, we can search by retrieving the top K documents most similar to the question asked, measured by cosine similarity. This algorithm is called K-nearest neighbors, or KNN. And when there is a massive amount of data, we can use an approximate algorithm called approximate nearest neighbor (ANN) search, which improves latency by sacrificing a little bit of relevance. In summary, embeddings are low-dimensional, lossy representations that enable us to retrieve relevant documents using KNN or ANN, and the way we measure similarity is cosine similarity. Again, as usual, there are certain limitations to embedding models, and we'll discuss them next.
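To make this concrete, here is a minimal sketch of embedding search in Python. It assumes the open-source sentence-transformers library; the model name all-MiniLM-L6-v2, the sample documents, and the cosine_similarity helper are illustrative choices, not the code used in this course.

```python
# Minimal embedding-search sketch: encode text into embeddings, then
# retrieve the top-K documents by cosine similarity (exact KNN).
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed library choice

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model, not the course's

documents = [
    "The USA is a country in North America.",
    "The United States of America has fifty states.",
    "Bananas are rich in potassium.",
]
query = "Tell me about America."

# Encode each text chunk into an embedding: an array of floating-point numbers.
doc_embeddings = model.encode(documents)    # shape: (n_docs, dim)
query_embedding = model.encode([query])[0]  # shape: (dim,)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embeddings (closer to 1.0 = more similar)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Exact K-nearest neighbors: score every document, keep the top K.
k = 2
scores = [cosine_similarity(query_embedding, d) for d in doc_embeddings]
top_k = np.argsort(scores)[::-1][:k]
for i in top_k:
    print(f"{scores[i]:.3f}  {documents[i]}")
```

Note that the loop above scores every document, which is fine for small collections; at scale, vector databases typically replace it with an ANN index, trading a small amount of relevance for much lower latency, as described above.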
Contents
- Retrieval augmented generation (3m 30s)
- Search engine basics (2m 32s)
- Embedding search (3m)
- Embedding model limitations (3m 15s)
- Challenge: Enabling load PDF to Chainlit app (48s)
- Solution: Enabling load PDF to Chainlit app (5m 4s)
- Challenge: Indexing documents into a vector database (1m 50s)
- Solution: Indexing documents into a vector database (1m 43s)
- Challenge: Putting it all together (1m 10s)
- Solution: Putting it all together (3m 17s)
- Trying out your chat with the PDF app (2m 15s)