From the course: Vector Databases in Practice: Deep Dive
Retrieval augmented generation
- [Instructor] Let's talk about retrieval-augmented generation, which combines the power of generative AI models with the grounding of your real data. Many of you have probably heard of generative AI models, for example, large language models like the GPT or Llama models that can produce human-like text. These models can do all sorts of things, like recite the capital of Australia, explain how gravity works, or even write a haiku about toothbrushes. These models can do it all. In fact, they've even been shown to perform relatively high levels of reasoning or deduction as well.

But as amazing as they are, they're not quite perfect. They often fall short by confidently producing answers that are either out of date or simply incorrect. These are often called hallucinations. One key reason for this is that the data used to train these models can go out of date or simply not be available. Facts like the population of Australia change over time, and some proprietary data, like customer data or company information, may not have been available to the model in the first place.

This is where retrieval-augmented generation, or RAG, comes in. RAG remedies this problem by retrieving relevant data and then providing it to the AI model along with a prompt.

So let's take a look at a few examples of RAG queries. Here's an example syntax for RAG. This query will search for objects most similar to "science fiction" and then perform the task, which is to summarize each description. Look at that, the model's returned well-written summaries of the descriptions, like we asked. And if you think the query syntax here looks familiar, you would be correct. Here's an equivalent search-only syntax, just above the RAG query. You'll see that the only differences are the sub-module name here and the prompt that's provided with the RAG query. The thing is, RAG is a two-step process. The first step is a search, just like any of the searches we've just learned about. The second is generation: we send some of the search results along with a prompt to the AI model for it to base its answer on.

Let's try another one. Here's a RAG query where we find objects related to science fiction. We then pass a grouped task prompt to say, "Extract some of the key common themes in these movies." And if we take a look at the generated output, you'll see that the model has produced quite a good answer. And it was able to do that even though the data is synthetic and available only to us. This is enabled by retrieving the data from a proprietary data source and sending it to the model.

RAG is a powerful two-step process, combining search or retrieval with generation. RAG can address generative AI models' potential shortcomings, like hallucinations or simply not having the right data available to them. As such, RAG is a really exciting area that's transforming how we think of data and deal with it. So of course we'll be expanding on the idea further as we go on in the course as well.
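The queries shown on screen are not captured in this transcript, so here is a minimal Python sketch of the two-step idea, not the course's actual database syntax. Everything in it is an assumption made for illustration: the `Movie` class, the hand-made two-dimensional vectors (a real system would get vectors from an embedding model), the `search`, `rag_single`, and `rag_grouped` helpers, and `fake_generate`, which stands in for a real language model call. It shows the two prompt styles described above: one prompt per retrieved object, and one grouped task over all retrieved objects.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class Movie:
    title: str
    description: str
    vector: list[float]  # hypothetical embedding: [sci-fi-ness, romance-ness]


# Synthetic data, made up for this sketch.
MOVIES = [
    Movie("Star Drift", "A crew crosses the galaxy to find a new home.", [0.9, 0.1]),
    Movie("Robot Dawn", "An android questions what it means to be alive.", [0.8, 0.3]),
    Movie("Love in Paris", "Two strangers fall in love over one summer.", [0.1, 0.9]),
]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(movies: list[Movie], query_vector: list[float], limit: int) -> list[Movie]:
    """Step 1 (retrieval): return the `limit` movies most similar to the query."""
    ranked = sorted(
        movies,
        key=lambda m: cosine_similarity(m.vector, query_vector),
        reverse=True,
    )
    return ranked[:limit]


def fake_generate(prompt: str) -> str:
    """Placeholder for a real model call; it only reports the prompt length."""
    return f"[model output for a {len(prompt)}-character prompt]"


def rag_single(
    movies: list[Movie],
    query_vector: list[float],
    prompt_template: str,
    limit: int,
    generate: Callable[[str], str] = fake_generate,
) -> list[tuple[str, str]]:
    """Step 2, one prompt per retrieved object (e.g. summarize each description)."""
    hits = search(movies, query_vector, limit)
    return [
        (m.title, generate(prompt_template.format(description=m.description)))
        for m in hits
    ]


def rag_grouped(
    movies: list[Movie],
    query_vector: list[float],
    task: str,
    limit: int,
    generate: Callable[[str], str] = fake_generate,
) -> str:
    """Step 2, one grouped task over all retrieved objects together."""
    hits = search(movies, query_vector, limit)
    context = "\n".join(f"- {m.title}: {m.description}" for m in hits)
    return generate(f"{task}\n\n{context}")


if __name__ == "__main__":
    science_fiction = [1.0, 0.0]  # assumed query vector for "science fiction"
    for title, text in rag_single(
        MOVIES, science_fiction, "Summarize: {description}", limit=2
    ):
        print(title, "->", text)
    print(rag_grouped(
        MOVIES,
        science_fiction,
        "Extract some of the key common themes in these movies.",
        limit=2,
    ))
```

Note that `search` is shared by both helpers, which mirrors the point that a RAG query is a search query plus a prompt. Swapping `fake_generate` for a call to a real model is the only change needed to get actual generated text.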