The Power of RAG
SETTING UP RAG ARCHITECTURE
A RAG architecture touches on a number of different components. First is the ingestion of the data to be included in the knowledge base. But that data can't go in without some form of preprocessing.
As unstructured data, at the very least, it needs to be made machine readable. Other preprocessing tasks include chunking, tagging, data cleansing, and building data pipelines. The chunks are usually converted into vector embeddings and stored in a vector database to support efficient similarity retrieval and semantic search.
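To make that preprocessing step concrete, here is a minimal sketch of chunking documents and storing their embeddings. It assumes the sentence-transformers library and a simple in-memory store; the chunk size, overlap, and model name are illustrative choices, not recommendations.

```python
# Minimal preprocessing sketch: chunk cleaned text, embed each chunk,
# and keep the vectors in a simple in-memory "vector store".
# Assumes: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split cleaned text into overlapping character windows."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def build_store(documents: list[str]):
    """Chunk every document and embed the chunks for similarity search."""
    chunks = [c for doc in documents for c in chunk(doc)]
    embeddings = model.encode(chunks, normalize_embeddings=True)
    return chunks, np.asarray(embeddings)
```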
Data may still need normalization. Integrating search, vectorization, knowledge graphs, and other components adds a degree of complexity that cannot be ignored.
Speaking of similarity retrieval, the more general concept of information retrieval comes into the mix. That query going to both the curated model and the LLM? It boils down to the somewhat old-fashioned notion of information retrieval, updated for today's GenAI world. Within the RAG architecture, the user's query is matched against the vector store to find the most relevant chunks, and the retrieved documents are handed to the LLM as grounding context. This contextual grounding is what lies at the heart of providing accurate responses to the query.
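A rough sketch of that retrieval-and-grounding step, continuing the in-memory store above; the cosine-similarity scoring, top_k value, and prompt wording are assumptions for illustration only.

```python
def retrieve(query: str, chunks: list[str], embeddings: np.ndarray,
             top_k: int = 3) -> list[str]:
    """Embed the query and return the most similar chunks by cosine similarity."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = embeddings @ q  # cosine similarity, since vectors are normalized
    best = np.argsort(scores)[::-1][:top_k]
    return [chunks[i] for i in best]

def grounded_prompt(query: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that grounds the LLM in the retrieved passages."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
```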
Contextual understanding is, in fact, one of RAG's key strengths. Setting up a RAG architecture, however, is not a one-and-done exercise.
Evaluation is an ongoing process that requires monitoring results to ensure they remain accurate and relevant as queries and the underlying language models change. Metrics and data analysis keep RAG on track with regard to freshness and security concerns.
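One simple form such monitoring can take is a periodic check of retrieval quality against a small labeled set of query-passage pairs. The recall@k metric below is just one illustrative choice, reusing the retrieve function sketched earlier.

```python
def recall_at_k(eval_set: list[tuple[str, str]], chunks: list[str],
                embeddings: np.ndarray, k: int = 3) -> float:
    """Fraction of labeled queries whose known-relevant chunk appears in the top-k results."""
    hits = 0
    for query, relevant_chunk in eval_set:
        if relevant_chunk in retrieve(query, chunks, embeddings, top_k=k):
            hits += 1
    return hits / len(eval_set)
```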
VARIETIES OF RAG
We tend to talk about RAG as if it encompassed just one technology. But RAG exists in many varieties. Here are just a few:
- Graph RAG uses knowledge graphs to capture more complicated relationships than semantic similarity can express and to provide more complex reasoning ability.
- Multimodal RAG integrates data from formats beyond traditional text documents, such as images, audio, video, chats, spreadsheets, and databases.
- Knowledge-augmented generation (KAG) focuses on structured knowledge graphs to generate responses that are highly factual and accurate for tasks involving known entities or well-defined facts.
- Path RAG is an open source framework, available on GitHub, that leverages structured path reasoning.
- GAR (generation-augmented retrieval) matches queries to complex objects resulting from specific edge cases and displays the results attractively for a better user experience.
RAGTIME FOR AGENTIC
The latest variety of RAG is agentic. Touted as the RAG of the future, agentic RAG employs agent technology to automate portions of augmented retrieval, particularly at the query level, but it can go beyond basic information retrieval. Agentic RAG can interact with vector search, formulate and refine its own queries, decide whether or not to retrieve information, synthesize information, route queries to specific knowledge sources, and evaluate results prior to showing them to the user. The promise of agentic RAG is fewer (possibly even no) hallucinations and thus more accurate answers. With agentic AI, autonomy is key, as is the notion of viewing agents as partners and collaborators with humans.
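As a rough illustration only, an agentic loop might wrap the retrieval step with decisions about whether to retrieve, how to phrase the search, and whether the result is good enough to show. The llm() call below is a hypothetical stand-in for whatever model API is in use, and the control flow is a simplification of what real agent frameworks do; it reuses the retrieve and grounded_prompt sketches above.

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to whatever LLM API is in use."""
    raise NotImplementedError

def agentic_answer(query: str, chunks: list[str], embeddings: np.ndarray,
                   max_rounds: int = 3) -> str:
    """Sketch of an agentic RAG loop: decide, retrieve, self-check, and refine."""
    # The agent first decides whether external knowledge is needed at all.
    decision = llm(f"Does this question need document lookup? Answer yes or no.\n{query}")
    if decision.strip().lower().startswith("no"):
        return llm(query)

    search_query = query
    for _ in range(max_rounds):
        context = retrieve(search_query, chunks, embeddings)
        answer = llm(grounded_prompt(query, context))
        # The agent evaluates its own answer before showing it to the user.
        verdict = llm("Is this answer fully supported by the context? "
                      f"Answer yes or no.\nAnswer: {answer}")
        if verdict.strip().lower().startswith("yes"):
            return answer
        # Otherwise it reformulates the search query and tries again.
        search_query = llm(f"Rewrite this question to improve document retrieval:\n{query}")
    return answer
```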
Agentic RAG is in its infancy, so more applications will undoubtedly show up. Regardless of which variety of RAG enterprises choose to implement, the overall impetus for doing so remains a better search and information retrieval experience. RAG addresses the limitations of AI-generated information, allowing for more trust in the technology and the results it delivers.
It can’t solve every problem, however. If a student or an attorney persists in accessing a non-RAG-enabled chatbot to query LLMs trained on information gleaned from the open web and then relies on hallucinated journal articles and court cases, it is not the fault of the technology. Within the enterprise, however, this scenario should not occur. RAG, whether agentic or not, has the power to save humans from making basic errors.
Thanks to RAG, enterprise AI gains credibility and trust every day.