Understanding Corrective Retrieval Augmented Generation (CRAG)

Anay Dongre
4 min read · May 7, 2024

Introduction

Language models have become increasingly sophisticated, capable of understanding complex instructions and generating coherent responses. However, they still suffer from hallucinations: they produce incorrect or fabricated information because their parametric knowledge is limited, can become outdated, and cannot always be verified against facts.

The paper “Corrective Retrieval Augmented Generation” (Yan et al., 2024) proposes CRAG, a method to enhance the robustness of generative models by integrating a lightweight retrieval evaluator that assesses the relevance and reliability of retrieved documents. By triggering different knowledge retrieval actions based on a confidence degree, CRAG improves the accuracy and timeliness of the generated responses.

Background

Generative models like transformer architectures have achieved remarkable success in various natural language processing tasks. They rely on vast amounts of data to capture patterns, structures, and relationships between words and phrases. However, despite their impressive abilities, generative models face challenges in ensuring the accuracy of their responses, leading to hallucinations.

Retrieval-Augmented Generation (RAG) emerges as a promising solution to mitigate hallucination by providing access to relevant knowledge stored in external databases. RAG augments the generative model’s input with retrieved documents, increasing the likelihood of producing accurate and up-to-date information.

However, RAG alone does not provide mechanisms to ensure the accuracy of the retrieved documents, leaving room for potential hallucinations. CRAG addresses this limitation by adding a lightweight retrieval evaluator to monitor the relevance and reliability of the retrieved documents, thus reducing the chances of hallucinations.
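The augmentation step that RAG performs can be sketched in a few lines. The helper below is a hypothetical illustration, not code from the paper: it prepends retrieved documents to the user query so the generator can ground its answer in them.

```python
def augment_prompt(query: str, documents: list[str]) -> str:
    """Prepend retrieved documents to the user query so the
    generator can ground its answer in external knowledge."""
    context = "\n".join(f"[doc {i + 1}] {d}" for i, d in enumerate(documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = augment_prompt(
    "When was CRAG published?",
    ["CRAG was proposed in an arXiv preprint in January 2024."],
)
```

Note that nothing in this step checks whether the documents are actually relevant or correct — which is exactly the gap CRAG’s retrieval evaluator fills.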

How Does it Actually Work?

CRAG operates through a multi-step procedure involving retrieval, evaluation, and adjustment. Given a user query, CRAG first employs a retriever module to fetch relevant documents from a designated corpus. Next, the retrieved documents pass through a lightweight retrieval evaluator, which estimates their relevance and reliability. Depending on the estimated confidence degree, CRAG takes one of three actions:

  1. Correct: The retrieved documents contain adequate and accurate information, allowing direct integration into the generative model’s input sequence.
  2. Incorrect: The retrieved documents are irrelevant or contain false information. In this case, CRAG discards them and performs a large-scale web search to collect alternative knowledge sources.
  3. Ambiguous: There is uncertainty regarding the relevance and reliability of the retrieved documents. CRAG combines information from multiple sources, enabling a comprehensive assessment of the situation.
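The three actions above amount to a simple dispatch on the evaluator’s confidence score. The sketch below is illustrative — the threshold values are assumptions for demonstration, not the ones tuned in the paper:

```python
def select_action(score: float, upper: float = 0.6, lower: float = -0.9) -> str:
    """Map the retrieval evaluator's confidence score to one of
    CRAG's three corrective actions. Thresholds are illustrative."""
    if score > upper:
        return "correct"      # use the refined retrieved knowledge directly
    if score < lower:
        return "incorrect"    # discard retrieval; fall back to web search
    return "ambiguous"        # combine refined retrieval with web results
```

In the paper, the two thresholds are tuned per dataset; the point of the sketch is only that a single scalar score drives the branching.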

Architecture

Overview of CRAG at inference

At its core, CRAG comprises a retriever module, a generator module, and a lightweight retrieval evaluator. The architecture enables seamless interaction between components, facilitating accurate and robust language generation.

  1. Retriever Module: Responsible for identifying pertinent documents matching a user query, the retriever module forms the foundation of the CRAG pipeline. Utilizing sparse or dense representations, the retriever efficiently scans the corpus to find suitable matches.
  2. Lightweight Retrieval Evaluator: Built around a small transformer model, the retrieval evaluator determines the relevance and reliability of the fetched documents concerning the user query. Fine-tuning this module using relevant signals helps maintain high precision levels, ultimately preventing hallucinations.
  3. Generator Module: Comprising a generative language model, the generator module produces the desired response from the processed input. By integrating the corrected and validated documents, the generator improves accuracy and reduces hallucinations.
  4. Web Search Integration: When the retrieved documents are deemed irrelevant, large-scale web searches are utilized to augment the retrieval results with complementary knowledge from the Internet.
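Putting the four components together, inference can be wired as a single function. This is a minimal sketch under assumed interfaces — the callables (`retrieve`, `evaluate`, `refine`, `web_search`, `generate`) and the thresholds are hypothetical stand-ins for the modules described above:

```python
from typing import Callable

def crag_pipeline(
    query: str,
    retrieve: Callable[[str], list[str]],       # retriever module
    evaluate: Callable[[str, str], float],      # lightweight retrieval evaluator
    refine: Callable[[list[str]], str],         # knowledge refinement
    web_search: Callable[[str], str],           # web search integration
    generate: Callable[[str, str], str],        # generator module
) -> str:
    """Sketch of CRAG inference: score retrieved documents, then choose
    refined retrieval, web search, or a combination before generating."""
    docs = retrieve(query)
    score = max(evaluate(query, d) for d in docs)
    if score > 0.6:                             # "correct"
        knowledge = refine(docs)
    elif score < -0.9:                          # "incorrect"
        knowledge = web_search(query)
    else:                                       # "ambiguous"
        knowledge = refine(docs) + "\n" + web_search(query)
    return generate(query, knowledge)
```

Because every component is passed in as a callable, the sketch also reflects CRAG’s plug-and-play character: any retriever, evaluator, or generator with these interfaces can be swapped in.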

Comparison of CRAG with Other Strategies

In the paper’s experiments across four benchmarks (PopQA, Biography, PubHealth, and Arc-Challenge), plugging CRAG into standard RAG and into Self-RAG improved performance over both baselines, demonstrating its value as a corrective layer rather than a standalone system.

Real-life Applications

CRAG can be integrated into various real-life applications that rely on knowledge-intensive language models, enhancing their robustness and accuracy:

  1. Conversational AI Systems: In virtual assistants, chatbots, and customer service applications, CRAG can ensure that the generated responses are factually accurate and relevant by effectively utilizing retrieved knowledge from external sources.
  2. Question-Answering Systems: CRAG can be employed in question-answering systems across domains such as healthcare, finance, and education, providing accurate and reliable answers by leveraging relevant knowledge from corpora and the web.
  3. Content Generation: In applications like automated writing, report generation, and content creation, CRAG can enhance the quality and factual accuracy of the generated content by incorporating relevant knowledge from external sources.
  4. Fact-checking and Verification: CRAG can be integrated into fact-checking and verification systems, ensuring that the information presented is accurate and supported by relevant knowledge from trusted sources.
  5. Domain-specific Applications: CRAG can be tailored and applied to various domain-specific applications, such as legal document analysis, scientific literature summarization, and medical diagnosis support systems, where accurate knowledge retrieval and utilization are crucial.

Conclusion

The Corrective Retrieval Augmented Generation (CRAG) method addresses the critical challenge of robustness in retrieval-augmented language models by introducing corrective strategies and optimizing the utilization of retrieved knowledge. Through its components — a retrieval evaluator, a knowledge refinement process, and web search integration — CRAG enables automatic self-correction and more efficient use of retrieved information.

With its plug-and-play nature and adaptability to various RAG-based approaches, CRAG offers a promising solution to improve the robustness and performance of language models in knowledge-intensive tasks, mitigating the impact of inaccurate retrieval and reducing the risk of hallucinations.

By seamlessly integrating CRAG into real-time RAG systems, organizations and developers can unlock the full potential of language models, ensuring accurate and reliable generation while leveraging the vast knowledge available from external sources.

Reference

Yan, S.Q., Gu, J.C., Zhu, Y. and Ling, Z.H., 2024. Corrective Retrieval Augmented Generation. arXiv preprint arXiv:2401.15884.
