What is RAG? Definition and business implications

RAG (Retrieval-Augmented Generation) is an AI architecture that pairs a search engine over your own documents with a generative model. The model answers by relying on citable business data rather than on its training knowledge alone.

RAG operates in three steps that are invisible to the end user. First, your documents (PDFs, web pages, knowledge base) are split into passages and converted into numerical vectors, that is, mathematical coordinates representing their meaning, then stored in a vector database. Second, when a user asks a question, the system retrieves the passages semantically closest to the query. Third, these passages are injected into the prompt sent to the LLM, which composes an answer that explicitly relies on these sources.

The practical consequence is significant. The model no longer relies on its training memory alone; on every query it reads the relevant documents you supply. This is what sets RAG apart from fine-tuning, which modifies the model's internal parameters. RAG leaves the model intact and changes only what it has in view at the moment of responding.
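The three steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the `embed` function is a toy bag-of-words stand-in for a real embedding model, a plain list plays the role of the vector database, the file names and policy texts are hypothetical, and the final call to an LLM is left out (the sketch stops at the prompt that would be sent).

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Step 1: split each document into passages (one per line) and embed them.

    The returned list is an in-memory substitute for a vector database.
    """
    index = []
    for name, text in documents.items():
        for line in text.splitlines():
            passage = line.strip()
            if passage:
                index.append((name, passage, embed(passage)))
    return index


def retrieve(index, question: str, k: int = 2) -> list[tuple[str, str]]:
    """Step 2: return the k passages most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(query, entry[2]), reverse=True)
    return [(name, passage) for name, passage, _ in ranked[:k]]


def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Step 3: inject the retrieved passages into the prompt sent to the LLM."""
    sources = "\n".join(
        f"[{i}] ({name}) {passage}" for i, (name, passage) in enumerate(passages, 1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Hypothetical documents, for illustration only.
DOCUMENTS = {
    "refund-policy.txt": "Refunds are issued within 14 days of purchase.\n"
                         "Gift cards cannot be refunded.",
    "shipping.txt": "Standard shipping takes 3 to 5 business days.",
}

index = build_index(DOCUMENTS)
question = "Can I get a refund within 14 days of purchase?"
passages = retrieve(index, question, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

Because each retrieved passage carries its file name into the prompt, the answer can cite where it came from. A real system would swap `embed` for a learned embedding model so that passages match on meaning rather than on shared words.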

Concrete example

A 2025 study published in the Journal of Medical Internet Research compared chatbots with and without RAG on oncology questions. Conventional chatbots, with no access to official sources, hallucinated in about 40% of their responses. Chatbots equipped with RAG drawing on the U.S. Cancer Information Service brought that rate down to 19% for GPT-4 and 35% for GPT-3.5. RAG does not eliminate hallucination: in this configuration it roughly halves the rate for GPT-4, while the gain for GPT-3.5 is much smaller. Each answer cites the source used, which is the prerequisite for any after-the-fact audit.
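To make the size of the effect explicit, the relative reduction can be computed directly from the rates quoted above. This is a small arithmetic sketch: the three percentages are the ones cited in this example, and the function name `relative_reduction` is purely illustrative.

```python
def relative_reduction(baseline: float, with_rag: float) -> float:
    """Share of the baseline hallucination rate that RAG removes."""
    return (baseline - with_rag) / baseline


BASELINE = 0.40  # conventional chatbots, as quoted above
WITH_RAG = {"GPT-4": 0.19, "GPT-3.5": 0.35}  # RAG-equipped chatbots, as quoted above

reductions = {model: relative_reduction(BASELINE, rate) for model, rate in WITH_RAG.items()}
for model, value in reductions.items():
    print(f"{model}: {value:.1%} fewer hallucinated responses")
# GPT-4: 52.5% fewer hallucinated responses
# GPT-3.5: 12.5% fewer hallucinated responses
```

On these figures the gain depends strongly on the underlying model, so the headline improvement should always be read together with the model it was measured on.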

Sources

  1. Reducing Hallucinations in Generative AI Chatbots for Cancer Information, JMIR / PMC, 2025. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12425422/ (accessed 2026-05-24)
  2. Moffatt v. Air Canada, British Columbia Civil Resolution Tribunal, February 2024. https://www.canlii.org/en/bc/bccrt/doc/2024/2024bccrt149/2024bccrt149.html (accessed 2026-05-24)
