Retrieval-augmented generation (RAG)

Combines a retriever over documents or tools with a generator LLM so that answers can draw on, and cite, context that is private or more recent than the model's training data.

In natural language processing (NLP) pipelines, RAG retrieves relevant chunks from an index, vector database, or corpus and conditions the LLM's decoding on that context, typically by placing the chunks in the prompt. This reduces reliance on memorized parametric knowledge alone.
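
The retrieve-then-condition flow can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a toy bag-of-words cosine score in place of a real embedding model or vector database, and the names `retrieve`, `build_prompt`, `answer`, the `generate` callable, and the sample `CHUNKS` are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return up to k chunks with the highest non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), i) for i, c in enumerate(chunks)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, stable on ties
    return [chunks[i] for score, i in scored[:k] if score > 0]


def build_prompt(query, retrieved):
    """Place numbered chunks in the prompt so the answer can cite them."""
    context = "\n".join(f"[{n}] {c}" for n, c in enumerate(retrieved, 1))
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


def answer(query, chunks, generate, k=2):
    """generate is an assumed callable (prompt -> text) wrapping some LLM."""
    return generate(build_prompt(query, retrieve(query, chunks, k)))


# Hypothetical corpus, made up for illustration only.
CHUNKS = [
    "The refund window for annual plans is 30 days from purchase.",
    "Support is available by email on weekdays.",
    "Monthly plans can be cancelled at any time without a refund.",
]
```

Calling `answer(question, CHUNKS, generate)` with any function that maps a prompt string to text completes the loop. In practice the scoring step would usually be replaced by dense embeddings or a search index, while the prompt-building step stays much the same.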

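Trade-offs: retrieval quality, added latency, prompt fit (whether the retrieved text fits the model's context window), attribution of claims to sources, and indirect prompt injection through retrieved text. Grounded answers still need human review for high-stakes facts.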
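
Two of these trade-offs, prompt fit and attribution, can be illustrated with a small packing step. The sketch below is one possible approach under stated assumptions: `pack_context` is a hypothetical helper, word count stands in for a real tokenizer's token count, and the `<source>` wrapper is an illustrative convention rather than a standard. Delimiting retrieved text this way helps the generator cite sources and treat the text as data, but it does not by itself prevent indirect prompt injection.

```python
def pack_context(chunks, max_words):
    """Pack ranked (source_id, text) pairs into a word budget.

    chunks is assumed to be ordered best first. Word count is a rough
    stand-in for tokens; a real pipeline would use the model's tokenizer.
    """
    packed, used = [], 0
    for source_id, text in chunks:
        n = len(text.split())
        if used + n > max_words:
            continue  # skip a chunk that would overflow; a later one may fit
        # Label each chunk with its source so answers can be attributed.
        packed.append(f"<source id={source_id!r}>\n{text}\n</source>")
        used += n
    return "\n".join(packed), used
```

Skipping an oversized chunk instead of stopping lets smaller, lower-ranked chunks fill the remaining budget; stopping at the first overflow would be the stricter alternative when rank order matters more than coverage.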