🔎 RAG
Retrieval-augmented generation — chunking, embeddings, hybrid search, rerankers, citation, evaluation. The dominant pattern for grounding LLMs in private data.
The workflow
flowchart LR
A[User query] --> B[Query rewrite<br/>+ HyDE expansion]
B --> C[Embed]
C --> D[Vector search<br/>top-k]
D --> E[Reranker<br/>cross-encoder]
E --> F[Context window<br/>assembly]
F --> G[LLM answer<br/>with citations]
G --> H[Eval & feedback<br/>loop]
Naive RAG is one embedding + one search. Real production RAG layers query rewriting, hybrid retrieval, and reranking on top.
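One common way to implement the hybrid retrieval layer is reciprocal rank fusion (RRF), which merges a lexical ranking and a vector ranking without needing their scores to be comparable. The sketch below is illustrative only and is not taken from any of the talks listed here; the document ids and both hit lists are hypothetical, and `k=60` is a frequently used default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from two retrievers for the same query.
lexical_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a BM25 index
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. from an embedding index

fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents found by both retrievers rise to the top, and the fused list is what a reranker would then rescore before context assembly.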
Key takeaways
Videos (48)
Building Production-Ready RAG Applications: Jerry Liu
Naive top-k vector retrieval rarely survives production; treat RAG as a tunable pipeline that you evaluate component-by-component before adding complexity.
GraphRAG: The Marriage of Knowledge Graphs and RAG: Emil Eifrem
GraphRAG adds knowledge-graph traversal on top of vector retrieval to materially boost RAG accuracy and unlock multi-hop questions that baseline vector RAG cannot answer.
RAG Agents in Prod: 10 Lessons We Learned — Douwe Kiela, creator of RAG
Win enterprise RAG by treating the system (not the model) as the product, specializing aggressively on noisy proprietary data, and shipping iteratively from day one.
The Future of Knowledge Assistants: Jerry Liu
Move from naive RAG to agentic, multi-agent knowledge assistants built on high-quality parsing and tool-using LLM orchestration.
Practical GraphRAG: Making LLMs smarter with Knowledge Graphs — Michael, Jesus, and Stephen, Neo4j
For enterprise RAG, build a knowledge graph alongside vectors — combining lexical, domain, and graph algorithms produces more grounded, explainable answers.
When Vectors Break Down: Graph-Based RAG for Dense Enterprise Knowledge - Sam Julien, Writer
For dense enterprise corpora, graph-augmented RAG with custom graph-extraction models outperforms vector-only retrieval where similar-sounding documents must be disambiguated.
Agentic GraphRAG: AI's Logical Edge — Stephen Chin, Neo4j
Pair agents with Neo4j-style knowledge graphs so reasoning is grounded in structured facts rather than LLM extrapolation.
Intro to GraphRAG — Zach Blumenfeld
Even a simple knowledge graph plus LangGraph agent gives you accurate, explainable retrieval that beats raw vector search for structured-domain questions.
Anchoring Enterprise GenAI with Knowledge Graphs: Jonathan Lowe (Pfizer), Stephen Chin (Neo4j)
Enterprise GenAI succeeds when knowledge graphs anchor retrieval and when builders translate executive purpose statements into specific GenAI bets.
How Codeium Breaks Through the Ceiling for Retrieval: Kevin Hou
For production code retrieval, replace generic embedding benchmarks with product-derived multi-needle evals like recall@50 — and own your retrieval stack end-to-end.
Knowledge Graphs & GraphRAG: Techniques for Building Effective GenAI Applications: Zach Blumenfeld
GraphRAG = vector search + graph traversal + graph embeddings; the combo outperforms naive vector RAG for personalized recommendation tasks.
HybridRAG: A Fusion of Graph and Vector Retrieval - Mitesh Patel, NVIDIA
Fusing LLM-built knowledge graphs with vector search yields richer multi-hop retrieval than either alone on enterprise documents.
Building a Smarter AI Agent with Neural RAG - Will Bryk, Exa.ai
AI agents need a web-search engine designed for them — high-throughput, query-rich, embedding-native — not Google retrofitted onto an LLM.
Layering every technique in RAG, one query at a time - David Karam, Pi Labs (fmr. Google Search)
Improve RAG by inspecting failing real queries and adding only the technique that fixes that class, layer by layer.
Going beyond RAG: Extended Mind Transformers - Phoebe Klett
Building retrieval into attention rather than bolting it on as RAG yields fine-grained causal citations and avoids the long-context-fine-tuning quality penalty.
Agentic Search for Context Engineering — Leonie Monigatti, Elastic
Context engineering is mostly about giving an agent the right set of search tools across heterogeneous sources rather than perfecting a single retrieval pipeline.
Agentic GraphRAG: Simplifying Retrieval Across Structured & Unstructured Data — Zach Blumenfeld
For aggregation, similarity, and relationship questions, give your agent a knowledge graph and a Cypher-generating MCP tool, not just a vector index.
OpenRAG: An open-source stack for RAG — Phil Nash
OpenRAG gives a production-grade open-source RAG baseline (Docling + OpenSearch/JVector + LangFlow) with agentic retrieval out of the box.
The RAG Stack We Landed On After 37 Fails - Jonathan Fernandes
A reliable on-prem RAG stack is LlamaIndex + Qdrant + open BAAI/NVIDIA embeddings + Llama/Qwen served via Ollama or TGI, with tracing built in.
Forget RAG Pipelines—Build Production Ready Agents in 15 Mins: Nina Lopatina, Rajiv Shah, Contextual
Treat RAG like a managed service with modular components rather than hand-assembling extractor/embedder/reranker/vector-DB pipelines.
How to look at your data — Jeff Huber (Chroma) + Jason Liu (567)
Build cheap golden-set fast-evals from real and synthetic queries, then look at conversation transcripts to find the implicit feedback users already give.
Building Alice's Brain: an AI Sales Rep that Learns Like a Human - Sherwood & Satwik, 11x
For a vertical AI agent like an SDR, flipping context flow from seller-push to agent-pull via a multi-modal RAG knowledge base eliminates onboarding friction and improves email quality.
Context Engineering: Connecting the Dots with Graphs — Stephen Chin, Neo4j
Layer knowledge graphs on top of vector RAG so LLMs can traverse relationships, retrieve community structure and maintain typed long/short-term memory — not just semantic neighbors.
RAG for VPs of AI: Jerry Liu
Enterprise RAG success hinges on a dedicated data-processing stack and a bet on in-house developers, with parsing quality (e.g., LlamaParse) being the single biggest lever against hallucinations.
Scaling Enterprise-Grade RAG: Lessons from Legal Frontier - Calvin Qi (Harvey), Chang She (Lance)
Frontier enterprise RAG needs tiered eval-driven development plus a multimodal lakehouse foundation (not just a vector DB) to handle complex jurisdiction-spanning legal queries.
VoiceVision RAG - Integrating Visual Document Intelligence with Voice Response — Suman Debnath, AWS
Vision-based document retrieval skips fragile OCR pipelines and pairs naturally with voice interfaces for complex enterprise documents.
RAG in 2025: State of the Art and the Road Forward — Tengyu Ma, MongoDB (acq. Voyage AI)
In 2025, RAG with high-quality domain-specific embeddings, rerankers, and contextual enrichment remains the most cost-effective and governable way to ground LLMs in proprietary data.
Navigating RAG Optimization with an Evaluation Driven Compass: Atita Arora and Deanna Emery
Optimizing RAG is impossible without an evaluation-driven loop tracking retrieval quality and faithfulness; choose metrics for your domain's tolerance for hallucination.
Information Retrieval from the Ground Up - Philipp Krenn, Elastic
Solid lexical analysis (tokenization, stemming, offsets) plus hybrid vector retrieval beats pure vector search for production RAG.
Knowledge Graphs in Litigation Agents — Tom Smoker, WhyHow
For litigation and other relation-heavy domains, schematized knowledge graphs beat vector RAG because the value is in explicit multi-hop connections.
Disrupting the $15 Trillion Construction Industry with Autonomous Agents: Dr. Sarah Buchner
Horizontal RAG is commoditized; the value is in vertical agents that surface discrepancies in highly unstructured trillion-dollar-scale industry data.
LLM Scientific Reasoning: How to Make AI Capable of Nobel Prize Discoveries: Hubert Misztela
Scientific discovery RAG needs reasoning before retrieval — decompose the question, restructure the data into graphs, and pick a reasoning type that matches question complexity.
GraphRAG methods to create optimized LLM context windows for Retrieval — Jonathan Larson, Microsoft
Graph-structured memory turns RAG from snippet retrieval into repository-scale reasoning that survives multi-file edits like adding jump mechanics to Doom.
Retrieval Augmented Generation in the Wild: Anton Troynikov
Production RAG needs dynamic, feedback-driven memory plus smart chunking/relevance — vector search alone is just the open-loop baseline.
Wisdom-Driven Knowledge Augmented Generation at Scale - Chin Keong Lam, Patho AI
For expert-advisor AI systems, model 'wisdom' as a feedback loop over knowledge/experience/insight/situation in a knowledge graph rather than relying on flat RAG retrieval.
The Knowledge Graph Mullet: Trimming GraphRAG Complexity - William Lyon
Hybridizing property-graph ergonomics with RDF-triple storage (Dgraph) lets GraphRAG combine vector, geospatial, and image entry points into a single knowledge graph.
Why Your Agent's Brain Needs a Playbook: Practical Wins from Using Ontologies - Jesús Barrasa, Neo4j
Ontology-driven graphs give GraphRAG pipelines a shared schema that produces better extraction, retrieval, and structured query generation than ad-hoc per-pipeline schemas.
RAG at scale: production ready GenAI apps with Azure AI Search
Production RAG breaks on different axes than prototypes — hybrid retrieval, scale tiers, and semantic ranking in Azure AI Search are aimed at closing that gap.
The Hidden Costs of Building Your Own RAG Stack — Ofer Mendelevitch, Vectara
Building a production-grade RAG stack is far more than a vector DB + LLM — quality, security, latency and vendor management make 'rag-as-a-service' an attractive alternative.
RAG Evaluation Is Broken! Here's Why (And How to Fix It) - Yuval Belfer and Niv Granot
Standard chunk-retrieve-rerank RAG collapses on aggregative queries; structured RAG that builds per-corpus SQL schemas during ingestion is a practical fix.
Enterprise Deep Research: The Next Killer App for Enterprise AI — Ofer Mendelevitch, Vectara
The deep-research pattern applied to private enterprise data, with strong hallucination detection, is Vectara's bet for the next killer enterprise AI app.
Graph Intelligence: Enhance Reasoning and Retrieval Using Graph Analytics - Alison & Andreas, Neo4j
Layering graph analytics over vector RAG adds the structural context single-document retrieval misses, especially for relationship-heavy enterprise data.
EyeLevel Launch: Your RAG is Tripping, Here's the Real Reason Why
RAG fails on enterprise docs because of ingestion, not retrieval — vision-model layout parsing plus semantic objects beat vector chunking for complex documents.
Build, Evaluate and Deploy a RAG-Based Retail Copilot with Azure AI: Cedric Vidal and David Smith
Azure AI Search + Cosmos DB + Azure OpenAI is a workable production-RAG stack for retail copilots, and Microsoft ships a turnkey workshop to build one.
Data is Your Differentiator: Building Secure and Tailored AI Systems — Mani Khanuja, AWS
Match your data pipeline (Bedrock Data Automation + Knowledge Bases + Guardrails) to the specific GenAI use case — there's no one-size-fits-all RAG architecture for enterprise data.
Cohere for VPs of AI: Vivek Muppalla
Cohere differentiates on enterprise-grade embeddings + rerank, built-in citations, customization and any-cloud/on-prem deployment rather than chasing the biggest base model.
Understanding AI Stakes to Break Production Code: Philip Rathle
Match your retrieval architecture (vector → graph) to the stakes of the use case, and let LLMs orchestrate deterministic tools when the answer must be exact.
Building efficient hybrid context query for LLM grounding: Simrat Hanspal
Use a permissions-aware GraphQL data layer like Hasura to make RAG retrieval hybrid (semantic + structured) and to guard against injection through LLM-generated queries.
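Several of the summaries above recommend evaluating retrieval on a golden set with metrics such as recall@50. A minimal sketch of recall@k follows, assuming a golden set of (query, relevant ids) pairs; `retrieve`, the `canned_results` table, and all ids are hypothetical stand-ins for a real retriever and real labels.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant set must not be empty")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_at_k(golden_set, retrieve, k):
    """Average recall@k over (query, relevant_ids) pairs."""
    scores = [recall_at_k(retrieve(q), rel, k) for q, rel in golden_set]
    return sum(scores) / len(scores)


# Hypothetical retriever: a lookup table standing in for a real search call.
canned_results = {
    "q1": ["a", "x", "b"],
    "q2": ["y", "z", "c"],
}


def retrieve(query):
    return canned_results[query]


# Multi-needle golden set: q1 has two relevant documents, q2 has one.
golden_set = [("q1", {"a", "b"}), ("q2", {"c"})]

print(mean_recall_at_k(golden_set, retrieve, k=2))  # 0.25
print(mean_recall_at_k(golden_set, retrieve, k=3))  # 1.0
```

Tracking this number per pipeline stage (before and after reranking, for instance) shows which component a change actually helped.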