Scaling Enterprise-Grade RAG: Lessons from Legal Frontier - Calvin Qi (Harvey), Chang She (Lance)

5.8K views · Jul 29, 2025 · 16:40 min

Takeaway

Frontier enterprise RAG needs tiered, eval-driven development plus a multimodal lakehouse foundation (not just a vector DB) to handle complex, jurisdiction-spanning legal queries.

Summary

Harvey (a legal AI assistant) and LanceDB walk through retrieval at three scales: assistant uploads (1-50 docs), Vaults (deal-room scale), and Data Corpuses (an entire country's legislation, tens of millions of docs).
Complex legal queries mix semantics, implicit date filters, citation IDs (EU 2019/2062, Article 129 CRR), and multi-part jurisdictional reasoning, so a single retrieval method can't handle them.
Eval-driven development relies on tiered evals: expert review (high fidelity, costly), expert-labeled criteria (medium), and automated retrieval precision/recall (fast iteration). There is 'no silver bullet eval'.
Infra needs: per-customer data privacy and retention, telemetry, and fast online queries alongside offline reingestion for ML experiments at tens of millions of docs.
LanceDB positions itself as an 'AI-native multimodal lakehouse' for vector + keyword + metadata search over multimodal data.
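
To make the points above concrete, here is a minimal Python sketch of combining a metadata filter, a keyword pass over citation IDs, and a semantic pass, then scoring the result with automated precision/recall. It is not Harvey's or LanceDB's implementation. Everything in it is an assumption for illustration: the `Doc` fields, the citation regex, the token-overlap stand-in for embedding similarity, the toy corpus, and the use of reciprocal rank fusion (a common fusion method that the talk summary does not name).

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for citation-style IDs such as "2019/2062" or "Article 129".
CITATION_RE = re.compile(r"\b(?:\d{4}/\d+|Article\s+\d+)\b", re.IGNORECASE)


@dataclass
class Doc:
    doc_id: str
    text: str
    year: int
    jurisdiction: str


def keyword_rank(query, docs):
    """Stand-in for sparse search: rank docs by how many query citation IDs they contain."""
    ids = [m.lower() for m in CITATION_RE.findall(query)]
    scored = [(sum(i in d.text.lower() for i in ids), d.doc_id) for d in docs]
    ranked = sorted(scored, key=lambda s: -s[0])  # stable sort keeps corpus order on ties
    return [doc_id for score, doc_id in ranked if score > 0]


def semantic_rank(query, docs):
    """Toy stand-in for dense search: token overlap instead of real embeddings."""
    q = set(re.findall(r"\w+", query.lower()))

    def overlap(d):
        t = set(re.findall(r"\w+", d.text.lower()))
        return len(q & t) / (len(q | t) or 1)

    return [d.doc_id for d in sorted(docs, key=overlap, reverse=True)]


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) for each doc across all rankings."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def search(query, docs, jurisdiction=None, min_year=None):
    """Apply metadata filters first, then fuse the keyword and semantic rankings."""
    pool = [
        d for d in docs
        if (jurisdiction is None or d.jurisdiction == jurisdiction)
        and (min_year is None or d.year >= min_year)
    ]
    return rrf([keyword_rank(query, pool), semantic_rank(query, pool)])


def precision_recall_at_k(retrieved, relevant, k):
    """Automated eval tier: precision and recall of the top-k results against labels."""
    top = retrieved[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented toy corpus; the texts are placeholders, not real legal content.
docs = [
    Doc("a", "toy note citing Article 129 CRR on bond exposures", 2013, "EU"),
    Doc("b", "toy note citing 2019/2062 on bond rules", 2019, "EU"),
    Doc("c", "toy note on employment contracts", 2021, "EU"),
    Doc("d", "toy note citing Article 129 of an unrelated US statute", 2020, "US"),
]
query = "How does 2019/2062 interact with Article 129 on bond exposures?"

results = search(query, docs, jurisdiction="EU")
precision, recall = precision_recall_at_k(results, relevant={"a", "b"}, k=2)
print(results, precision, recall)
```

The jurisdiction filter keeps the US document out even though it matches "Article 129", which is the kind of case where keyword or semantic search alone would return a misleading hit. In a real system the two ranking functions would be replaced by an actual full-text index and embedding search, and the labeled `relevant` set would come from expert annotation.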

rag · legal · lancedb

Original description

In domains like law, compliance, and tax, building enterprise-grade RAG means very large scale, spiky workloads, a focus on accuracy, and non-negotiable privacy. In this talk, we'll share war stories and battle scars of how Harvey has built the world's most advanced AI agents for the legal profession on top of a highly optimized retrieval architecture. We'll cover how to get better retrieval via both sparse and dense retrieval methods, why domain-specific reranking is essential, and how to handle ambiguity in real-world queries. We'll also touch on how LanceDB's search engine enables this architecture by delivering low-latency, high-throughput retrieval across millions of documents of varying sizes without compromising privacy. This solid foundation enables Harvey to build a product that brings highly accurate answers to hundreds of law firms and professional services firms across 45 countries.

About Chang She
Two decades of building data tools for ML/AI. Pandas co-author. Building LanceDB, the database for multimodal AI.

About Calvin Qi
Calvin works on Retrieval Augmented Generation at Harvey for expert use cases in Legal, Tax, and more.

Recorded at the AI Engineer World's Fair in San Francisco.