EyeLevel Launch: Your RAG is Tripping, Here's the Real Reason Why

791 views · Feb 06, 2025 · 6:12 min

Takeaway

RAG fails on enterprise docs because of ingestion, not retrieval: vision-model layout parsing plus semantic objects beat vector chunking for complex documents.

Summary

EyeLevel claims 98% accuracy on complex enterprise docs, outperforming popular RAG approaches by up to 120%. It does not use vector DBs at all, relying instead on 'semantic objects' with multi-field search.
RAG errors (up to 35% hallucination on enterprise docs) come from content ingestion, not LLMs or prompts: bad text extraction, lost surrounding context after chunking, and unextracted visual elements.
Ingestion pipeline: a fine-tuned vision model identifies table, image, and text regions, then dedicated multimodal pipelines extract each region type, which preserves cross-chunk context.
Air France uses it for a call-center agent copilot over hundreds of thousands of complex internal docs.
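
The ingestion and search ideas summarized above can be illustrated with a minimal sketch. This is not EyeLevel's implementation, which the talk summary does not describe in detail. Everything here is an assumption made for illustration: the `Region` and `SemanticObject` shapes, the `ingest` and `search` functions, the field weights, and the use of plain keyword overlap as a stand-in for whatever multi-field search the product actually uses. The vision model is stubbed out by regions that arrive already labeled.

```python
import re
from dataclasses import dataclass

@dataclass
class Region:
    """A layout region as a vision model might report it (hypothetical shape)."""
    kind: str     # "text", "table", or "figure"
    section: str  # heading the region appeared under
    raw: str      # stand-in for the region's raw content

@dataclass
class SemanticObject:
    """One retrievable unit that carries its surrounding context as fields."""
    doc_title: str
    section: str
    kind: str
    body: str

def extract_text(raw: str) -> str:
    return raw.strip()

def extract_table(raw: str) -> str:
    # Turn pipe-delimited rows into "header: value" phrases so that each
    # cell stays attached to its column header after extraction.
    rows = [line.split("|") for line in raw.strip().splitlines()]
    header, data = rows[0], rows[1:]
    return "; ".join(
        ", ".join(f"{h}: {v}" for h, v in zip(header, row)) for row in data
    )

def extract_figure(raw: str) -> str:
    # A real pipeline would caption the image; here the caption is given.
    return f"Figure description: {raw.strip()}"

# One dedicated extractor per region type.
EXTRACTORS = {"text": extract_text, "table": extract_table, "figure": extract_figure}

def ingest(doc_title: str, regions: list[Region]) -> list[SemanticObject]:
    """Route each region to its extractor and keep title and section context."""
    return [
        SemanticObject(doc_title, r.section, r.kind, EXTRACTORS[r.kind](r.raw))
        for r in regions
    ]

# Assumed weights: a match in the body counts more than one in the title.
FIELD_WEIGHTS = {"doc_title": 1.0, "section": 2.0, "body": 3.0}

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(query: str, obj: SemanticObject) -> float:
    """Weighted count of query terms found in each field of the object."""
    q = tokens(query)
    return sum(w * len(q & tokens(getattr(obj, f))) for f, w in FIELD_WEIGHTS.items())

def search(query: str, objects: list[SemanticObject], k: int = 1) -> list[SemanticObject]:
    return sorted(objects, key=lambda o: score(query, o), reverse=True)[:k]
```

Calling `ingest("Baggage policy", regions)` on a list of labeled regions and then `search("international baggage fee", objects)` ranks objects by matches across title, section, and body. A table whose section heading and cells both match the query can therefore outrank a plain-text chunk that only shares the document title.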

rag · document-parsing · multimodal

Original description

95% of RAG hallucinations are generated by the RAG, not the LLM. But why? In this talk, we'll discuss the hard data engineering problems of building highly accurate RAG systems and how to fix them. You'll see how companies like Air France are getting 95% accuracy or better.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Benjamin
Over a two-decade career, Ben has designed CMOS chips for quantum computing, developed VPN architectures for mobile, and pioneered consumer AI at IBM Watson and the Weather Channel. He has 21 patents and a PhD in physics. In 2019, he launched EyeLevel.ai to help connect the world’s private data to LLMs.