RAG and the MongoDB Document Model: Ben Flast

1.6K views · Feb 08, 2025 · 13:12 min

Takeaway

Co-locate embeddings and business data in MongoDB documents so RAG can blend transactional context with vector search via one query language.

Summary

Ben Flast (MongoDB) pitches Atlas Vector Search as a RAG store that lets you put HNSW vector indexes directly on existing JSON/BSON documents, with no separate vector database needed.
The document model is a superset of tabular, key-value, geospatial, and graph data; embeddings live as a field alongside business data, allowing combined transactional and semantic queries.
Vector search supports vectors of up to 4,096 dimensions. The index definition specifies the type, path, number of dimensions, and similarity function; queries use the $vectorSearch aggregation stage with numCandidates (the number of nearest-neighbor candidates the HNSW search considers), limit (the number of results returned), and an optional pre-filter.
'Search Nodes' decouple scaling: vector indexes can run on nodes separate from the transactional primary and secondary nodes, so search compute can be tuned independently.
Integrations with LangChain, LlamaIndex, Microsoft Semantic Kernel, AWS Bedrock, and Haystack (including chat message history, semantic caching, and vector stores) support the pitch of combined RAG over vectors and business data.
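
The index and query shapes summarized above can be sketched in Python as plain dictionaries. This is a minimal illustration, not code from the talk: the sample document, its field names (sku, name, category, price, embedding), the index name "vector_index", and the helper functions are all hypothetical, and the 4-dimensional vector is a toy stand-in for a real embedding. The 4,096 cap is the figure quoted in the summary.

```python
from typing import Any, Dict, List, Optional, Sequence

SIMILARITIES = {"cosine", "dotProduct", "euclidean"}
MAX_DIMENSIONS = 4096  # limit quoted in the summary above

# Hypothetical document: the embedding is one more field next to the
# business data it describes (toy 4-dimensional vector).
product = {
    "sku": "TENT-2P-001",
    "name": "Two-person backpacking tent",
    "category": "camping",
    "price": 249.0,
    "embedding": [0.12, -0.03, 0.88, 0.41],
}


def vector_index_definition(
    path: str,
    num_dimensions: int,
    similarity: str = "cosine",
    filter_paths: Sequence[str] = (),
) -> Dict[str, Any]:
    """Build a vector search index definition as a plain dict."""
    if similarity not in SIMILARITIES:
        raise ValueError(f"unsupported similarity: {similarity}")
    if not 1 <= num_dimensions <= MAX_DIMENSIONS:
        raise ValueError(f"numDimensions must be in 1..{MAX_DIMENSIONS}")
    fields: List[Dict[str, Any]] = [
        {
            "type": "vector",
            "path": path,
            "numDimensions": num_dimensions,
            "similarity": similarity,
        }
    ]
    # Fields referenced by a pre-filter are indexed with type "filter".
    fields.extend({"type": "filter", "path": p} for p in filter_paths)
    return {"fields": fields}


def vector_search_pipeline(
    query_vector: Sequence[float],
    index_name: str,
    path: str,
    num_candidates: int,
    limit: int,
    pre_filter: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline that starts with $vectorSearch."""
    if limit > num_candidates:
        raise ValueError("limit cannot exceed numCandidates")
    stage: Dict[str, Any] = {
        "index": index_name,
        "path": path,
        "queryVector": list(query_vector),
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if pre_filter is not None:
        stage["filter"] = pre_filter
    return [
        {"$vectorSearch": stage},
        # Return business fields together with the similarity score.
        {
            "$project": {
                "_id": 0,
                "name": 1,
                "price": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


index_definition = vector_index_definition(
    "embedding", len(product["embedding"]), filter_paths=["category"]
)
pipeline = vector_search_pipeline(
    query_vector=[0.10, 0.00, 0.90, 0.40],
    index_name="vector_index",
    path="embedding",
    num_candidates=100,
    limit=5,
    pre_filter={"category": {"$eq": "camping"}},
)
```

With a live Atlas cluster, a pipeline like this would typically be passed to a PyMongo collection's aggregate() method. The sketch stops at building the dictionaries so that it runs without a database connection.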

rag · mongodb · vector-search

Original description

In this talk, we will explore the cutting-edge techniques for Retrieval Augmented Generation with MongoDB. We will focus on leveraging Vector Search, specifically Atlas Vector Search, over MongoDB data to improve information retrieval and generation processes.

We will show how to build a RAG system using a Parent Child Retrieval Strategy to enable more efficient and accurate retrieval of relevant information. Additionally, we will show how all of this can be done within the MongoDB document model rather than relying on implementing these relationships in the application layer. And finally, we will introduce the concept of Search Nodes which enable you to serve vector search workloads at scale.

This talk is aimed at developers, ML engineers, and data scientists interested in building AI-powered experiences with RAG. By the end of the session, attendees will have a solid understanding of how Retrieval Augmented Generation, Vector Search, and MongoDB can be leveraged to build innovative and scalable AI-powered applications.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Ben

Ben Flast is a Director of Product at MongoDB focused on Search, Vector Search, and various AI Integrations. He’s been at MongoDB for the past 5 years and is excited about the new wave of real-time AI-powered applications that are emerging. With a deep interest in Large Language Models, Embedding Models, and agentic experiences, Ben loves to stay on the pulse of new and emerging AI capabilities.

When not immersed in the world of AI, Ben enjoys hitting the slopes skiing, playing strategy games, and a bit of city gardening. He’s based in Brooklyn, New York, and is very excited about the number of AI startups popping up in the city.