OpenRAG: An open-source stack for RAG — Phil Nash
Takeaway
OpenRAG provides a production-grade, open-source RAG baseline (Docling + OpenSearch/JVector + Langflow) with agentic retrieval out of the box.
Summary
- IBM's OpenRAG bundles three OSS projects: Docling (document parsing), OpenSearch (hybrid vector + keyword search), and Langflow (visual agent orchestration).
- Docling offers several pipelines: a simple one (Markdown/HTML), ASR for audio/video, a standard PDF pipeline (small models for layout, tables, and images), and a VLM pipeline that uses granite-docling-258M and emits XML-like 'doc tags' (DocTags) markup.
- Retrieval uses OpenSearch with the JVector k-NN plugin (DiskANN-based) by default, which allows live indexing without keeping the full index in RAM.
- The generation side uses agentic retrieval: the LLM decides which searches to run via tools, rather than relying on a single top-K retrieval.
- The whole stack can run fully offline (air-gapped) with Ollama embeddings; Langflow supports OpenAI, Anthropic, and Ollama LLMs.
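As a rough illustration of the agentic retrieval idea in the summary above, the sketch below contrasts a single top-K lookup with a loop in which a planner chooses which searches to run. Everything in it is hypothetical: the toy corpus, `search_tool`, `stub_planner`, and `agentic_retrieve` are illustrative names, the planner is a hard-coded stand-in for an LLM issuing tool calls, and none of it is OpenRAG's, OpenSearch's, or Langflow's actual API.

```python
from typing import List, Optional, Set

# Toy corpus standing in for an indexed document store (hypothetical data).
CORPUS = {
    "doc1": "Docling parses PDF files into structured documents",
    "doc2": "OpenSearch supports hybrid keyword and vector search",
    "doc3": "Langflow orchestrates agents with a visual editor",
}


def search_tool(query: str, k: int = 1) -> List[str]:
    """Toy retrieval tool: rank documents by the number of words shared with the query."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & set(text.lower().split())), doc_id)
        for doc_id, text in CORPUS.items()
    ]
    # Highest overlap first; ties are broken by document id for determinism.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0][:k]


def stub_planner(question: str, seen: Set[str]) -> Optional[str]:
    """Stand-in for the LLM: return the next search query, or None to stop.

    Assumption: a real agent would let the model pick tool calls; here the
    choice is a hard-coded scan for known topic names in the question.
    """
    lowered = question.lower()
    for topic in ("docling", "opensearch", "langflow"):
        if topic in lowered and topic not in seen:
            return topic
    return None


def agentic_retrieve(question: str, max_steps: int = 5) -> List[str]:
    """Run planner-chosen searches until the planner stops or max_steps is hit."""
    seen_queries: Set[str] = set()
    context: List[str] = []
    for _ in range(max_steps):
        query = stub_planner(question, seen_queries)
        if query is None:
            break
        seen_queries.add(query)
        for doc_id in search_tool(query):
            if doc_id not in context:
                context.append(doc_id)
    return context


if __name__ == "__main__":
    question = "How do Docling and OpenSearch fit together?"
    print("single top-1:", search_tool(question, k=1))
    print("agentic:     ", agentic_retrieve(question))
```

With this toy data, the single top-1 search on the whole question returns only the OpenSearch document, while the loop issues one focused search per topic and collects both the Docling and OpenSearch documents. That is the trade-off the summary points at: more searches, but each one is targeted.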
Tags: rag, docling, opensearch
Original description
There are many variables in building RAG applications, from document parsing to the language model you pick for generation and everything in between. Combining Docling for document parsing, OpenSearch for retrieval, and Langflow for orchestration, plus local and remote models, OpenRAG is an opinionated, agentic, open-source stack for building the RAG application of your dreams. Just because it has opinions doesn't make it inflexible though.
In this talk we'll look at how OpenRAG gives you a great baseline for RAG and how you can tune it and evaluate the outcomes to create RAG applications that work well with your data. You'll learn how to get the best out of your documents with Docling, how OpenSearch provides more than just vector search, and how Langflow makes it easy to customise your pipeline to interact with your data the way you want to. You’ll leave with a playbook of options to improve your RAG app and a stack you can extend without reinventing everything.
Phil Nash - Developer relations engineer, IBM
Phil is a developer relations engineer for DataStax and Google Developer Expert living in Melbourne, Australia. He's been working in developer relations for a decade, speaking at conferences since 2012, and writing JavaScript since before jQuery. Away from the keyboard, Phil enjoys travel, live music, and hanging out with his mini sausage dog, Ruby.
Socials:
https://x.com/philnash
https://linkedin.com/in/philnash
https://philna.sh
https://github.com/philnash