"Data readiness" is a Myth: Reliable AI with an Agentic Semantic Layer — Anushrut Gupta, PromptQL
Takeaway
Stop waiting for clean data — couple LLM plan generation with deterministic DSL execution and a steerable agentic semantic layer that learns tribal business knowledge over time.
Summary
- Gupta (PromptQL) argues 'data readiness' is a chronic myth: Snowflake/Databricks (2019), MDM, static semantic layers (2023) and knowledge graphs all fail because business definitions ('at-risk deal', 'active customer', 'quarter') are tribal and change quarterly.
- Cites McKinsey: Fortune 500 companies lose an average of $250M/year to poor data quality. Static semantic layers can't pre-define every edge case.
- Solution is an agentic semantic layer: PromptQL generates a deterministic DSL plan covering retrieval/compute/semantics, executes it in a deterministic runtime (no LLM hallucination at answer-time), and corrects itself when it hits messy data (e.g., realizes 'succeeded' status doesn't exist, actual values are 'paid'/'pending').
- Treats the AI like a day-zero analyst that learns business-specific tribal knowledge as users steer it; the system must be correctable, explainable, and accurate by default.
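
The plan-then-execute loop in the summary above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not PromptQL's actual DSL or runtime: the plan format, `SemanticLayer`, `execute`, `run_with_correction`, and the `steer` callback are all hypothetical names, and the lambda stands in for the correction a user or model would supply after seeing the real column values.

```python
from dataclasses import dataclass, field

# Toy "messy" table: status is stored as 'paid'/'pending', never 'succeeded'.
INVOICES = [
    {"id": 1, "status": "paid", "amount": 120.0},
    {"id": 2, "status": "pending", "amount": 80.0},
    {"id": 3, "status": "paid", "amount": 50.0},
]


class UnknownValueError(Exception):
    """Raised when a plan filters on a value the column never contains."""

    def __init__(self, column, value, observed):
        super().__init__(f"{value!r} not found in column {column!r}")
        self.column, self.value, self.observed = column, value, observed


@dataclass
class SemanticLayer:
    # Hypothetical store of learned mappings: (column, business term) -> stored value.
    synonyms: dict = field(default_factory=dict)

    def learn(self, column, term, actual):
        self.synonyms[(column, term)] = actual

    def resolve(self, column, term):
        return self.synonyms.get((column, term), term)


def execute(plan, rows, layer):
    """Run a plan step by step. No model is called here, so the same
    plan, data, and semantic layer always give the same answer."""
    result = rows
    for step in plan:
        if step["op"] == "filter":
            col = step["column"]
            value = layer.resolve(col, step["equals"])
            observed = sorted({r[col] for r in rows})
            if value not in observed:
                raise UnknownValueError(col, value, observed)
            result = [r for r in result if r[col] == value]
        elif step["op"] == "sum":
            result = sum(r[step["column"]] for r in result)
        else:
            raise ValueError(f"unsupported op: {step['op']!r}")
    return result


def run_with_correction(plan, rows, layer, steer):
    """Execute once; on a value mismatch, ask `steer` for the real value,
    record it in the semantic layer, and retry."""
    try:
        return execute(plan, rows, layer)
    except UnknownValueError as err:
        actual = steer(err.column, err.value, err.observed)
        layer.learn(err.column, err.value, actual)
        return execute(plan, rows, layer)


# A plan as a model might first draft it, using the wrong status value.
plan = [
    {"op": "filter", "column": "status", "equals": "succeeded"},
    {"op": "sum", "column": "amount"},
]
layer = SemanticLayer()
total = run_with_correction(plan, INVOICES, layer, steer=lambda col, val, seen: "paid")
print(total)
```

Because the learned mapping lives in the semantic layer rather than in the plan, a later plan that again says 'succeeded' resolves to 'paid' without any further steering. That is the sense in which the layer accumulates tribal knowledge over time.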
semantic-layer, promptql, reliable-ai
Original description
The rapid progress in LLM capability has not translated to increased reliability for business-critical AI use cases. The root cause? Data is "not ready". Conversational analytics doesn't go beyond the analyst team because it's hard to verify whether the generated queries are actually doing what they are supposed to. RAG-based systems often fail to handle the breadth and depth of real-world use cases because they require a prohibitive amount of preparation and maintenance of an underlying knowledge graph. Agentic AI systems need to hard-code specific workflows to work reliably and end up looking more like software engineering with LLM calls instead of delivering on the promise of truly agentic workflows.

In all of these failure modes, the common culprit is that the planning or reasoning done by the LLM fails to accurately capture the user's intent or the domain's context, a.k.a. the lack of a well-prepared semantic data layer. Enterprise data is siloed and of vastly varying quality, and the perfect "semantic layer" and "metadata" are a moving target. New data is continuously being created, and business definitions are rapidly changing and often entirely on-demand.

In this talk we'll share how you can build a semantic data layer that is maintained entirely by AI, and show (with live examples) how that dramatically improves the reliability of an AI system that needs dynamic access to data. We'll demonstrate how this sufficiently augments existing RAG, text-to-SQL and tool-calling techniques and starts opening the door to reliable AI deployments.

Related links:
https://www.linkedin.com/in/anushrut-gupta/
https://promptql.hasura.io/