Voice Agent Engineering — Nik Caryotakis, SuperDial
Takeaway
Voice AI in 2025 wins on vertical integrations and conversational design, not speech-to-speech realism — and the voice AI engineer is a distinct multimodal role.
Summary
- SuperDial automates US healthcare admin phone calls (prior auth, insurance) — customers upload via CSV/API/EHR and get structured answers; the bot escalates to humans when needed.
- Has saved 100K+ human phone hours and is on track for millions in 2025, with only 4 engineers building the full-stack web app, the EHR integrations, and the bot itself.
- Argues speech-to-speech models still aren't production-ready due to unreliable, non-speech outputs; favors reliability over realism via cascaded STT+LLM+TTS pipelines.
- Last-mile work is per-customer script customization, pronunciation handling, async Python plumbing, and dealing with audio hallucinations and real-time latency.
- Insists differentiation lives in conversational content and vertical integrations, not voice realism or interruption handling.
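
The cascaded STT+LLM+TTS approach from the summary can be sketched roughly as below. This is a minimal illustration, not SuperDial's implementation: every name here (`transcribe`, `decide_reply`, `synthesize`, `handle_turn`, `ESCALATE`) is hypothetical, the three stages are stubs standing in for real vendor calls, and the "audio" is plain UTF-8 text so the example runs without any speech libraries. The point is only that each stage hands text to the next, so the output of every step can be inspected, and the bot can fall back to a human when the script has no answer.

```python
import asyncio
from dataclasses import dataclass

# Hypothetical sentinel meaning "hand this call to a human operator".
ESCALATE = "ESCALATE_TO_HUMAN"


@dataclass
class Turn:
    heard: str  # what the STT stage produced
    reply: str  # what the LLM stage decided to say


async def transcribe(audio: bytes) -> str:
    """Stub STT stage: assumes the 'audio' is just UTF-8 encoded text."""
    await asyncio.sleep(0)  # placeholder for a network call
    return audio.decode("utf-8")


async def decide_reply(transcript: str, script: dict[str, str]) -> str:
    """Stub LLM stage: keyword lookup in a per-customer script, else escalate."""
    await asyncio.sleep(0)
    lowered = transcript.lower()
    for keyword, answer in script.items():
        if keyword in lowered:
            return answer
    return ESCALATE


async def synthesize(text: str) -> bytes:
    """Stub TTS stage: encodes text instead of producing real audio."""
    await asyncio.sleep(0)
    return text.encode("utf-8")


async def handle_turn(audio: bytes, script: dict[str, str]) -> tuple[Turn, bytes]:
    """Run one conversational turn through the three stages in sequence."""
    heard = await transcribe(audio)
    reply = await decide_reply(heard, script)
    speech = await synthesize(reply)
    return Turn(heard=heard, reply=reply), speech


if __name__ == "__main__":
    # Made-up script entry, for illustration only.
    demo_script = {"member id": "The member ID is A-000."}
    turn, _ = asyncio.run(handle_turn(b"Can I get the member ID please?", demo_script))
    print(turn)
```

Because the intermediate transcript and reply are ordinary strings, they can be logged, tested, and customized per customer, which is the reliability argument the talk makes for the cascaded design over speech-to-speech models.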
voice, healthcare, agents
Original description
Does your AI voice agent really need to be able to laugh? …Cry? If the answer is no, then you’re probably better off staying a version behind. In 2025, we’ve seen leaps of progress in the Voice AI Stack – particularly with the release of voice-to-voice models (e.g. the OpenAI Realtime API). At Superdial, however, where we’re automating millions of back-office healthcare phone calls, we’ve learned that what’s special about our product isn’t our realistic voices or natural interruption handling – it’s our conversations. By leveraging open source Voice AI orchestration tooling along with the “old school” STT/LLM/TTS sequenced approach, we’re able to build reliable voice agents that navigate phone trees, conduct sensitive healthcare conversations, and learn from human examples. In this talk, you'll learn our blueprint to navigate the Voice AI vendor landscape, avoid common scaling pitfalls, and design conversations that matter.

Recorded live at the Agent Engineering Session Day from the AI Engineer Summit 2025 in New York. Learn more at https://ai.engineer and purchase tickets to our next event, the AI Engineer World's Fair, in SF June 3 - 5 here: https://ti.to/software-3/ai-engineer-worlds-fair-2025

About Nik

Nik is a staff engineer at Superdial, a platform for automating inbound and outbound healthcare phone calls with Voice AI. Superdial's agents call your insurance company, navigate their phone trees, wait on hold, and get all your questions asked, answered, and returned to you via API or EHR integrations. Prior to Superdial, Nik graduated from Stanford University with a BS & MS in Computer Science, where he contributed to research in automated literature mining and competed on the varsity water polo team. Today, Nik is based in NYC and continues to play water polo, but has recently ventured into endurance running events.