Realtime Conversational Video with Pipecat and Tavus — Chad Bailey and Brian Johnson, Daily & Tavus
Takeaway
Production real-time voice and video AI needs a dedicated orchestration layer (Pipecat) on top of the underlying voice or video-generation models, such as Tavus replicas.
Summary
- Real-time conversational video requires three layers: models (STT+LLM+TTS or voice-to-voice), orchestration (Pipecat), and deployment.
- Tavus offers a conversational video interface built around its Sparrow Zero and Raven Zero models. Replicas respond in about 600 ms, which is sometimes too fast, so responses are artificially slowed.
- Pipecat is an open-source, vendor-neutral framework built around frames (typed data chunks), processors (frame transformers), and pipelines (async DAGs that minimize latency).
- Tavus and Pipecat partnered after Tavus rebuilt orchestration in-house and found that Pipecat covered turn detection, response timing, observability, and signals.
- The 'Talk to AIE' button on ai.engineer is powered by Pipecat with Gemini Live; production voice apps need orchestration beyond Gemini's own browser tools.
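The frame, processor, and pipeline idea from the summary can be sketched in plain Python with asyncio. This is a minimal illustration only and does not use Pipecat's actual API: `TextFrame`, `EndFrame`, `Processor`, `SplitWords`, `Uppercase`, and `run_pipeline` are all hypothetical names, and the pipeline here is a simple linear chain rather than a general DAG. Each stage runs as its own task, so a downstream processor can start on one frame while the upstream one handles the next, which is the kind of overlap that keeps latency low.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TextFrame:
    """A typed chunk of data flowing through the pipeline (hypothetical type)."""
    text: str


@dataclass
class EndFrame:
    """Marks the end of the stream so each stage knows when to stop."""


class Processor:
    """Base frame transformer: takes one frame, returns zero or more frames."""

    async def process(self, frame):
        return [frame]  # Default behaviour: pass the frame through unchanged.


class SplitWords(Processor):
    """Splits one TextFrame into one TextFrame per word."""

    async def process(self, frame):
        if isinstance(frame, TextFrame):
            return [TextFrame(word) for word in frame.text.split()]
        return [frame]


class Uppercase(Processor):
    """Upper-cases the text of each TextFrame."""

    async def process(self, frame):
        if isinstance(frame, TextFrame):
            return [TextFrame(frame.text.upper())]
        return [frame]


async def _run_stage(processor, inbox, outbox):
    """Pull frames from inbox, transform them, push results to outbox."""
    while True:
        frame = await inbox.get()
        for out in await processor.process(frame):
            await outbox.put(out)
        if isinstance(frame, EndFrame):
            return


async def run_pipeline(processors, frames):
    """Run processors as concurrent stages linked by queues; return output frames."""
    queues = [asyncio.Queue() for _ in range(len(processors) + 1)]
    tasks = [
        asyncio.create_task(_run_stage(proc, queues[i], queues[i + 1]))
        for i, proc in enumerate(processors)
    ]
    for frame in frames:
        await queues[0].put(frame)
    await queues[0].put(EndFrame())
    await asyncio.gather(*tasks)

    # Drain the final queue, dropping the end-of-stream marker.
    results = []
    while not queues[-1].empty():
        frame = queues[-1].get_nowait()
        if not isinstance(frame, EndFrame):
            results.append(frame)
    return results
```

Calling `asyncio.run(run_pipeline([SplitWords(), Uppercase()], [TextFrame("hello there")]))` pushes one frame through both stages and returns the transformed frames in order. In a real voice or video agent the stages would instead be speech-to-text, an LLM, text-to-speech, and a video transport, but the flow of typed frames between independent processors is the same shape.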
pipecat, tavus, voice
Original description
Tavus shipped the world's first realtime video avatar platform last year. Developers use Tavus' conversational video APIs to create education, social, and customer support agents. The Tavus team built their innovative product using the Pipecat open source framework and Daily's global WebRTC infrastructure. Join us for a technical deep dive into conversational video.
About Chad Bailey: Chad Bailey started his career testing software for the Space Shuttle. After many years of building web apps, he's spent the last several working on real-time communication at Daily. Most recently, he's been building the Pipecat framework, and a series of increasingly ridiculous voice bots to show it off.
About Brian Johnson: Brian Johnson is a Staff Engineer at Tavus, a market-leading generative AI video research company building foundational models and operating systems for human-AI interaction. With a background in electrical engineering and law, he brings decades of experience building and scaling systems across frontend, backend, and ML infrastructure. At Tavus, Brian leads development of real-time AI systems that power lifelike digital humans. His work focuses on combining technical precision with human-centered design to push the boundaries of conversational video AI.
Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/newsletter