
How to build world-class AI products — Sarah Sachs (AI lead @ Notion) & Carlos Esteban (Braintrust)

3.4K views · Jun 27, 2025 · 103:45 min
Takeaway

Teams that build world-class AI products spend most of their engineering time on evals and observability, not prompts; that is how Notion ships fast at consumer scale.

Summary

  • Notion's AI lead reports spending 10% of the time on prompting and 90% on reviewing evals and usage in Braintrust, which she presents as the right balance for shipping reliably at consumer scale (100M+ users).
  • Notion AI predates ChatGPT. It evolved from an inline AI Writer → autofill in databases → multilingual RAG Q&A → enterprise search → deep research with parallel tool use.
  • Free trials force scale: a new feature can have more usage than paid enterprise plans, demanding production-grade observability from day one.
  • Dogfooding inside Notion generates evaluation data; human labelers focus on quality over quantity, especially for fine-tuning insights.
  • Latest 2-week launch: AI meeting notes with speech-to-text (STT) transcription, AI summaries, and action-item extraction into task databases.
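
To make "looking at evals" concrete, here is a minimal sketch of an offline eval loop in plain Python. It is not Notion's pipeline and not the Braintrust SDK. Every name in it (`EvalCase`, `exact_match`, `run_eval`, `toy_task`) is hypothetical, and `toy_task` is a string-matching stand-in for a real model call, loosely modeled on the action-item extraction mentioned above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One labeled example: an input and the output we expect (hypothetical type)."""
    input: str
    expected: str


def exact_match(output: str, expected: str) -> float:
    """Score 1.0 when the normalized output equals the expected text, else 0.0."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    task: Callable[[str], str],
    cases: list[EvalCase],
    scorer: Callable[[str, str], float] = exact_match,
) -> dict:
    """Run the task on every case, score each output, and aggregate the results."""
    rows = []
    for case in cases:
        output = task(case.input)
        rows.append(
            {
                "input": case.input,
                "output": output,
                "score": scorer(output, case.expected),
            }
        )
    mean = sum(row["score"] for row in rows) / len(rows) if rows else 0.0
    return {"mean_score": mean, "rows": rows}


def toy_task(text: str) -> str:
    """Assumed stand-in for a model call: return whatever follows 'TODO:'."""
    _, _, tail = text.partition("TODO:")
    return tail.strip()


CASES = [
    EvalCase("Notes from standup. TODO: send recap", "send recap"),
    EvalCase("TODO: Book room", "book room"),
    EvalCase("No action items here", "none"),
]

report = run_eval(toy_task, CASES)
print(f"mean score: {report['mean_score']:.2f}")

# Reading the failing rows is the part that takes most of the time in practice.
for row in report["rows"]:
    if row["score"] < 1.0:
        print("FAILED:", row["input"], "->", repr(row["output"]))
```

In this sketch, changing the prompt or model only means swapping `toy_task`. The cases and scorer stay fixed, so each run is comparable with the last.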
evals · product · notion
Original description
Join us for a hands-on workshop where you'll learn practical strategies to evaluate AI applications throughout their lifecycle—from initial testing of prompts to ongoing monitoring in production. We’re excited to host Sarah Sachs, AI Lead at Notion, who will share insights into how Notion built their acclaimed Notion AI.

About Carlos Esteban
Carlos Esteban is a Solutions Engineer at Braintrust. Previously, he helped enterprises secure and scale infrastructure at HashiCorp. He’s also a former tennis player turned yoga enthusiast, still auditioning his next full-time sport.

Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/newsletter