Why should anyone care about Evals? — Manu Goyal, Braintrust

13.1K views · Jun 27, 2025 · 5:41 min

Takeaway

Evals are the laboratory that lets you iterate offline and turn production traffic into the next training set. Without them, you ship blind.

Summary

  • Manu's time working on self-driving cars taught him that you can tune a model forever, but you need evals to know whether it is ready to ship
  • Evals are more than unit tests: without them, the only feedback comes from production, which is expensive, slow, and risky
  • Investing in evals builds a 'laboratory' where 90% of the product iteration loop happens before prod, enabling faster, more confident shipping
  • Applying offline eval metrics to online production data identifies which examples are most valuable for the next training loop — closing the data flywheel
  • Braintrust offers a platform for evals plus prompt tweaking, observability, and dataset management; Kevin Weil, Garry Tan, Mike Krieger, and Greg Brockman have publicly endorsed evals
evals · observability · data-flywheel

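The data-flywheel point in the summary can be made concrete with a small sketch. This is not Braintrust's API or the method shown in the talk; `LoggedCall`, `toy_metric`, and `select_for_dataset` are hypothetical names, and the metric is a deliberately trivial stand-in for a real offline scorer. The idea is only that the same scoring function used offline is run over production logs, and the lowest-scoring examples are pulled into the next eval or training dataset.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoggedCall:
    """One production request/response pair (hypothetical log record)."""
    prompt: str
    output: str


def toy_metric(call: LoggedCall) -> float:
    """Stand-in offline metric: 0.0 for empty or refusing outputs, else 1.0."""
    text = call.output.strip().lower()
    if not text or "cannot help" in text:
        return 0.0
    return 1.0


def select_for_dataset(
    logs: List[LoggedCall],
    scorer: Callable[[LoggedCall], float],
    threshold: float = 0.5,
    limit: int = 100,
) -> List[LoggedCall]:
    """Score production logs with an offline metric and keep the failures."""
    scored = [(scorer(call), call) for call in logs]
    failures = [(score, call) for score, call in scored if score < threshold]
    failures.sort(key=lambda pair: pair[0])  # lowest score first; ties keep log order
    return [call for _, call in failures[:limit]]


if __name__ == "__main__":
    logs = [
        LoggedCall("a", "Sure, here is the answer."),
        LoggedCall("b", "Sorry, I cannot help with that."),
        LoggedCall("c", ""),
    ]
    for call in select_for_dataset(logs, toy_metric):
        print(call.prompt)
```

In a real setup the scorer would be whatever metric the offline evals already use, and the selected examples would be reviewed before being added to a dataset.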
Original description
An introduction to the evals track

About Manu Goyal
Manu Goyal is the founding engineer at Braintrust. Previously, he developed autonomous systems at Nuro. He has an 8-year-old Pomeranian named Hendrix.

Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/newsletter