Why should anyone care about Evals? — Manu Goyal, Braintrust
Takeaway
Evals are the laboratory that lets you iterate offline and turn production traffic into the next training set — without them you ship blind.
Summary
- Manu's work on self-driving cars taught him that you can tune a model forever, but you need evals to know whether it is ready to ship
- Evals are not just unit tests: without them, the only feedback comes from production, which is expensive, slow, and risky
- Investing in evals builds a 'laboratory' where 90% of the product iteration loop happens before prod, enabling faster, more confident shipping
- Applying offline eval metrics to online production data identifies which examples are most valuable for the next training loop — closing the data flywheel
- Braintrust offers a platform for evals plus prompt tweaking, observability, and dataset management; Kevin Weil, Garry Tan, Mike Krieger, and Greg Brockman have publicly endorsed evals
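
The flywheel in the fourth point can be sketched in a few lines. This is an illustrative sketch only, not Braintrust's API: `Example`, `not_a_refusal`, `run_offline_eval`, and `select_for_dataset` are hypothetical names, and the refusal check is a toy stand-in for whatever metric a team actually trusts. The point is that one scorer serves both roles: it grades a curated dataset offline, and it ranks production logs so the worst-scoring ones become the next dataset entries.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """One model interaction: a curated test case or a production log entry."""
    input: str
    output: str


# A scorer maps an example to a score in [0, 1]; higher is better.
Scorer = Callable[[Example], float]


def not_a_refusal(example: Example) -> float:
    """Hypothetical reference-free scorer: 0.0 for empty or refusing outputs."""
    text = example.output.strip().lower()
    if not text or text.startswith("i'm sorry"):
        return 0.0
    return 1.0


def run_offline_eval(dataset: List[Example], scorer: Scorer) -> float:
    """Offline 'laboratory' step: mean score over a curated dataset."""
    scores = [scorer(ex) for ex in dataset]
    return sum(scores) / len(scores) if scores else 0.0


def select_for_dataset(
    logs: List[Example], scorer: Scorer, threshold: float = 0.5, limit: int = 10
) -> List[Example]:
    """Online step: apply the same scorer to production logs and keep the
    lowest-scoring ones as candidates for the next eval/training set."""
    scored = [(scorer(ex), ex) for ex in logs]
    failing = [pair for pair in scored if pair[0] < threshold]
    failing.sort(key=lambda pair: pair[0])  # worst first; stable for ties
    return [ex for _, ex in failing[:limit]]


production_logs = [
    Example("q1", "Here is the answer."),
    Example("q2", ""),
    Example("q3", "I'm sorry, I can't help with that."),
]
candidates = select_for_dataset(production_logs, not_a_refusal)
```

Here `candidates` holds the two failing logs (`q2` and `q3`). In practice those would be reviewed, labeled, and added to the offline dataset, so the next round of `run_offline_eval` covers the failures production just surfaced.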
evals, observability, data-flywheel
Original description
An introduction to the evals track.

About Manu Goyal
Manu Goyal is the founding engineer at Braintrust. Previously, he developed autonomous systems at Nuro. He has an 8-year-old Pomeranian named Hendrix.

Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/newsletter