
Building AI Products That Actually Work — Ben Hylak (Raindrop), Sid Bendre (Oleve)

2.9K views · Jul 24, 2025 · 18:42 min
Takeaway

Build AI products by shipping, observing real production behavior, and iterating. Evals are necessary, but they cannot tell you how good your product actually is.

Summary

  • Ben Hylak (Raindrop CTO, ex-Apple/SpaceX) and Sid Bendre (Oleve co-founder, whose 4-person team scaled viral apps to $6M ARR) focus on how to iterate on AI products rather than rehashing evals.
  • They argue that prompt engineering won't die with AGI, because communication itself is hard: even partners and new hires misinterpret instructions, and more capable models surface more undefined behavior, not less.
  • They call out high-profile failures (the Virgin Money chatbot threatening customers over the word 'virgin', Grok's unsolicited 'white genocide' tangents, Google Cloud confusing Azure/Roblox credits) as evidence that you cannot define product behavior up front.
  • Eval misconceptions: evals only measure failure modes you already know about and are subject to Goodhart's law, and recent model launches sometimes regress on evals while being better in real use. Eval scores are not product quality.
  • Raindrop's pitch: a 'stealth frontier lab' that ingests production events from companies like Clay.com and coding assistants, exposing tools like deep search and few-shot classifiers built from production data.
product · iteration · raindrop
Original description
You've made the demo. How do you make the product? A lot of AI products don't actually work. Even worse, a lot of the techniques being advertised for making AI products better don't work either. We'll cover the challenges + techniques we've seen actually work in the real world.

About Ben Hylak
Ben Hylak is co-founder at Raindrop, building Sentry for AI products. He was previously a designer at Apple for 4 years, building the Apple Vision Pro.

About Sid Bendre
Sid Bendre is the co-founder of Oleve, a company building a portfolio of iconic consumer software across multiple verticals. With a lean team, Oleve has already launched two virally successful consumer AI products that have amassed over 250 million views across social media platforms. One of their products reached #4 on the App Store's Education charts in 2024 and #5 in 2025, competing alongside giants like Photomath (Google) and Duolingo. Backed by Neo, Cal Henderson (co-founder of Slack), Russell Kaplan (President of Cognition), and Maria Zhang (ex-CTO of Tinder), Oleve is building the AI infrastructure to run a $1B portfolio of consumer software over the next decade. At Oleve, Sid leads technical and AI efforts, running the “Platform” team responsible for the underlying AI infrastructure that powers their lean scaling approach. Before Oleve, Sid led AI experimentation efforts at a startup hedge fund and worked at Slack, Zendesk, and Microsoft.

Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/newsletter