Real ROI: Lessons from Enterprises that have already succeeded with LLMs at Scale: Raza Habib

7.6K views · Dec 31, 2024 · 20:00 min

Takeaway

Enterprise LLM ROI comes from domain experts in the loop, simple 4-component apps, and rigorous up-front evaluations — not exotic architectures.

Summary

Raza Habib (Humanloop) shares patterns from hundreds of enterprise deployments. Filevine, for example, launched 6 LLM products and roughly doubled revenue, so the question is no longer whether LLMs will deliver value.
Most LLM apps chain together 4 simple components: a base model, a prompt template, data selection (RAG/API), and function calling. The complexity comes from making each component good, not from the architecture.
Successful teams need less ML expertise than expected (generalist full-stack engineers matter more than training-focused PhDs) and far more involvement from domain experts who edit prompts directly (at Duolingo, linguists ship the prompts; engineers can't edit them).
People-Ideas-Machines order: get the right team first, set clear eval criteria up front, and only then choose tooling, not the other way around.
Habib predicts that chains and scaffolding will shrink over time as models improve at tool selection, so don't over-engineer around current model flaws.
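
To make the four-component picture concrete, here is a minimal Python sketch, not code from the talk. Every name in it (`base_model`, `select_data`, `TOOLS`, `run_app`, `EVAL_CASES`, `pass_rate`) is hypothetical, the "model" is a deterministic stub standing in for a hosted LLM, and keyword overlap stands in for real retrieval. It also shows eval criteria written down before any tuning of the components.

```python
import re
from typing import Callable, Dict, List, Tuple

# Component 1: base model. Hypothetical stub; a real app would call a hosted LLM.
def base_model(prompt: str) -> str:
    # Toy rule so the sketch is deterministic: ask for a tool on counting questions.
    if "how many words" in prompt.lower():
        return "TOOL:word_count"
    # Otherwise "answer" by echoing the context line of the prompt.
    return "ANSWER:" + prompt.split("\n")[1]

# Component 2: prompt template. The piece a domain expert would edit.
PROMPT_TEMPLATE = "Context:\n{context}\n\nQuestion: {question}"

# Component 3: data selection. Keyword overlap as a stand-in for RAG or an API call.
DOCS: List[str] = [
    "Refunds are processed within five business days.",
    "Support is available on weekdays.",
]

def tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))

def select_data(question: str, docs: List[str]) -> str:
    # Pick the document sharing the most words with the question.
    return max(docs, key=lambda d: len(tokens(d) & tokens(question)))

# Component 4: function calling. Tools the model is allowed to request.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}

def run_app(question: str) -> str:
    # Chain the four components: select data, fill template, call model, run tool.
    context = select_data(question, DOCS)
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    output = base_model(prompt)
    if output.startswith("TOOL:"):
        return TOOLS[output[len("TOOL:"):]](context)
    return output[len("ANSWER:"):]

# Eval criteria defined up front: (question, substring the answer must contain).
EVAL_CASES: List[Tuple[str, str]] = [
    ("When are refunds processed?", "five business days"),
    ("How many words are in the weekdays support note?", "5"),
]

def pass_rate(cases: List[Tuple[str, str]]) -> float:
    passed = sum(expected in run_app(q) for q, expected in cases)
    return passed / len(cases)
```

Calling `pass_rate(EVAL_CASES)` after each change to a prompt or retrieval step gives a quick regression signal. In this sketch the architecture stays fixed and all the effort goes into improving individual components, which is the point the summary makes.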

enterprise · evals · humanloop

Original description

I'll share practical insights I've learned from AI leaders at Duolingo, Gusto, Vanta, Filevine, Ironclad and Sourcegraph who have succeeded with LLMs in production. We'll cover the skills your team needs, tips for RAG in production, how to choose evals and what it takes to succeed with agents.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Raza
Raza Habib is the CEO and Cofounder of Humanloop where he's helped hundreds of companies get AI into production. He has a PhD in Deep Learning from UCL and studied physics at Cambridge. Sifted named him one of Europe's 20 Gen AI power players.