
Keynote: The AI developer experience doesn't have to suck – why and how we built Modal

1.5K views · Feb 22, 2025 · 21:38 min
Takeaway

Sub-second container starts plus a Python-native serverless API make Modal feel like local iteration while scaling to thousands of GPUs.

Summary

  • Modal CEO Erik Bernhardsson (ex-Spotify recsys) built serverless Python infrastructure to make ML/AI deployment feel as fast as local iteration.
  • Making container cold starts fast in the cloud required a custom scheduler, a custom filesystem, and several years of foundational infrastructure work.
  • Decorator-based API: @app.function (applied through a modal.App object) turns any Python function into a serverless GPU function (H100, A100, L4, T4), with container images defined in code (e.g. pip install torch).
  • Live demo fans out 10,000 invocations across thousands of containers; map-style parallelism for batch jobs.
  • Heavy use cases: diffusion (AI music/video/images), medical imaging, computational bio/protein folding, fine-tuning, batch embeddings, LLM inference.
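
The decorator-plus-map pattern described in the bullets above can be imitated locally with the standard library. This is only a rough sketch and not Modal's API: `function`, `FanOutFunction`, and `max_workers` are hypothetical names, and a thread pool stands in for the remote containers a real serverless platform would start.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


class FanOutFunction:
    """Hypothetical wrapper: a plain function plus a map() that fans out calls."""

    def __init__(self, fn: Callable, max_workers: int):
        self.fn = fn
        self.max_workers = max_workers

    def __call__(self, *args, **kwargs):
        # A direct call runs the wrapped function in-process, unchanged.
        return self.fn(*args, **kwargs)

    def map(self, inputs: Iterable) -> list:
        # Threads stand in for remote containers; results keep input order.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return list(pool.map(self.fn, inputs))


def function(max_workers: int = 32):
    """Hypothetical decorator factory (an assumption, not a real library call)."""
    def decorate(fn: Callable) -> FanOutFunction:
        return FanOutFunction(fn, max_workers)
    return decorate


@function(max_workers=64)
def square(x: int) -> int:
    return x * x


# Fan out 10,000 invocations, echoing the scale of the demo in the talk.
results = square.map(range(10_000))
print(len(results), results[:5])
```

The point of the design is that the decorated function still behaves like ordinary Python when called directly, while `.map()` is the single place where parallelism is introduced.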
infrastructure · serverless · modal
Original description
Modal provides infrastructure for AI and other compute-heavy applications. In order to deliver the developer experience we wanted, we realized early that we would have to throw out Kubernetes and Docker and start over. This is the story of a deep rabbit hole of how we built our own container system in the quest of a great developer experience.