How agents broke app-level infrastructure - Evan Boyle

577 views · Jun 03, 2025 · 13:32 min

Takeaway

Agentic workloads break Web 2.0 infrastructure assumptions about latency and reliability; we need durable execution layers built for seconds-to-hours requests.

Summary

GenSX founder Evan Boyle (an early Pulumi employee) argues that classical Web 2.0 infrastructure assumes requests complete in milliseconds; LLM apps run for seconds to hours and break those assumptions.
In practice, apps stitched together from several LLM providers still suffer correlated outages: OpenAI's 99.9% uptime figure is misleading, and Gemini outages overlap.
Rate limits and bursty batch workloads (e.g., crawling user inboxes/GitHub) cause failures that Web 2.0 infra never had to handle.
Calls for new compute primitives purpose-built for agentic workloads: durable execution, long-running workflows, and fallback orchestration.
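
To make the fallback-orchestration idea concrete, here is a minimal Python sketch, not anything shown in the talk. It assumes each provider is wrapped in a plain function that takes a prompt and returns text; `ProviderError`, `call_with_fallback`, and all parameter names are hypothetical and stand in for whatever a real client library raises on outages or rate limits.

```python
import time
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Hypothetical error for an outage, rate limit, or timeout from a provider."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[Callable[[str], str]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each provider in order, retrying with exponential backoff before moving on."""
    last_error: Optional[Exception] = None
    for provider in providers:
        for attempt in range(max_attempts):
            try:
                return provider(prompt)
            except ProviderError as err:
                last_error = err
                # Wait base, 2*base, 4*base, ... between retries of the same provider;
                # skip the wait after the final attempt and fall through to the next one.
                if attempt < max_attempts - 1:
                    sleep(base_delay * (2 ** attempt))
    raise RuntimeError("all providers failed") from last_error
```

Because the summary notes that provider outages tend to be correlated, a fallback chain like this reduces failures but does not eliminate them, which is part of the case for durable execution underneath it.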

agents · infrastructure · durable-execution

Original description

LLMs have completely broken our assumptions about app-level workloads. Compared to querying a database, LLMs are extremely flaky and slow. In Web 2.0, p99 latency was just a few hundred milliseconds; anything higher and the on-call engineer got paged.

But today any API that uses LLMs has a p1 latency of a couple of seconds. Yet, the infrastructure we build on top of hasn't caught up with these new assumptions. There isn't a single serverless provider that supports running code for more than a few minutes!

In this session we'll talk about infrastructure patterns that used to be niche, but today require attention from anyone building on top of LLMs:

- Durable execution
- Long running workflows and APIs
- Agent-scoped storage
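
As a rough illustration of the durable execution pattern listed above, the following Python sketch checkpoints each step's result to a JSON file so that a restarted process skips work it has already completed. It is a toy under stated assumptions, not the approach of any particular product: `DurableRun`, `step`, and the file-based store are hypothetical names chosen for the example, and step results are assumed to be JSON-serializable.

```python
import json
import os
from typing import Any, Callable, Dict


class DurableRun:
    """Toy durable workflow: persists each named step's result to a JSON file."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.results: Dict[str, Any] = {}
        # On restart, reload whatever steps finished before the crash.
        if os.path.exists(path):
            with open(path) as f:
                self.results = json.load(f)

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        # A step that already completed is replayed from the checkpoint,
        # so a slow or flaky LLM call is not repeated.
        if name in self.results:
            return self.results[name]
        value = fn()
        self.results[name] = value
        # Write to a temp file and rename so a crash mid-write
        # cannot leave a half-written checkpoint behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.results, f)
        os.replace(tmp, self.path)
        return value
```

A workflow would wrap each expensive call, for example `run.step("summarize", lambda: call_model(text))` where `call_model` is whatever function invokes the model. Real durable execution engines add much more (timers, retries, concurrency control), but the replay-from-checkpoint idea is the core of it.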