How we solved Context Management in Agents — Sally-Ann Delucia

23.6K views · May 10, 2026 · 16:16 min

Takeaway

The right context strategy lets agents remember what they need and forget what they don't — context management is a product problem, not just a token-limit problem.

Summary

Arize's agent Alyx was stuck in a vicious loop: it analyzed the span/trace data it was itself generating, so the spans kept growing past the context limit, the run failed, and each retry added more data.
The solution combined three things: strategic context selection (not 'shove everything in'), head/tail preservation of large spans backed by a retrievable memory store (plain truncation and plain summarization both failed), and sub-agents for scoped retrieval over the trace corpus.
Long conversations break agents: even within token budgets, irrelevant context degrades reasoning, so the team treats context engineering as a product/UX problem, not just an engineering one.
Sub-agents help isolate context bloat but introduce coordination challenges still being solved.
Quotes Karpathy's '+1 to context engineering over prompt engineering' as the framing for the talk.
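
The point about sub-agents isolating context bloat can be sketched as follows. This is a minimal illustration, not Alyx's implementation: `run_subagent` and `count_errors` are hypothetical names, and `count_errors` is a plain function standing in for a model call. The idea shown is only that the bulky spans live in the sub-agent's private context and the parent receives a one-line result.

```python
from typing import Callable


def run_subagent(task: str, documents: list[str],
                 worker: Callable[[list[str]], str]) -> str:
    """Run `worker` on a fresh, private context and return only its answer."""
    # The sub-agent's context holds the task plus all the bulky documents.
    sub_context = [f"task: {task}", *documents]
    # Only the short final answer leaves this function; sub_context is discarded.
    return worker(sub_context)


def count_errors(context: list[str]) -> str:
    """Stand-in for a model call (assumption): counts spans marked as errors."""
    spans = context[1:]  # skip the task line
    errors = sum("status=error" in line for line in spans)
    return f"{errors} of {len(spans)} spans failed"


# The parent context never sees the 1000 raw spans, only the summary line.
parent_context = ["user: why is checkout slow?"]
spans = [f"span {i} status={'error' if i % 10 == 0 else 'ok'}" for i in range(1000)]
parent_context.append(run_subagent("count failing spans", spans, count_errors))
```

The coordination challenges mentioned above start exactly here: the parent has to decide what to delegate and has to trust a result it cannot inspect in full.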

agents · context-engineering · arize

Original description

The naive solution is truncation. The obvious solution is summarization. Neither worked — and the Arize team found out the hard way while building an AI agent that had to analyze the very trace data it was generating.

A year of lessons from building Alyx, starting with the vicious loop that defined the problem: Alyx runs on trace data, the spans grow, the context limit hits, it fails and tries again. The talk covers why truncation breaks reasoning, why summarization gives the LLM too much control, and how head/tail preservation with a retrievable memory store is what actually held. Then: long session evals, sub-agents as the answer when one context accumulates too much, and what they found when they went looking for secrets in the Claude Code source release.
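
Head/tail preservation with a retrievable memory store might look roughly like the sketch below. It is an illustration under stated assumptions, not the Arize code: `MemoryStore`, `preserve_head_tail`, the key format, and the marker text are all hypothetical, and lengths are counted in characters rather than tokens to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical in-memory store for text cut out of the context."""
    _items: dict[str, str] = field(default_factory=dict)

    def put(self, text: str) -> str:
        key = f"mem-{len(self._items)}"  # assumed key scheme
        self._items[key] = text
        return key

    def get(self, key: str) -> str:
        return self._items[key]


def preserve_head_tail(text: str, store: MemoryStore,
                       head: int = 200, tail: int = 200) -> str:
    """Keep the first `head` and last `tail` characters; park the middle in the store."""
    if len(text) <= head + tail:
        return text  # short enough, nothing to cut
    cut = len(text) - tail
    middle = text[head:cut]
    key = store.put(middle)
    # The marker tells the model that something was removed and how to get it back.
    marker = f"\n[... {len(middle)} chars omitted, retrieve with key {key} ...]\n"
    return text[:head] + marker + text[cut:]
```

Unlike plain truncation, nothing is lost: the omitted middle stays retrievable by key. Unlike summarization, the cut is deterministic, so the model has no say in what gets dropped.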

Speaker info:
- https://www.linkedin.com/in/sallyann-delucia-59a381172/

Timestamps:
0:00 Introduction and speaker background
1:02 Overview of the AI agent, Alyx
1:29 The problem: Context engineering vs. prompt engineering
4:06 The vicious loop of data growth in AI agents
5:16 Why naive truncation failed
6:14 Why summarization proved unreliable
6:46 The solution: Smart truncation and memory stores
8:02 Handling long session challenges
9:23 Offloading tasks to sub-agents
11:19 Ongoing challenges and future work
12:57 Findings from the Claude Code source release
13:44 Final key takeaways on context management
14:58 Q&A session