🤖 Agents
LLM agents that plan, call tools, and act in loops. LangGraph, CrewAI, AutoGen, custom orchestration, multi-agent systems, agent reliability.
The workflow
flowchart LR
A[User goal] --> B[Planner LLM<br/>breaks goal into steps]
B --> C{Tool needed?}
C -->|Yes| D[Tool call<br/>search, code, API]
C -->|No| E[Reasoning step]
D --> F[Observe<br/>tool result]
F --> G{Goal complete?}
E --> G
G -->|No| B
G -->|Yes| H[Final answer<br/>+ trace]
This is the plan-act-observe loop that every modern agent runs. The hard part is keeping it reliable across hundreds of iterations.
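The loop in the flowchart can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any framework's API: `Step`, `AgentResult`, `run_agent`, `scripted_planner`, and `add_tool` are hypothetical names, and the planner here is a scripted stand-in for an LLM call. Two simplifications: the "Goal complete?" check is folded into the planner's next decision, and a `max_steps` cap (not shown in the flowchart) is added so the loop always terminates.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Step:
    """One planner decision: call a tool, reason, or finish."""
    thought: str
    tool: Optional[str] = None          # name of the tool to call, if any
    tool_input: str = ""
    final_answer: Optional[str] = None  # set once the goal is complete


@dataclass
class AgentResult:
    answer: Optional[str]               # None if the step budget ran out
    trace: List[str] = field(default_factory=list)


Planner = Callable[[str, List[str]], Step]
Tool = Callable[[str], str]


def run_agent(goal: str, planner: Planner, tools: Dict[str, Tool],
              max_steps: int = 10) -> AgentResult:
    """Plan, act, observe, and repeat until done or out of budget."""
    trace = [f"goal: {goal}"]
    for _ in range(max_steps):
        step = planner(goal, trace)                 # plan
        trace.append(f"plan: {step.thought}")
        if step.final_answer is not None:           # goal complete
            return AgentResult(step.final_answer, trace)
        if step.tool is not None:                   # act
            tool_fn = tools.get(step.tool)
            if tool_fn is None:
                observation = f"error: unknown tool {step.tool!r}"
            else:
                try:
                    observation = tool_fn(step.tool_input)
                except Exception as exc:            # feed failures back to the planner
                    observation = f"error: {exc}"
            trace.append(f"observe[{step.tool}]: {observation}")  # observe
        # With no tool call, the thought itself is the reasoning step.
    return AgentResult(None, trace)


def scripted_planner(goal: str, trace: List[str]) -> Step:
    """Stand-in for an LLM: call the tool once, then return its result."""
    observations = [t for t in trace if t.startswith("observe[")]
    if not observations:
        return Step("need to compute the sum", tool="add", tool_input="2 3")
    return Step("have the result",
                final_answer=observations[-1].split(": ", 1)[1])


def add_tool(arg: str) -> str:
    a, b = arg.split()
    return str(int(a) + int(b))


result = run_agent("add 2 and 3", scripted_planner, {"add": add_tool})
print(result.answer)  # prints 5
```

As a rough illustration of why reliability is the hard part: if each iteration succeeded independently with probability 0.99 (an assumed figure, and independence is itself an assumption), a 100-step run would finish without a failed step only about 37% of the time, since 0.99^100 ≈ 0.366.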
Key takeaways
Videos (110)
Don't Build Agents, Build Skills Instead – Barry Zhang & Mahesh Murag, Anthropic
Stop building bespoke agents per domain — package expertise as portable, file-based 'skills' on top of a universal code-executing agent.
How We Build Effective Agents: Barry Zhang, Anthropic
Use agents only when tasks are ambiguous, valuable, and reversible, keep the architecture minimal, and debug by stepping inside the model's context.
12-Factor Agents: Patterns of reliable LLM applications — Dex Horthy, HumanLayer
Reliable agents come from applying classic software engineering modularity to small LLM components, not from giving a single agent more tokens and tools.
State of the Claw — Peter Steinberger
OpenClaw's hyper-growth has made it the largest target for AI-generated security reports, forcing the project to rebuild around hardening and a healthier maintainer bus factor.
Building AI Agents that actually automate Knowledge Work - Jerry Liu, LlamaIndex
Real knowledge-work automation needs a document toolbox (parse, extract, index, manipulate) — not just RAG retrieval.
Architecting Agent Memory: Principles, Patterns, and Best Practices — Richmond Alake, MongoDB
Treat agent memory as a first-class architectural concern with multiple memory types, retrieval pipelines, and forgetting mechanisms — not as 'stuff everything into the context window'.
Building Multimodal AI Agents From Scratch — Apoorva Joshi, MongoDB
Multimodal agents are built by combining LLM reasoning with perception, tools, and persistent memory — but use them only when simpler workflows can't handle the task.
Reinforcement Learning for Agents - Will Brown, ML Researcher at Morgan Stanley
Reinforcement learning, not just bigger models, is the missing piece to take agents from 70%-pipelines to reliable long-horizon autonomous systems.
Claude Agent SDK [Full Workshop] — Thariq Shihipar, Anthropic
Reuse Claude Code's harness (bash + file system + container) instead of building your own agent loop from scratch — most agent tasks become tractable when the model can write code.
The Multi-Agent Architecture That Actually Ships — Luke Alvoeiro, Factory
Long-running multi-agent coding works only with adversarial validation contracts and serial-by-default execution plus structured handoffs — not by parallelizing more agents.
How to Train Your Agent: Building Reliable Agents with RL — Kyle Corbitt, OpenPipe
Always prompt your way to a working agent first; reach for RL only when prompting plateaus, and design rewards carefully to avoid hacks.
3 ingredients for building reliable enterprise agents - Harrison Chase, LangChain/LangGraph
Reliable enterprise agents come from picking high-value tasks, mixing workflows with agents, and using observability to shrink uncertainty rather than chasing pure autonomy.
Rise of the AI Architect — Clay Bavor, Cofounder, Sierra w/ Alessio Fanelli
Customer-facing AI agents are emerging as the new branded channel after website/app, requiring a new hybrid CX-engineering-brand role to operate them well.
The Agent Development Life Cycle — Zack Reneau-Wedeen, Sierra
Treat every customer-facing agent as a product with a real engineering lifecycle and dedicated agent engineers, not a one-shot prompt.
Creating Agents that Co-Create — Karina Nguyen, OpenAI
Each scaling paradigm shift opens new product UX patterns; the next frontier is agents that actively co-create with humans rather than just respond.
AgentCraft: Putting the Orc in Orchestration — Ido Salomon
To scale beyond 1-2 parallel agents, treat orchestration like an RTS game: visualize the file system, batch approvals, and offload planning/review rather than per-step babysitting.
Paperclip: Open Source Human Control Plane for AI Labor — Dotta Bippa
Paperclip operationalizes 'zero-human companies' by giving every part of the app an agentic surface inside an org-chart abstraction with built-in QA/approval roles.
Scaling Agents for Gen AI Products - Anju Kambadur, Bloomberg Head of AI Engineering
Finance-grade agents require semi-autonomous architectures with mandatory guardrails and tool APIs as rigorously documented as numerical libraries.
Using agents to build an agent company: Joao Moura
CrewAI is using its own multi-agent platform to run marketing, sales qualification and docs at scale, and is doubling down on code-executing, trainable crews.
Rethinking how we Scaffold AI Agents - Rahul Sengottuvelu, Ramp
Stop scaffolding around weak models; build agent systems where LLMs orchestrate and you scale by throwing more compute (parallel sampling + verifiers) at the problem.
Proactive Agents – Kath Korevec, Google Labs
The next devex shift is from reactive command-line agents to proactive collaborators that observe workflows, anticipate, and stay aligned via human-in-the-loop refinement.
From Chaos to Choreography: Multi-Agent Orchestration Patterns That Actually Work — Sandipan Bhaumik
Treat multi-agent AI as distributed systems engineering and pick choreography vs. orchestration based on workflow complexity and autonomy needs.
Personal, Local, Private AI Agents: Soumith Chintala
Personal agents must be local and context-rich (a home Mac Mini is the practical substrate today) because the right private context matters more than raw model intelligence.
Katelyn Lesse – Evolving Claude APIs for Agents, Anthropic
Anthropic is shipping memory + context-editing + sandboxed code execution as primitives so developers don't build agent harness infra themselves.
Lets Build An Agent from Scratch
An agent emerges once an LLM can read/write its own todo list and decide when to stop — frameworks just wrap these primitives.
Trust, but Verify: Knowledge Agents for Finance Workflows - Mike Conover
Financial knowledge agents need verticalized scaffolding, multi-pass self-verification, and human-in-the-loop nudges—chat alone won't do trust-but-verify work.
OpenAI + @Temporalio : Building Durable, Production Ready Agents - Cornelia Davis, Temporal
Combining OpenAI's Agents SDK with Temporal turns agentic loops into durable, crash-tolerant workflows without the developer hand-coding retry, timeout and resume logic.
Ralph Loops: Build Dumb AI Loops That Ship — Chris Parsons, Cherrypick
A dumb-but-persistent agent loop ('Ralph') beats complex orchestration for getting real work shipped, in code and beyond.
Why the Best AI Agents Are Built Without Frameworks (Primitives over Frameworks) — Ahmad Awais, CHAI
Build agents from a small set of cloud-scaled primitives (memory, parser, chunker, threads, tools) rather than monolithic AI frameworks.
How we solved Context Management in Agents — Sally-Ann Delucia
The right context strategy lets agents remember what they need and forget what they don't — context management is a product problem, not just a token-limit problem.
The New Application Layer - Malte Ubl, CTO Vercel
Agents unlock a much larger market of automatable software, and the highest-leverage archetypes are simple research-compression and information-surfacing, not autonomous heroics.
Agents vs Workflows: Why Not Both? — Sam Bhagwat, Mastra.ai
Build with both agents and workflows using readable fluent APIs — don't make your team learn graph theory for production agentic systems.
Building Intelligent Research Agents with Manus - Ivan Leo, Manus AI (now Meta Superintelligence)
Manus is positioned as a general-purpose agent you embed anywhere — its new API plus browser-operator let you ship complex research workflows from a single prompt.
Full Workshop: Build Your Own Deep Research Agents - Louis-François Bouchard, Paul Iusztin, Samridhi
Most 'agent' use cases are really workflows — start simple, climb the autonomy slider only when you must, and engineer the research/writing loop deliberately.
Hard Won Lessons from Building Effective AI Coding Agents – Nik Pash, Cline
In 2025, the leverage in coding agents has shifted from harness cleverness to building RL environments with outcome-only verifiers that train the next model generation.
Agents need more than a chat - Jacob Lauritzen, CTO Legora
Long-horizon vertical agents need verifiability proxies, task decomposition, guardrails, and continuous human steering — chat is the wrong interaction model.
The Web Browser Is All You Need - Paul Klein IV, Browserbase
Every AI agent needs a browser tool because the 'unsexy internet' will never expose first-party APIs.
Build an AI Research Agent: Apoorva Joshi
Use agents only when the task truly needs multi-step reasoning, tools and memory — and pick ReAct or Reflection patterns to keep the planning loop tractable.
Effective agent design patterns in production — Laurie Voss, LlamaIndex
Build agents around LLMs' real strength — compressing messy text into structure — and lean on frameworks like LlamaIndex to skip boilerplate.
How Deep Research Works - Mukund Sridhar & Aarush Selvan, Google DeepMind
Deep Research trades latency for comprehensiveness via an editable plan, iterative grounded planning, and robust state management over a noisy web environment.
Building Applications with AI Agents — Michael Albada, Microsoft
Build agentic systems incrementally up the agency ladder, minimize tool surface area, and keep deterministic business rules out of the model.
Why Agent Engineering — swyx
Agent engineering is emerging as its own discipline at the intersection of MLE and software engineering, driven by cheaper intelligence, better tools, and clear PMF in coding/support.
From Stateless Nightmares to Durable Agents — Samuel Colvin, Pydantic
Long-running agents need durable execution — wrap LLM and tool calls as Temporal activities so crashes don't restart deep-research-style workflows.
Containing Agent Chaos — Solomon Hykes, Dagger
Coding agents need first-class containerized environments — sandboxes for execution aren't enough; the agent itself needs to develop inside the container.
Patrick Dougherty: How to Build AI Agents that Actually Work
Build agents around reasoning and small retrieval tools, then iterate obsessively on the agent-computer interface (tool names, schemas, response formats) per model.
Memory Masterclass: Make Your AI Agents Remember What They Do! — Mark Bain, AIUS
Graph-structured memory preserving causal relationships, not just embeddings, is the missing primitive for trustworthy agentic recall and reasoning.
Demand-Driven Context: A Methodology for Coherent Knowledge Bases Through Agent Failure
Stop monolithic retrieval layers; decompose institutional knowledge into demand-driven context blocks discovered from real agent failures.
A Piece of Pi: Embedding The OpenClaw Coding Agent In Your Product — Matthias Luebken, Tavon
Coding agents become product primitives when you reuse their tool-loop + shell runtime with thin extension APIs for UI and session events.
How to build Enterprise Aware Agents - Chau Tran, Glean
Stop choosing between workflows and agents—train agents on your golden workflows and let agents mine new workflows from real user usage.
I Gave an AI Agent the Keys to My Life (Here's What Happened) — Radek Sienkiewicz (@velvetshark-com)
A personal AI agent becomes life-changing not from a big-bang install but from incremental layering of channels, automations and a connected knowledge base.
Keynote: Why people think "agent" is a buzzword but it isn't
Agents aren't a buzzword — they're just hard because step failure compounds; the unlock is models with stronger planning plus better tools and memory.
Agents on the Canvas in tldraw — Steve Ruiz, tldraw
Putting agents directly on a shared canvas with visible state turns AI from a sidebar assistant into a multi-agent collaborator on spatial work.
Stateful Agents — Full Workshop with Charles Packer of Letta and MemGPT
Memory/statefulness — not bigger models or more tools — is the binding constraint preventing agents from becoming actually useful in production.
How to Build Planning Agents without losing control - Yogendra Miraje, Factset
Insert a natural-language blueprint step between user intent and planner to keep enterprise planning agents controllable, debuggable, and tractable.
Skill Issue: How We Used AI to Make Agents Actually Good at Supabase — Pedro Rodrigues, Supabase
Skills (markdown + scripts with progressive disclosure) outperform stuffing tools into MCP contexts for teaching agents how to work inside a product like Supabase.
Code Mode: Let the Code do the Talking - Sunil Pai, Cloudflare
Drop JSON tool calls for code generation against a typed runtime — vastly fewer tokens, faster execution, and emergent state-machine inhabitation.
Architecting and Testing Controllable Agents: Lance Martin
Use LangGraph to encode the parts of an agent's control flow you want fixed while letting the LLM steer the parts that need flexibility.
Building durable Agents with Workflow DevKit & AI SDK - Peter Wielander, Vercel
Wrap long-running agents in a workflow library to get durability, retries, observability and human approval without manual queue plumbing.
The 3 Pillars of Autonomy – Michele Catasta, Replit
For non-technical users, autonomy means offloading every technical decision — anchored by verification and context, not just long runtimes.
Introducing Strands Agents, an Open Source AI Agents SDK — Suman Debnath, AWS
Strands treats agent building as just-model-plus-tools, leveraging MCP and LLM reasoning rather than hand-written scaffolding.
Useful General Intelligence — Danielle Perszyk, Amazon AGI
Amazon's bet is reliable browser-based agents (Nova Act) that augment humans rather than replace them; automation has to lead to augmentation, not echo-chamber lock-in.
Multi Agent AI and Network Knowledge Graphs for Change — Ola Mabadeje, Cisco
Network change management is a natural fit for multi-agent systems backed by a graph-shaped digital twin of the network.
Case Study + Deep Dive: Telemedicine Support Agents with LangGraph/MCP - Dan Mason
For regulated healthcare agents, pair LangGraph's visual orchestration with hybrid human-in-the-loop and MCP tools so doctors stay in control.
Identity for AI Agents - Patrick Riley & Carlos Galan, Auth0
Production agents need a dedicated identity layer—async user approval, token vaulting, and fine-grained authorization—not just API keys in environment variables.
The Devops Engineer Who Never Sleeps — Diamond Bishop, Datadog
Build vertical, evaluation-first DevOps agents that augment runbooks rather than generalist assistants, and make their reasoning visible for trust.
Automating Large Scale Refactors with Parallel Agents - Robert Brennan, OpenHands
Parallel agent orchestration in cloud sandboxes is unlocking automated remediation of tech debt at a scale impossible for single-agent or human workflows.
Hacking Subagents Into Codex CLI — Brian John, Betterup
You can bolt Claude-Code-style subagents onto Codex CLI with a wrapper-script + file-handoff pattern, but tuning the sandbox permissions is the real engineering challenge.
Building Agents at Cloud Scale — Antje Barth, AWS
Cloud-scale agents are enabled by model-driven SDKs like Strands plus retrieval over tool catalogs to dodge context-window limits.
Agents for Everything Else — swyx
Agents are leaving the IDE — running a real $9M business with 9 people via coding agents proves they can take over ops work too.
UX Design Principles for Semi Autonomous Multi Agent Systems — Victor Dibia, Microsoft
For agents that act in stateful environments, design the UX around streaming plan visibility and a mix of deterministic tools and autonomous exploration.
Make your own event-sourced agent harness using stream processors — Jonas Templestein, Iterate
Event-sourced, URL-addressable agents with composable plugins offer a more debuggable and extensible alternative to monolithic agent harnesses.
Building Agents (the hard parts!) - Rita Kozlov, Cloudflare
Build agents as four loosely-coupled components (client/AI/workflows/tools) and use durable, stateful compute primitives like Cloudflare Durable Objects to back them.
Give Your Agent a Computer — Nico Albanese, Vercel
Production agents need their own sandboxed computer — Vercel Sandbox + AI Gateway makes that an OIDC-authenticated primitive.
Building Multi agent Systems with Finite State Machines
Pair LLMs with finite state machines + actors to get governable, auditable agentic systems that compensate for LLM unpredictability.
This video was edited with AI agent. But how?
Code-generating agents plus a browser-renderable JS video library make programmatic video editing feasible end-to-end.
Viktor: AI Coworker That Lives in Slack — Fryderyk Wiatrowski
Building a company-level AI coworker in Slack hinges on permissioned multi-user memory and collapsing Slack's many interaction modes into a coherent agent context.
The Age of the Agent: Flo Crivello
Single AI assistants like Lindy are stepping stones; the real unlock is multi-agent societies that build their own integrations and tools.
The Demo I Wish I'd Had: OpenAI's Agents SDK... serverless! - Brook Riggio
A production-friendly agent stack today is Next.js + Vercel + Inngest + the OpenAI Agents SDK (Python), glued together by serverless functions and event-driven orchestration.
How to Improve Your Agents: Academic Lit Review
Agent progress maps onto self-driving-style autonomy levels — most production agents sit at L2/L3, with multi-task L4 emerging.
Emergence Launch: AI Agents and the future enterprise: Dr. Satya Nitta
Emergence is pitching orchestration plus self-improving web agents as the substrate for autonomous enterprise workflow automation.
Building Agents with Amazon Nova Act and MCP - Du'An Lightfoot, Amazon (Full Workshop)
AWS's Strands + Nova Act + MCP combo lets developers build browser-control agents in tens of lines, with Q CLI as the default productivity surface.
Building Reliable Agentic Systems: Eno Reyes
Reliable agentic systems borrow robotics-style state filtering, MPC replanning, consensus sampling, and unapologetic hardcoded plan criteria.
Cohere: Building enterprise LLM agents that work (Shaan Desai)
Enterprise agents that actually work come from disciplined tool-spec engineering, conservative multi-agent design, and step-level evaluation—not framework choice.
How agents will unlock the $500B promise of AI - Donald Hruska, Retool
The agent itself is easy in 100 lines of code; productionizing security, observability, connectors and compliance is why most enterprises will buy a managed agent platform.
Events are the Wrong Abstraction for Your AI Agents - Mason Egger, Temporal.io
Putting events at the center of agent architectures hides the real workflow logic; durable-execution engines model business intent more honestly.
The Unbearable Lightness of Agent Optimization — Alberto Romero, Jointly
Single-axis context engineering hits a ceiling; multi-dimensional meta-controllers that pick the right tool per task profile beat uniform pipelines on both quality and cost.
Your Support Team Should Ship Code – Lisa Orr, Zapier
Zapier's Scout agent turns the support team into shippers of integration fixes by orchestrating diagnosis and codegen MCP tools end-to-end inside their existing Jira/GitLab workflow.
Agents are Robots Too: What Self-Driving Taught Me About Building Agents — Jesse Hu, Abundant
Treat coding/browser agents as digital robots — invest in closed-loop sensing, sampling rates, and an offline simulation/eval stack, not just better models.
Breaking the Chain: Agent Continuations for Resumable AI Workflows - Greg Benson
Agent continuations let you snapshot, persist and resume in-flight agent state across hosts so HITL approval and long-running workflows survive process restarts.
Hyperspace More Nodes Is All You Need: Nicolas Schlaepfer
A decentralized peer network plus a fine-tuned planning DAG model can give power users an editable agentic workflow over diverse open models.
Ship it! Building Production Ready Agents — Mike Chambers, AWS
Bedrock Agents productionizes the five components of an agent (model, prompt, loop, history, tools) without managing GPU infra.
Agentic Workflows on Vertex AI: Rukma Sen
Agents on Vertex AI are model+tools+orchestration with explicit responsibility for safety, privacy, and clear gen-vs-fact UX as wearables make AI ambient.
Stop Using RAG as Memory — Daniel Chalef, Zep
Model agent memory after your business domain using a typed knowledge graph instead of dumping facts into a vector store.
Beyond Conversation: Why Documents Transform Natural Language into Code - Filip Kozera
Move from chat to documents-as-specs, and from there to repeatable background agents triggered by emails and meetings. The human role becomes Tinder-style swipe-to-approve.
The Future of Work: Toran Bruce Richards, Silen Naihin et al
AutoGPT is doubling down on open-source, benchmark-driven, safety-conscious generalist agents — with code agents as the priority milestone toward AGI.
Effective AI Agents Need Data Flywheels, Not The Next Biggest LLM – Sylendran Arunagiri, NVIDIA
Sustained agent quality comes from a closed-loop data flywheel that uses production traces and feedback to fine-tune, not from bigger base models.
From Copilot to Colleague: Trustworthy Agents for High-Stakes - Joel Hron, CTO Thomson Reuters
For high-stakes domains, agentic AI requires expert-driven evals, tunable agency dials, and exposing legacy systems as agent tools.
Building Code First AI Agents with Azure AI Agent Service — Cedric Vidal, Microsoft
Azure AI Agent Service moves agent state, tools, and data integrations to the cloud, trading flexibility for a much simpler stateful dev model.
Stateful environments for vertical agents — Josh Purtell, Synth Labs
Wrap vertical-agent workspaces in resettable stateful environments to enable model swaps, multi-agent setups, rollbacks, and tree search out of the box.
Claude plays Minecraft!
Minecraft is a great agent playground; Amazon Bedrock Agents with Return-of-Control let you wire LLM tool use into Mineflayer for emergent in-world behavior.
Real AI Agents Need Planning, Not Just Prompting - Yuval Belfer
ReAct agents can't see the forest for the trees — explicit pre-execution planning (form- or code-based) is what makes complex agents reliable.
Beyond APIs: How AI Web Agents Are Automating the "Long Tail" of Knowledge Work
Browser-side AI web agents like Retriever automate the long tail of knowledge work where APIs don't exist — multi-tab scraping + dynamic third-party tool calls.
Building Reliable Support Agents Using the Effect Typescript Library - Michael Fester
Effect brings functional-programming rigor to TypeScript so LLM agents get retries, fallbacks, observability and type-safe dependency injection without rolling your own infra.
Which Jobs Can Be Replaced Today: Fryderyk Wiatrowski and Peter Albert
Replace the reactive layer of any job first via a trigger aggregator + browser actions; advance through prompting → cognitive architecture → fine-tuning → RL only as needed.
Grounded Reasoning Systems for Cloud Architecture - Iman Makaremi
Cloud-architecture AI needs grounded multi-agent reasoning over both text requirements and graph topology, not just IaC generation.
The missing pieces of workflow automation — Shirsha Chaudhuri, Thomson Reuters Labs
Enterprise agentic workflow automation is bottlenecked by legacy connectors, reliability metrics, standardization, and human-agent collaborative UX — not by model capability.
The Current State of Browser Agents - Jerry Wu and Wyatt Marshall
Browser agents are now feasible for read tasks and creeping into write tasks, but evaluation is hard and the underlying browser infrastructure can swing performance as much as the model.
Machines of Buying and Selling Grace - Adam Behrens, New Generation
Agentic commerce needs intent-level rails — unified product APIs, delegated payment auth, and dynamic merchant interfaces — not just bots clicking buy buttons on legacy sites.
Stop Guessing: Build Robust AI with Layered CoT
Adding a verification step after each reasoning hop converts brittle Chain-of-Thought into a self-correcting, auditable pipeline suitable for multi-agent systems.
Ionic Launch: Opening the economy to AI agents
Agents need agent-native commerce infrastructure (rich SKU data + single-step transactions) and a merchant-paid monetization model to scale beyond consumer subscriptions.