💻 Code Generation
AI coding tools and agents. Cursor, Devin, Copilot internals, SWE-bench, agentic refactoring, repo-scale understanding.
The workflow
flowchart LR
A[Task / prompt] --> B[Retrieve repo context<br/>files + symbols]
B --> C[LLM emits diff<br/>or whole file]
C --> D[Execute / test<br/>in sandbox]
D --> E{Pass?}
E -->|No| F[Self-repair<br/>read error → patch]
F --> C
E -->|Yes| G[PR / commit<br/>+ review]
Coding agents are the first place where AI delivers full economic value. The eval is "did the test pass?"
Key takeaways
Videos (79)
"Software Fundamentals Matter More Than Ever" - Matt Pocock
AI coding works best when paired with classical software discipline: shared design concepts, ubiquitous language, and resistance to entropy.
Full Walkthrough: Workflow for AI Coding - Matt Pocock
Treat LLM context as a finite 'smart zone' and structure coding workflows (small tasks, looping plans, compaction) to stay inside it.
No Vibes Allowed: Solving Hard Problems in Complex Codebases - Dex Horthy, HumanLayer
Advanced context engineering (small, intentional, reviewed context windows) beats raw model power for complex brownfield coding work.
The Infinite Software Crisis - Jake Nations, Netflix
AI codegen optimizes the mechanics Brooks said were never the bottleneck, so the only sustainable fix is to keep doing the hard human work of designing simple, untangled systems.
Does AI Actually Boost Developer Productivity? (100k Devs Study) - Yegor Denisov-Blanch, Stanford
Rigorous private-repo measurement shows AI coding tools yield a real but modest ~15-20% net productivity gain, largely offset by rework on AI-generated code.
Building pi in a World of Slop - Mario Zechner
Coding-agent harnesses should be tiny, observable, and self-modifiable so the model, already trained as a coding agent, owns its own context.
Claude Code & the evolution of agentic coding - Boris Cherny, Anthropic
Coding models are advancing faster than coding products, so Anthropic ships Claude Code as a minimal unopinionated surface to let users discover the right agentic UX.
Harness Engineering: How to Build Software When Humans Steer, Agents Execute - Ryan Lopopolo, OpenAI
Treat agents as infinitely-parallel implementers and invest your time in writing guardrails and persona-oriented docs that make 'a good job' legible to them.
Defying Gravity - Kevin Hou, Google DeepMind
Antigravity reframes the IDE around an agent manager that orchestrates editor and browser agents, leveraging Gemini 3's longer-horizon multimodal tool use.
How Claude Code Works - Jared Zoneraich, PromptLayer
Claude Code works because it strips scaffolding and trusts a tool-calling-tuned model in a simple agentic loop; the lesson is to delete complexity, not add it.
From Vibe Coding To Vibe Engineering - Kitze, Sizzy
Embrace AI-assisted coding but layer engineering practices on top so you graduate from vibe coding's gambling loop to disciplined vibe engineering.
Vibes won't cut it - Chris Kelly, Augment Code
AI accelerates code generation, but production software engineering (changing systems safely at scale) still requires human judgment, context, and rigorous practices.
How Windsurf writes 90% of your code with an Agentic IDE - Kevin Hou, Windsurf
Tight editor-agent integration via a shared action timeline is what lets Windsurf write the majority of code while keeping users in flow.
Agentic Engineering: Working With AI, Not Just Using It - Brendan O'Leary
Productivity gains from AI coding come from explicit context engineering and treating agents like fast but judgment-poor juniors you must direct.
AI Engineering at Jane Street - John Crepezzi
For obscure internal languages, custom fine-tuning on workspace snapshots + build-state transitions beats off-the-shelf coding models.
2026: The Year The IDE Died - Steve Yegge & Gene Kim, Authors, Vibe Coding
The future of coding tools is swarms of specialized agents orchestrated through a UI, not a single muscular agent inside an IDE.
Making Codebases Agent Ready - Eno Reyes, Factory AI
The bottleneck for agent productivity in enterprises is automated-validation rigor in the codebase, not which coding agent you buy.
Amp Code: Next Generation AI Coding - Beyang Liu, Amp Code
Effective coding agents come from curated tools, dedicated subagents for search/reasoning, and a review UI, not from piling on MCP servers.
Collaborative AI Engineering: One Dev, Two Dozen Agents, Zero Alignment - Maggie Appleton, GitHub
Coding agents have collapsed implementation cost, so alignment must become continuous and multiplayer, not a PR-time afterthought.
Spec-Driven Development: Agentic Coding at FAANG Scale and Quality - Al Harris, Amazon Kiro
Spec-driven development with EARS requirements and property-based testing gives agentic coding a verifiable throughline from intent to shipped code.
Self Coding Agents - Colin Flaherty, Augment Code
An AI coding agent given basic process and tool primitives can write the bulk of its own codebase, including features, tests, and self-profiling optimizations.
OpenAI Codex Masterclass - Vaibhav Srivastav & Katia Gil Guzman
Codex is positioned as a full software-engineering teammate across surfaces, with sub-agents, work trees, and plugins as the new productivity primitives.
Building Cursor Composer - Lee Robinson, Cursor
Cursor Composer trades raw frontier-model intelligence for 4x speed by RL-training on a production-mirroring environment with custom MoE kernels.
The emerging skillset of wielding coding agents - Beyang Liu, Sourcegraph / Amp
Coding agents need a new operating mode (direct edits, oversight not approval, usage-based pricing, model-coupled scaffolding); old chatbot intuitions actively hurt.
The State of AI Code Quality: Hype vs Reality - Itamar Friedman, Qodo
You don't break the AI productivity glass ceiling with better autocomplete; you break it with agentic, dynamically-learning quality workflows across the whole SDLC.
Developer Experience in the Age of AI Coding Agents - Max Kanat-Alexander, Capital One
Make your codebase, tooling, and documentation match how the rest of the industry works so agents inherit decades of training-set fluency.
Building your own software factory - Eric Zakariasson, Cursor
Treat your codebase like a factory: structure for discoverability, add guardrails reactively, and invest in verifiable end-to-end checks.
Software Development Agents: What Works and What Doesn't - Robert Brennan, OpenHands
Coding agents become genuinely productive when you scope tasks tightly, sandbox aggressively, and treat the agent as the inner-loop coder while humans own architecture.
Devin 2.0 and the Future of SWE - Scott Wu, Cognition
Coding agents are scaling on a 70-day doubling curve; product strategy must re-baseline every model generation as Devin moves from migrations to multi-file feature work.
Software Engineering Is Becoming Plan and Review - Louis Knight-Webb, Vibe Kanban
As coding agents take longer per run, your job collapses to planning and reviewing; invest in specs to amortize review cost across parallel agents.
Real World Development with GitHub Copilot and VS Code - Harald Kirschner, Christopher Harrison
Vibe coding maps to a maturity curve from YOLO prototyping to disciplined guardrails; VS Code + Copilot agent mode supports all three stages.
Rust is the language of the AGI - Michael Yuan
Rust's strict compiler turns 'hard to write' into a strong reward signal, making it the ideal target language when machines, not humans, are doing most of the coding.
The Making of Devin by Cognition AI: Scott Wu
Devin demonstrates that autonomous software engineers using human tooling (terminal, git, PRs) are the next phase beyond code-completion.
Beyond the Prototype: Using AI to Write High-Quality Code - Josh Albrecht, Imbue
Quality AI-generated code comes from synchronous prevention plus detection loops (plan, spec, lint, test, LLM-review), not from picking the smartest base model.
Building AI Agents with Real ROI in the Enterprise SDLC: Bruno (Booking.com) & Beyang (Sourcegraph)
Real enterprise AI-coding ROI comes from picking honest KPIs, training daily users, and giving agents deep codebase context for legacy migration work.
Replacing 12K LoC with a 200 LoC Skill - David Gomes, Cursor
Cursor replaced ~12K lines of worktree/best-of-N feature code with ~200 lines of markdown agent skills, demonstrating skills can collapse complex agent features.
Mergeable by default: Building the context engine to save time and tokens - Peter Werry, Unblocked
Reliable background coding agents need a real context engine (curated, relationship-aware organizational knowledge), not just RAG or MCP plumbing.
Piloting agents in GitHub Copilot - Christopher Harrison, Microsoft
Treat GitHub Copilot like a pair programmer: readable code, clear instructions, and explicit context matter more than clever prompting.
Vibe Engineering Effect Apps - Michael Arnaldi, Effectful
To make coding agents useful on undocumented libraries, drop the library repo straight into your project so the agent's code-focused training kicks in.
Future-Proof Coding Agents - Bill Chen & Brian Fioca, OpenAI
Don't port prompts across models; match instructions to the model's trained habits, or use a co-developed model+harness like Codex.
Windsurf everywhere, doing everything, all at once - Kevin Hou, Windsurf
The next leap for AI coding tools is a shared human-AI timeline that lets parallel agents understand context and act without losing the user's intent.
RL for Autonomous Coding - Aakanksha Chowdhery, Reflection.ai
Reinforcement learning with automated verifiers (tests, compilers) is the unlock for autonomous coding because correctness can be checked cheaply at scale.
Your Coding Agent Just Got Cloned And Your Brain Isn't Ready - Rustin Banks, Google Jules
Async cloud coding agents like Jules unlock parallel multitasking and multi-variation development, but require AI at both ends of the workflow to remain sane.
Vibe Coding with Confidence - Itamar Friedman, Qodo
CLI-based agentic workflows that fold review and testing into authoring are the path from vibe-coded prototypes to trustworthy enterprise software.
Ship Production Software in Minutes, Not Months - Eno Reyes, Factory
Agent-native development means standardizing how your org thinks so droids can ingest your context and orchestrate work across the whole software lifecycle.
Code Generation and Maintenance at Scale: Morgante Pell
Large-scale code migrations need static analysis + semantic search + agentic execution; vanilla embedding RAG retrieves too much irrelevant code to be useful.
AI powered entomology: Lessons from millions of AI code reviews - Tomas Reimers, Graphite
AI code review only works when comments stay in the 'LLM-can-catch AND human-wants-it' quadrant, measured by downvotes plus whether developers actually act on them.
LLM codegen fails and how to stop 'em - Danilo Campos, PostHog
Fight LLM codegen hallucination by tool-calling fresh markdown docs into context; context windows are big enough that RAG isn't required.
The Cure for the Vibe Coding Hangover - Corey J. Gallon, Rexmore
Replace vibes with discipline: specs not prompts, atomic features, dependency-driven order, and executable tests before any agent writes code.
How to Improve your Vibe Coding - Ian Butler
Vibe coding only works with explicit rules, careful context management, and thinking models; naive setups generate alert fatigue from false positives.
GitHub Copilot: The World's Most Widely Adopted AI Developer Tool
Copilot has become a full-stack developer assistant (IDE completion, chat, PR summaries, knowledge-base Q&A) across every major IDE and SCM.
Welcome to AIE CODE - Jed Borovik, Google DeepMind
AIE CODE Summit positions AI coding as the single most important applied-AI problem and brings a focused, single-track audience together around it.
Building AI For All: Amjad Masad & Michele Catasta
Replit is bundling AI coding assistance plus inference (Model Farm) free to all users to ensure the next billion developers aren't gated from AI-enhanced programming.
Enhancing Quality and Security in CI: Gunjan Patel
Use CI's slack time as the venue for slow, deliberate AI passes (rename/comment/test/security): Copilot does the fast thinking, GhostPilot does the slow thinking.
GPT Web App Generator - 10,000 apps created in a month: Matija Sosic
Constraining the codegen target (Wasp + React/Node) and tiering GPT-4 planning with GPT-3.5 implementation produces cheap, reliable full-stack app generation.
The Many Ends of Programming - Ray Myers
Map AI's impact on programming across six scenarios (from extreme completion to garbage pile) instead of treating it as a binary apocalypse.
To the moon! Navigating deep context in legacy code with Augment Agent - Forrest Brazeal, Matt Ball
Augment Agent positions its proprietary context engine as the real differentiator for AI on legacy codebases; model quality matters less than what you feed it.
The Code AI Maturity Model and What It Means For You: Ado Kukic
Code AI follows a six-level autonomy ladder from autocomplete to full SDLC ownership; most teams sit at L2-L3 today.
What Data from 20m Pull Requests Reveal About AI Transformation - Nick Arcolano, Jellyfish
Interactive coding tools are delivering measurable 2x throughput and faster cycles without quality regressions, while fully autonomous agents remain pre-production at most companies.
Supercharging developer workflow with Amazon Q Developer - Vikash Agrawal
Amazon Q Developer covers the full SDLC (CLI + IDE slash commands plus GitHub and AWS console integration), letting one developer hand off planning, testing, docs, and deploy to the agent.
Vision: Zero Bugs - Johann Schleier-Smith, Temporal
Aerospace shows zero-bug software is possible with N-version, formal-method, and process-driven engineering, a discipline AI-generated code increasingly needs.
Mastering Engineering Flow with Windsurf - Eashan Sinha, Windsurf
Windsurf bets that real productivity gains come from tracking implicit user intent and pairing agentic tool use with explicit workflows, not just better chat.
The AI Evolution: Mario Rodriguez, GitHub
GitHub Copilot's $100M+ ARR rests on UX (ghost text), latency engineering, and Codex/GPT-3.5 (not just the model), and post-PMF success requires global infra and rigorous online evals.
Move Fast Break Nothing: Dedy Kredo
Reliable codegen needs a generator+critic pair: behavior-driven tests plus automated review flag the security and correctness issues humans miss.
Copilots Everywhere: Thomas Dohmke and Eugene Yan
GitHub's bet is human-centric copilots that bridge issue-to-PR while keeping the developer in flow: augmentation, not autonomous replacement.
The Rise of the AI Software Engineer: Jesse Han
Morph's bet: combine static analysis + embeddings + graphs into a queryable code index that both retrieves context for any coding agent and generates fine-tuning data.
Mentoring the Machine - Eric Hou, Augment Code
Engineers should treat coding agents like mentees: invest in teaching them the codebase so they absorb the firefighting that destroys deep work.
[Full Workshop] Vibe Coding at Scale: Customizing AI Assistants for Enterprise Environments
Move enterprises through yolo -> structured -> spectrum vibes by adding guardrails (workspace settings, popular stacks, auto-approval policies) without killing the creative speed.
How Coding Agents change Software Development Forever - Hailong Zhang
Effective coding agents are narrow, evaluation-grounded, and built on a shared agent-OS abstraction, not monolithic 'software engineer' bots.
Developing Taste in Coding Agents: Applied Meta Neuro-Symbolic RL - Ahmad Awais, CommandCode
Coding agents need a learned, inspectable 'taste' layer (meta neuro-symbolic preference modeling) to move beyond sloppy defaults and brittle rules files.
Vibe Coding at Scale: Customizing AI Assistants for Enterprise Environments - Harald Kirschner
Enterprise vibe coding works when you graduate from YOLO mode to templates plus MCP-based tools that constrain agents to your stack and conventions.
Collaborating with Agents in your Software Dev Workflow - Jon Peck & Christopher Harrison, Microsoft
Treat GitHub Copilot as a pair programmer reading your code: better naming, comments, and explicit intent are still the highest-leverage prompt engineering for it.
[Workshop from Microsoft] Github Copilot - The World's Most Widely Adopted AI Developer Tool
GitHub Copilot's effectiveness leans heavily on the context (open tabs, comments, examples) developers provide; chat unlocks broader code Q&A while business code stays out of training.
The Eyes Are The (Context) Window to The Soul: How Windsurf Gets to Know You - Sam Fertig, Windsurf
Coding agents that 'know you' rely on heuristics (user behavior) + hard evidence (codebase) plus relevance-first context retrieval, not bigger windows.
The Agent Awakens: Collaborative Development with Copilot - Christopher Harrison, GitHub
Treat GitHub Copilot like an AI pair programmer: write clear code, state clear intent, and pick the right mode, from completions to autonomous coding agent.
Don't get one-shotted: Use AI to test, review, merge, and deploy code - Tomas Reimers, Graphite
Build AI-native outer-loop tooling (review, CI, merge, deploy) because the inner-loop speedup is creating a review bottleneck and security risk.
Unlocking AI Powered DevOps Within Your Organization - Jon Peck, GitHub
Roll out AI DevOps by training brownfield prompting habits, codifying team standards in copilot-instructions.md, and curating exemplar knowledge bases.
GitHub Next Explorations: Rahul Pandita
GitHub Next bets on staged exploration: code-completion is evolving to multi-location 'Next Edit Suggestions' and to full inner-loop task completion via Copilot Workspace.
GitHub's AI Powered Security Platform: Sarah Khalife
GitHub is shifting from Copilot-only AI to AI embedded across the platform, with AppSec autofix and triage as a major investment area.