Infra that fixes itself, thanks to coding agents — Mahmoud Abdelwahab, Railway
Takeaway
Combine durable workflows with a headless coding agent (OpenCode) so infrastructure issues become reviewable pull requests instead of pages.
Summary
- Railway demo shows services with memory leaks, 94% error rates and 30s response times triaged by a scheduled coding-agent workflow that opens PRs to fix them.
- Workflow: every 10-30 min, fetch the project architecture plus per-service resource and HTTP metrics, identify threshold-breaching services, gather their logs and upstream status, then write a detailed plan.
- Uses Inngest for durable execution, so each step is cached and retried independently: a failed DB write doesn't re-run transcription.
- Plan handed to OpenCode (open-source Claude Code alternative with headless server) which clones the repo, creates a todo, implements fixes, opens a PR.
- Argues threshold-based alerts are noisy; time-slice analysis with full context produces better detection.
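The step caching described above can be illustrated with a small sketch. This is not Inngest's API or the Railway Autofix code: `StepRunner`, `triage`, the step names, and the threshold values are all hypothetical, and a plain dict stands in for a durable store. It only shows the idea that a completed step's result is saved, so a retry after a later failure skips the work that already succeeded.

```python
import json
from typing import Any, Callable, Dict, List, Optional


class StepRunner:
    """Toy stand-in for a durable-execution engine (hypothetical, not Inngest)."""

    def __init__(self, store: Optional[Dict[str, str]] = None) -> None:
        # The store outlives a single attempt; here it is just a dict.
        self.store: Dict[str, str] = store if store is not None else {}

    def run(self, step_id: str, fn: Callable[[], Any]) -> Any:
        # A step that already completed returns its saved result.
        if step_id in self.store:
            return json.loads(self.store[step_id])
        result = fn()  # if this raises, nothing is saved for this step
        self.store[step_id] = json.dumps(result)
        return result


# Assumed limits, for illustration only.
THRESHOLDS = {"error_rate": 0.05, "p95_latency_s": 2.0}


def breaching(metrics: Dict[str, Dict[str, float]]) -> List[str]:
    """Names of services with at least one metric above its limit."""
    return sorted(
        name
        for name, m in metrics.items()
        if any(m[key] > limit for key, limit in THRESHOLDS.items())
    )


def triage(step: StepRunner, fetch_metrics, write_plan) -> List[str]:
    metrics = step.run("fetch-metrics", fetch_metrics)
    unhealthy = step.run("find-breaches", lambda: breaching(metrics))
    step.run("write-plan", lambda: write_plan(unhealthy))
    return unhealthy


# Demo: the plan write fails once, then the whole workflow is retried.
SAMPLE = {
    "api": {"error_rate": 0.94, "p95_latency_s": 30.0},
    "web": {"error_rate": 0.01, "p95_latency_s": 0.2},
}
calls = {"fetch": 0, "plan": 0}
store: Dict[str, str] = {}


def fetch_metrics():
    calls["fetch"] += 1
    return SAMPLE


def flaky_write_plan(services):
    calls["plan"] += 1
    if calls["plan"] == 1:
        raise RuntimeError("simulated transient failure")
    return {"services": services}


try:
    triage(StepRunner(store), fetch_metrics, flaky_write_plan)
except RuntimeError:
    pass  # first attempt fails at the last step

result = triage(StepRunner(store), fetch_metrics, flaky_write_plan)
```

On the second attempt only the failed step runs again: the metrics fetch and the breach check are read back from the store, which is the property the summary attributes to durable execution.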
coding-agents, sre, durable-workflows
Original description
This talk shows how we built Railway Autofix, a plug-in template you can drop into any Railway project to monitor your infrastructure and open PRs with fixes when issues are detected. We use OpenCode as our coding agent and Inngest for durable execution. The final code will be live at https://github.com/m-abdelwahab/railway-autofix

Mahmoud Abdelwahab is a Software Engineer who works at the intersection of Product, Marketing, Education and Community. He loves building over-engineered demos and playing around with the latest technologies.

---
Socials:
- LinkedIn: https://linkedin.com/in/thisismahmoud
- X (Twitter): https://x.com/thisismahmoud
- GitHub: https://github.com/m-abdelwahab
- Company: Railway (https://railway.com)