Building a Chess Coach - Anant Dole and Asbjørn Steinskog, Take Take Take
Takeaway
For domain skills that LLMs cannot perform reliably (such as chess), let dedicated systems compute the truth and use the LLM purely as a grounded translator into natural language. Then let Claude Code improve the pipeline itself, driven by user feedback.
Summary
- The Play Magnus team built an AI chess coach pipeline: Stockfish provides 'truth' evaluations; custom detectors find tactics (pins, forks, skewers) and positional themes; the University of Toronto's Maia model estimates the probability that a human at a given rating would play each move. To avoid hallucination, the LLM only translates this structured information into English.
- LLMs alone are terrible at chess (Grok was shown losing badly in a Kaggle Game Arena tournament), but DeepMind's transformer, trained on millions of (position, Stockfish-eval) pairs, plays at grandmaster level: the architecture isn't the issue, the training data is.
- The team closes the loop through a user-feedback channel: when users mark commentary as bad, the report goes to Slack and to Claude Code via a 'Channel' MCP server that injects events into a running session; Claude Code then runs a commentary-triage skill to iterate on prompts and detectors.
- Latency vs. quality trade-offs for a consumer app are also discussed; the commentary has to be produced at sub-3-second latency.
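The "LLM as grounded translator" idea from the summary can be sketched as follows. This is a minimal illustration, not the team's actual code: the `MoveSignals` fields, the `build_grounded_prompt` function, and the prompt wording are all assumptions made for the example. It presumes that the engine evaluation, detector output, and human-move probability have already been computed upstream, and it only shows how those facts might be packed into a prompt that forbids the model from adding chess claims of its own.

```python
from dataclasses import dataclass, field


@dataclass
class MoveSignals:
    """Facts computed outside the LLM. All field names are hypothetical."""
    move_san: str            # move played, e.g. "Qxb7"
    eval_before_cp: int      # engine eval before the move, centipawns, White's view
    eval_after_cp: int       # engine eval after the move, centipawns, White's view
    best_move_san: str       # engine's preferred move
    human_move_prob: float   # Maia-style probability that a human plays this move
    tactics: list[str] = field(default_factory=list)  # detector output, already in words
    themes: list[str] = field(default_factory=list)   # positional themes, already in words


def build_grounded_prompt(signals: MoveSignals, white_to_move: bool) -> str:
    """Turn structured signals into a prompt that restricts the LLM to the given facts."""
    # Express the eval swing from the mover's point of view,
    # so a negative number always means the mover lost ground.
    swing_cp = signals.eval_after_cp - signals.eval_before_cp
    if not white_to_move:
        swing_cp = -swing_cp

    facts = [
        f"Move played: {signals.move_san}",
        f"Engine best move: {signals.best_move_san}",
        f"Evaluation change for the mover: {swing_cp / 100:+.2f} pawns",
        f"Probability a human at this level plays it: {signals.human_move_prob:.0%}",
    ]
    facts += [f"Tactic: {t}" for t in signals.tactics]
    facts += [f"Theme: {t}" for t in signals.themes]

    return (
        "You are a chess commentator. Explain the move in plain English.\n"
        "Use ONLY the facts below; do not add evaluations, lines, or tactics of your own.\n\n"
        + "\n".join(f"- {fact}" for fact in facts)
    )
```

The design choice worth noting is that every chess claim reaches the model as text produced by a deterministic component, so a wrong sentence in the commentary can be traced back to a specific detector or prompt rather than to the model's own reasoning about the board.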
chess llm-grounding claude-code
Original description
LLMs can explain things clearly but can't play chess reliably. Take Take Take (Magnus Carlsen's app) solved this by separating concerns: Stockfish handles position evaluation, tactical and positional detectors extract concepts like forks, pins, and structural weaknesses, and the LLM's only job is translating those structured signals into English. Keeping the model as a translator rather than a reasoner is what makes it work at sub-3-second latency for a consumer app.
Anant Dole and Asbjørn Steinskog also walk through how they closed the feedback loop. When a user flags bad commentary, it posts to Slack and injects the event into a running Claude Code session via Channels, a new MCP feature in research preview. Claude investigates the position, modifies prompts or detectors, regenerates the commentary, and asks clarifying questions back through Slack. During the live demo, Anant was reviewing the PR from his phone.
Speaker info:
- https://www.linkedin.com/in/asbj%C3%B8rn-ottesen-steinskog-a8000241/
- https://www.linkedin.com/in/anantdole/
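The feedback loop in the description (a flagged comment going both to Slack and to a running Claude Code session) can be pictured as a simple fan-out. The sketch below is an assumption-laden illustration: `CommentaryFlag`, `format_flag`, and `route_flag` are hypothetical names, and the delivery targets are passed in as plain callables because the real Slack and Channels integrations are not described in enough detail to reproduce here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CommentaryFlag:
    """A user report about bad commentary. All field names are hypothetical."""
    game_id: str
    ply: int           # half-move number the commentary refers to
    fen: str           # position the commentary was generated for
    commentary: str    # the text the user objected to
    user_note: str     # what the user said was wrong


def format_flag(flag: CommentaryFlag) -> str:
    """Render the report as one message that both a human and an agent can act on."""
    return (
        f"Flagged commentary in game {flag.game_id}, ply {flag.ply}\n"
        f"Position (FEN): {flag.fen}\n"
        f"Commentary: {flag.commentary}\n"
        f"User note: {flag.user_note}"
    )


def route_flag(flag: CommentaryFlag, sinks: list[Callable[[str], None]]) -> int:
    """Send the same message to every sink and return how many received it."""
    message = format_flag(flag)
    for sink in sinks:
        sink(message)
    return len(sinks)
```

In a real deployment one sink would post to a Slack channel and another would hand the event to the agent session; in a test, `list.append` works as a sink, which keeps the routing logic checkable without any network access.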