← back
MCP Agent Fine tuning Workshop - Ronan McGovern
Takeaway
Bootstrap a stronger MCP agent by running Qwen-3 + Playwright MCP, saving multi-turn reasoning traces, and fine-tuning on its own successful runs.
Summary
- Ronan McGovern (Trellis Research YouTube) walks through running an MCP-tool-using agent (Playwright browser, 25 tools), collecting traces, then fine-tuning Qwen-3-30B-A3B (MoE, 3B active) on RunPod with VLLM.
- Three integration points: convert MCP tool info to OpenAI JSON schema, convert MCP tool responses into the format LLM expects, and detect/extract tool calls from LLM text — uses Hermes parser even for Qwen.
- System prompt instructs LLM to emit tool calls as JSON inside <tool_call> XML tags; max_model_len 32K, automatic tool choice, reasoning parser to detect <think> tokens.
- Naively truncates long tool responses (e.g., Playwright accessibility tree from a page) — flagged as needing deeper handling in real impls.
- Generates traces with the same model family being fine-tuned (Qwen→Qwen) because OpenAI models don't expose reasoning traces — keeps reasoning style consistent.
fine-tuningmcpqwen
Original description
This is a hands on workshop where students will run an agent with access to MCP servers (a playwright browser, although others can be added), generate high quality reasoning traces, and then train a Qwen3 model on those traces. Students will learn: - How to generate high quality MCP agent reasoning traces, via an OpenAI style endpoint - How to save tools and multi-turn traces - Fine-tune a Qwen3 model on those traces with unsloth - Run the fine-tuned model