
MCP Agent Fine tuning Workshop - Ronan McGovern

2.6K views · Jun 03, 2025 · 35:29 min
Takeaway

Bootstrap a stronger MCP agent by running Qwen3 with the Playwright MCP server, saving its multi-turn reasoning traces, and fine-tuning the model on its own successful runs.

Summary

  • Ronan McGovern (Trellis Research YouTube) walks through running an MCP tool-using agent (Playwright browser, 25 tools), collecting traces, then fine-tuning Qwen3-30B-A3B (MoE, 3B active parameters) on RunPod with vLLM.
  • Three integration points: convert MCP tool info to the OpenAI JSON schema, convert MCP tool responses into the format the LLM expects, and detect and extract tool calls from the LLM's text output. The Hermes parser is used even for Qwen.
  • The system prompt instructs the LLM to emit tool calls as JSON inside <tool_call> XML tags; the vLLM server runs with a max_model_len of 32K, automatic tool choice, and a reasoning parser that detects <think> tokens.
  • Long tool responses (e.g., the Playwright accessibility tree of a page) are naively truncated; this is flagged as needing deeper handling in real implementations.
  • Traces are generated with the same model family that is being fine-tuned (Qwen→Qwen) because OpenAI models don't expose reasoning traces; this keeps the reasoning style consistent.
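The three integration points and the naive truncation described in the summary can be sketched roughly as follows. This is a minimal illustration, not the workshop's actual code: the function names, the 4000-character limit, and the "[truncated]" marker are assumptions made for this example. It relies only on the MCP tool descriptor fields (name, description, inputSchema) and the OpenAI-style function tool layout.

```python
import json
import re


def mcp_tool_to_openai(tool: dict) -> dict:
    """Map an MCP tool descriptor to an OpenAI-style function tool entry.

    MCP tools carry a JSON Schema under `inputSchema`; the OpenAI format
    expects the same schema under `function.parameters`.
    """
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }


# Hermes-style tool calls: a JSON object wrapped in <tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return every well-formed tool call found in raw model output."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of failing the whole turn
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {
                    "name": payload["name"],
                    "arguments": payload.get("arguments", {}),
                }
            )
    return calls


def truncate_tool_response(text: str, max_chars: int = 4000) -> str:
    """Naively cut a long tool response (hypothetical limit and marker)."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[truncated]"
```

In a real agent loop, each converted tool would go into the `tools` field of the chat completion request, and each truncated tool result would be appended to the conversation as a tool message. When the server already parses tool calls (as with a Hermes parser enabled), a client-side extractor like this serves mainly as a fallback.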
fine-tuning · mcp · qwen
Original description
This is a hands-on workshop where students will run an agent with access to MCP servers (a Playwright browser, although others can be added), generate high-quality reasoning traces, and then train a Qwen3 model on those traces.

Students will learn:
- How to generate high-quality MCP agent reasoning traces via an OpenAI-style endpoint
- How to save tools and multi-turn traces
- How to fine-tune a Qwen3 model on those traces with Unsloth
- How to run the fine-tuned model
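
One possible way to save tools and multi-turn traces, and to keep only successful runs for fine-tuning, is a JSONL file with one record per run. The sketch below is an assumption for illustration: the record layout (tools, messages, success) and the function names are hypothetical and are not taken from the workshop materials.

```python
import json


def save_trace(path: str, tools: list, messages: list, success: bool) -> None:
    """Append one agent run (tool schemas plus full conversation) as a JSONL row."""
    record = {"tools": tools, "messages": messages, "success": success}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_successful_traces(path: str) -> list:
    """Read the JSONL file back and keep only runs marked as successful."""
    traces = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if record.get("success"):
                traces.append(record)
    return traces
```

Storing the tool schemas alongside the messages means each training example can later be rendered with the same tool list the model saw at generation time.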