MCP Agent Fine tuning Workshop - Ronan McGovern

2.6K views · Jun 03, 2025 · 35:29 min

Takeaway

Bootstrap a stronger MCP agent by running Qwen3 with the Playwright MCP server, saving multi-turn reasoning traces, and fine-tuning the model on its own successful runs.

Summary

Ronan McGovern (Trellis Research YouTube) walks through running an MCP tool-using agent (Playwright browser, 25 tools), collecting traces, then fine-tuning Qwen3-30B-A3B (MoE, 3B active parameters) on RunPod with vLLM.
Three integration points: convert MCP tool definitions to the OpenAI JSON schema, convert MCP tool responses into the format the LLM expects, and detect and extract tool calls from the LLM's text output. The Hermes parser is used even for Qwen.
The system prompt instructs the LLM to emit tool calls as JSON inside <tool_call> XML tags; serving settings include a max_model_len of 32K, automatic tool choice, and a reasoning parser to detect <think> tokens.
Long tool responses (e.g., the Playwright accessibility tree of a page) are naively truncated; this is flagged as needing more careful handling in real implementations.
Traces are generated with the same model family that is being fine-tuned (Qwen→Qwen) because OpenAI models don't expose reasoning traces; this also keeps the reasoning style consistent.
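
The three integration points and the naive truncation described above can be sketched roughly as follows. This is a minimal illustration, not the workshop's actual code: the function names, the sample tool entry, and the 4000-character limit are assumptions. The tool and result shapes follow the general MCP conventions (`name`, `description`, `inputSchema` for tools; a `content` list of typed parts for results).

```python
import json
import re

# Illustrative MCP tool entry, shaped like one item of a `tools/list` result.
mcp_tool = {
    "name": "browser_navigate",
    "description": "Navigate to a URL",
    "inputSchema": {
        "type": "object",
        "properties": {"url": {"type": "string"}},
        "required": ["url"],
    },
}


def mcp_tool_to_openai(tool: dict) -> dict:
    """Integration point 1: wrap an MCP tool in the OpenAI function-tool envelope."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool.get("inputSchema", {"type": "object", "properties": {}}),
        },
    }


def mcp_result_to_tool_message(tool_call_id: str, result: dict, max_chars: int = 4000) -> dict:
    """Integration point 2: flatten MCP text parts into an OpenAI-style tool message.

    Long outputs are cut off naively; max_chars is an assumed limit.
    """
    text = "\n".join(
        part.get("text", "")
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if len(text) > max_chars:
        text = text[:max_chars] + "\n[truncated]"
    return {"role": "tool", "tool_call_id": tool_call_id, "content": text}


# Matches the JSON payload between <tool_call> and </tool_call> tags.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Integration point 3: pull JSON tool calls out of raw model text."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1).strip()))
        except json.JSONDecodeError:
            continue  # skip malformed calls rather than failing the whole turn
    return calls
```

When the serving layer already parses tool calls (for example, through a Hermes-style tool-call parser), the third function is only needed as a fallback for raw text output.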

fine-tuning · mcp · qwen

Original description

This is a hands-on workshop where students will run an agent with access to MCP servers (a Playwright browser, although others can be added), generate high-quality reasoning traces, and then train a Qwen3 model on those traces.

Students will learn:
- How to generate high-quality MCP agent reasoning traces via an OpenAI-style endpoint
- How to save tools and multi-turn traces
- How to fine-tune a Qwen3 model on those traces with Unsloth
- How to run the fine-tuned model
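
As a rough sketch of what saving tools and multi-turn traces might look like, the snippet below appends each run to a JSONL file and later keeps only the runs marked successful. The record layout (`tools`, `messages`, `success`) and both function names are assumptions made for illustration; the description does not specify the workshop's actual trace format.

```python
import json
from pathlib import Path


def save_trace(path, messages, tools, success):
    """Append one agent run (tool schemas + multi-turn messages) as a JSON line."""
    record = {"tools": tools, "messages": messages, "success": success}
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_successful_traces(path):
    """Read the JSONL file back and keep only runs flagged as successful."""
    traces = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            if record.get("success"):
                traces.append(record)
    return traces
```

Storing the tool schemas alongside the messages keeps each record self-contained, so the same tools can be rendered into the prompt when the traces are formatted for fine-tuning.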