Everything you need to know about Fine-tuning and Merging LLMs: Maxime Labonne

30.3K views · Sep 25, 2024 · 17:52 min

Takeaway

Fine-tune only when prompt engineering plateaus, then layer SFT + DPO + LoRA/QLoRA and consider model merging to combine specialized variants cheaply.
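To make the LoRA part of the takeaway concrete: LoRA freezes a pretrained weight matrix and trains only a low-rank update made of two small matrices, which is why it is so much cheaper than full fine-tuning. The sketch below only counts trainable parameters for a single linear layer. The layer size (4096 x 4096) and rank (16) are illustrative assumptions, not figures from the talk.

```python
# Minimal sketch: trainable parameters for one linear layer,
# full fine-tuning vs. a LoRA adapter of a given rank.
# The dimensions and rank below are illustrative assumptions.

def full_params(d_in: int, d_out: int) -> int:
    """Full fine-tuning updates every entry of the d_out x d_in weight matrix."""
    return d_in * d_out

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA trains A (rank x d_in) and B (d_out x rank); the base weight stays frozen."""
    return rank * (d_in + d_out)

d_in = d_out = 4096  # assumed hidden size
rank = 16            # assumed LoRA rank

full = full_params(d_in, d_out)
lora = lora_params(d_in, d_out, rank)
ratio = lora / full

print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
print(f"LoRA trains {ratio:.2%} of the layer")
```

QLoRA keeps the same adapter but additionally stores the frozen base weights in a quantized format, which reduces memory further.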

Summary

LLM lifecycle: pre-training → supervised fine-tuning (SFT) → preference alignment (DPO is the most popular of DPO, PPO, KTO, and IPO).
Decision flow: start with prompt engineering, and fine-tune only if an evaluation gap remains and you can build an instruction dataset; an a16z survey shows enterprises value customizability and control.
Recommended libraries: Unsloth, TRL, Axolotl, LLaMA-Factory; SFT data should be optimized for accuracy, diversity, and complexity (e.g., chain-of-thought reasoning).
SFT techniques rank full fine-tuning > LoRA > QLoRA in both quality and cost; the learning rate is the most important hyperparameter, so raise it as high as you can before the loss explodes.
Model merging combines the weights of multiple fine-tunes without retraining; the talk also covers data deduplication, reward-model filtering, and topic clustering for dataset quality.
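As a rough illustration of what "combining weights without retraining" can mean, here is a minimal sketch of a linear (weighted-average) merge. This is only one simple merging scheme and not necessarily the method shown in the talk. The toy "models" are plain dicts of parameter name to list of floats, and `linear_merge`, `math_model`, and `code_model` are hypothetical names introduced for this example.

```python
# Minimal sketch: linear (weighted-average) merge of same-shaped weights.
# Toy "state dicts" map a parameter name to a flat list of floats.
# All names here are hypothetical; real merges operate on tensors.

def linear_merge(state_dicts, weights):
    """Return the weighted average of several same-shaped state dicts."""
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    merged = {}
    for name in state_dicts[0]:
        # Group the i-th value of this parameter across all models.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(w * v for w, v in zip(norm, col)) for col in columns]
    return merged

# Two toy fine-tunes of the same base architecture (assumed values).
math_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
code_model = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = linear_merge([math_model, code_model], weights=[0.5, 0.5])
print(merged)
```

No gradient step is involved: the merged parameters are computed directly from the existing checkpoints, which is what makes merging cheap compared with another round of fine-tuning.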

fine-tuning · lora · model-merging

Original description

Fine-tuning LLMs is a fundamental technique for companies to customize models for their specific needs. In this talk, we will cover when fine-tuning is appropriate, popular libraries for efficient fine-tuning, and key techniques. We will explore both supervised fine-tuning (LoRA, QLoRA) and preference alignment (PPO, DPO, KTO) methods.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Maxime
Maxime Labonne is a Senior Staff Machine Learning Scientist at Liquid AI, serving as the head of post-training. He holds a Ph.D. in Machine Learning from the Polytechnic Institute of Paris and is recognized as a Google Developer Expert in AI/ML. An active blogger, he has made significant contributions to the open-source community, including the LLM Course on GitHub, tools such as LLM AutoEval, and several state-of-the-art models like NeuralBeagle and Phixtral. He is the author of the best-selling book “Hands-On Graph Neural Networks Using Python,” published by Packt. Connect with him on X and LinkedIn @maximelabonne.