LLM Quality Optimization Bootcamp: Thierry Moreau and Pedro Torruella

1.1K views · Feb 08, 2025 · 53:05 min · Watch on YouTube ↗

Takeaway

Fine-tune only after prompt-eng and RAG plateau; small Llama 3 + OpenPipe can deliver 47% accuracy gains and 200x cost cuts on narrow tasks like PII redaction.

Summary

OctoAI's Thierry Moreau & Pedro Torruella position fine-tuning as stage 3 of LLM quality optimization (crawl=prompt eng, walk=RAG, run=fine-tuning).
Use case demo: PII redaction—fine-tuning Llama 3 via OpenPipe achieved 47% better accuracy than GPT-4 Turbo and a 99.5% cost reduction (200x) when deployed on OctoAI.
Best for narrow specific tasks (classification, extraction, formatting, function calling) where prompt+RAG plateau; full deployment cycle: data collection → fine-tune → deploy → eval, then iterate.
Demystifies fine-tuning as accessible via SaaS (OpenPipe + OctoAI) without managing GPUs or cloud instances.

fine-tuningopenpipecost-optimization

Original description

Lunch & Learn: In this bootcamp we demonstrate how smaller open source models, fine-tuned to excel at specialized tasks (e.g. classification, function calling etc.) can deliver comparable if not improved quality over proprietary state of the art models, resulting in significant cost reduction (over 60x) in your GenAI usage costs.

We’ll learn how to (1) collect and curate a fine tuning dataset, (2) fine tune an open source model (Llama 3), and (3) deploy the model fine tune into production with OctoAI and OpenPipe.

Follow the prerequisite instructions contained within the collab notebook (takes about 2 mins of setup): https://colab.research.google.com/drive/1DVw6vfEtzYV7QfcVXhmmjTVBOiJiJ02b#scrollTo=fUT1waisoXFs

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Thierry
Thierry Moreau (University of Washington Ph.D.) is a co-founder of OctoAI and firm believer in open source, and spends his time to educating people on how to build more efficient and safer AI systems as OctoAI’s head of DevRel

About Pedro
Pedro Torruella is a Senior DevRel Engineer at OctoAI. Pedro started his engineering career in Real Time Video Processing. He has implemented algorithms in both hardware and software, coordinated and led multinational teams, and founded his own startup. He shines on making the bridge between tech and people, and currently focuses on helping others use AI to create products that users love.