Fine tune 20 Llama Models in 5 Minutes: Santosh Radha

1.3K views · Feb 09, 2025 · 6:25 min

Takeaway

With Python decorators, Covalent lets you fan out fine-tuning and deploy auto-scaling inference endpoints across heterogeneous GPUs without touching Docker or Kubernetes.

Summary

Covalent is an open-source/SaaS Python decorator layer that ships local Python functions to remote GPUs (H100/L4/V100) without Docker/Kubernetes.
The decorator specifies the hardware (e.g. 24 CPU + 1 H100 48GB, 18hr max); the function runs remotely, and you pay only for the seconds actually used (one eval ran 6 min and cost 87 cents).
Demo workflow, all in one Python script: loop over many models, fine-tune each on its assigned GPU, evaluate on a CPU machine, sort the results, pick the best, and deploy it as an auto-scaling /generate endpoint.
Custom autoscaling (scale to zero, scale at 9am daily, scale on GPU util %, scale on request count); auth and infra abstracted away.
Lets users bring their own compute (cloud or on-prem) or use Covalent's managed GPU cluster.
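
The fan-out pattern in the summary above can be sketched in plain Python. This is a minimal, local-only illustration and does not use Covalent's actual API: the `remote` decorator, the `Hardware` class, the model names, and the toy scoring function are all hypothetical stand-ins, and a thread pool plays the role of the remote GPUs. The last helper works out the hourly rate implied by the "6 min for 87 cents" figure, without assuming which hardware that run used.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class Hardware:
    """Hypothetical resource request (not Covalent's real executor class)."""
    cpus: int
    gpus: int = 0
    gpu_type: str = ""
    max_hours: float = 1.0


def remote(hw):
    """Hypothetical stand-in for a 'ship this function to hardware' decorator.

    It only records the request and runs the function locally.
    """
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            return fn(*args, **kwargs)
        wrapper.hardware = hw  # keep the request inspectable on the function
        return wrapper
    return decorate


@remote(Hardware(cpus=24, gpus=1, gpu_type="h100", max_hours=18))
def fine_tune(model_name):
    # Placeholder: a real task would train here and return a checkpoint.
    return {"model": model_name, "checkpoint": f"{model_name}-ft"}


@remote(Hardware(cpus=4))
def evaluate(tuned):
    # Placeholder metric: a deterministic toy score derived from the name.
    score = (sum(map(ord, tuned["model"])) % 100) / 100
    return {**tuned, "score": score}


def run_workflow(model_names):
    """Fan out fine-tuning, evaluate each result, and rank best-first."""
    with ThreadPoolExecutor() as pool:
        tuned = list(pool.map(fine_tune, model_names))
        scored = list(pool.map(evaluate, tuned))
    return sorted(scored, key=lambda r: r["score"], reverse=True)


def implied_hourly_rate(cost_usd, seconds):
    """Per-second billing: convert one run's cost into an hourly rate."""
    return cost_usd / seconds * 3600


models = [f"llama-variant-{i}" for i in range(20)]  # hypothetical names
ranking = run_workflow(models)
best = ranking[0]  # the candidate that would be deployed
```

In this sketch the hardware request travels with the function as metadata, so the workflow code stays an ordinary loop-and-sort script; the deployment step is left out because it depends on the serving layer. For the quoted eval run, `implied_hourly_rate(0.87, 6 * 60)` works out to roughly 8.70 dollars per hour.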

fine-tuning · covalent · gpu-orchestration

Original description

The complexity and scarcity of deploying GPUs can bring AI development to a standstill. What if there was a better way to train, fine-tune, and serve models on accelerated compute infrastructure entirely in Python? See how during this session where we fine-tune 20 Llama models without doing any infrastructure work. Startups and enterprises can finally gain unprecedented speed and agility to build, iterate, and deploy anything that they can imagine, from multi-agent, multi-modal AI applications, to digital twins for real world simulation. What used to take weeks with dozens of best-in-class engineers can now be accomplished in hours from a single notebook.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Santosh
Santosh is the Head of Product/Research at Agnostiq, where he plays a pivotal role in shaping the company's product strategy, particularly through the development of Covalent. Covalent is designed to significantly enhance the scalability and performance of next-generation AI applications and large-scale scientific simulations across multi-cloud environments. Santosh holds a Ph.D. in theoretical physics from Case Western Reserve University.