Training Albatross: An Expert Finance LLM: Leo Pekelis

1.9K views · Feb 13, 2025 · 16:20 min · Watch on YouTube ↗

Takeaway

Building finance-expert LLMs requires both domain pretraining (with membership-inference-based data curation) and long-context extension because finance work demands both depth and document length.

Summary

Leo Pekelis (Chief Scientist, Gradient) presents Albatross, Gradient's finance-domain LLM, built within the company's AI Foundry, which combines custom models with workflow primitives.
Generalist LLMs are weak on long-tail technical finance data; research shows even 176B models need thousands of relevant documents in pretraining to exceed 50% accuracy.
An automated data pipeline curates a large finance corpus and uses membership-inference techniques to filter out documents the base model has likely already seen.
The domain LLM is paired with a context-length extension (~40x growth in long-context capability over the past year) for finance applications that demand long documents.
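
The summary does not say which membership-inference technique Gradient uses, so the following is only a minimal sketch of the filtering idea, assuming a Min-K%-style score: average the log-probabilities of the least likely tokens in a document and treat a high average as weak evidence that the base model has already seen the text. The function names, the `k` fraction, the threshold, and the sample log-probabilities are all hypothetical; in practice the per-token log-probabilities would come from scoring each document with the base model.

```python
from typing import Dict, List, Sequence


def min_k_score(token_logprobs: Sequence[float], k: float = 0.2) -> float:
    """Mean log-probability of the k fraction of least likely tokens.

    A higher (less negative) score means even the hardest tokens were
    unsurprising to the model, which hints the text may have been seen.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must be non-empty")
    if not 0.0 < k <= 1.0:
        raise ValueError("k must be in (0, 1]")
    n = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]  # most negative values first
    return sum(lowest) / n


def filter_unseen(
    docs: Dict[str, Sequence[float]], threshold: float, k: float = 0.2
) -> List[str]:
    """Return ids of documents whose score is at or below the threshold.

    Those documents look unfamiliar to the base model, so they are the
    ones worth keeping for continued pretraining (hypothetical policy).
    """
    return [
        doc_id
        for doc_id, logprobs in docs.items()
        if min_k_score(logprobs, k) <= threshold
    ]


# Made-up per-token log-probabilities, for illustration only.
docs = {
    "widely_quoted_report": [-0.1, -0.2, -0.1, -0.3],
    "new_filing": [-2.5, -0.4, -3.0, -1.5],
}
kept = filter_unseen(docs, threshold=-1.0, k=0.5)
```

With these made-up numbers only `new_filing` is kept, since its hardest tokens are much less predictable than those of the other document. A real pipeline would need to calibrate the threshold on documents known to be inside or outside the base model's training data.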

domain-llm · finance · long-context

Original description

The challenge with financial agents successfully completing complex workflows like tabular reasoning or sentiment analysis often comes down to the reliability of executing numerous chained tasks together. Establishing the necessary p99s has to happen at the model level, yet most finance domain-specific LLMs rely either on pre-training alone (BloombergGPT) or on supervised fine-tuning (FinBERT).

This presentation reveals how we transformed an open-source model into Albatross (https://huggingface.co/gradientai/v-alpha-tross), capable of performing at the top of the leaderboard on chat as well as domain-specific tasks. Our journey involved an intensive data pipeline and training regimen, incorporating a combination of continual pre-training, fine-tuning, and preference optimization, to customize the model for the intricacies of financial tasks. We'll share our insights on overcoming the execution hurdle, which is often the downfall of AI projects in specialized domains.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Leo
As Gradient's Chief Scientist, Leo leads our research team. Prior to Gradient, Leo led CloudTrucks' ML and data science orgs, pioneering applied ML to operational challenges. Prior to CloudTrucks, Leo held leadership roles across Opendoor, Optimizely, and Disney. Leo holds a bachelor's degree in economics from Stanford, as well as a master's and PhD in statistics from Stanford.