
Insights from Snorkel AI running Azure AI Infrastructure: Humza Iqbal and Lachlan Ainley

208 views · Feb 08, 2025 · 20:46 min · Watch on YouTube ↗
Takeaway

Enterprise fine-tuning needs SME-in-the-loop programmatic data development and domain-specific benchmarks — and PyTorch+Horovod+NFS on Azure scales it from one node to dozens.

Summary

  • Snorkel AI (spun out of Stanford AI Lab) develops data for fine-tuning enterprise LLMs — off-the-shelf models hit 'final mile' gaps for Fortune 500 banks and insurers.
  • Research focus: SME-in-the-loop workflows that maximize the value of expert time, programmatic and auditable data development, and dynamic domain-specific benchmarks (LMSYS-style leaderboards miss industry specifics).
  • Active projects: fine-grained long-context eval (needle-in-haystack isn't enough), enterprise alignment for regulatory compliance, multimodal alignment via LVLMs generating synthetic training data.
  • Azure stack: PyTorch + Horovod for multi-node gradient sync across A100/H100 VMs sharing a single NFS for checkpoints and datasets.
  • Microsoft pitches 300+ Azure datacenters, mixed accelerators (AMD, NVIDIA, in-house Maia), and lessons from training OpenAI and Mistral models that it makes available to other customers.
fine-tuning · data · azure
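
The gradient-sync and shared-checkpoint pattern in the Azure stack bullet can be sketched in plain Python. This is an illustrative simulation, not Snorkel's code and not the Horovod API: the function names, the toy gradients, and the JSON checkpoint format are all assumptions made for the example. It shows the two ideas the summary names: every worker applies the same averaged gradient (the result an allreduce produces), and only one rank writes to the shared filesystem.

```python
import json
import os
import tempfile


def allreduce_mean(per_worker_grads):
    """Element-wise mean across workers: the value an averaging allreduce yields."""
    n_workers = len(per_worker_grads)
    return [sum(vals) / n_workers for vals in zip(*per_worker_grads)]


def sgd_step(weights, grads, lr):
    """Plain SGD update applied identically on every worker."""
    return [w - lr * g for w, g in zip(weights, grads)]


def save_checkpoint(rank, weights, step, shared_dir):
    """Only rank 0 writes, so workers do not clobber the same file on shared storage."""
    if rank != 0:
        return None
    path = os.path.join(shared_dir, f"ckpt_step{step}.json")
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"step": step, "weights": weights}, f)
    # Write to a temp name, then rename, so readers do not pick up a partial file.
    os.replace(tmp_path, path)
    return path


def simulate(shared_dir, lr=0.5):
    """One synchronous step with four simulated workers (toy, hypothetical numbers)."""
    weights = [1.0, -2.0]
    local_grads = [[0.5, 0.0], [0.25, -0.5], [0.0, 0.5], [0.25, 0.0]]
    avg = allreduce_mean(local_grads)
    # Each replica starts from the same weights and applies the same averaged gradient.
    replicas = [sgd_step(weights, avg, lr) for _ in local_grads]
    paths = [
        save_checkpoint(rank, replicas[rank], 1, shared_dir)
        for rank in range(len(replicas))
    ]
    return avg, replicas, paths


if __name__ == "__main__":
    # A temporary directory stands in for the shared NFS mount.
    with tempfile.TemporaryDirectory() as shared_dir:
        avg, replicas, paths = simulate(shared_dir)
        print(avg, replicas[0], os.path.basename(paths[0]))
```

In a real multi-node job the averaging would be done by the communication library across machines rather than by a Python list comprehension, and `shared_dir` would be the NFS mount visible to every VM.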
Original description
Join us and hear from the Snorkel AI team about their experience and lessons learned using Azure AI infrastructure powered by NVIDIA GPUs. Learn about how Snorkel Researchers were able to run experiments quickly from small projects to large-scale distributed jobs on multiple GPUs reliably and with full monitoring mechanisms.

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025

About Humza
I do research in machine learning robustness. I've developed techniques to make models more robust to distribution shift and adversarial examples as well as understanding the underpinnings of models and understanding why machine learning works. I work at a startup Securiti.ai to help enterprises understand customer data and comply with privacy regulations using machine learning.

About Lachlan
Driven IT marketing and sales professional with extensive experience in Software Applications (ERP, Business Intelligence, CRM, Data Management), Hardware (Server, Storage and Networks), and Cloud based solutions (IaaS, SaaS). 

Over 10 years experience with global systems integrator, Fujitsu, and global Software companies Nintex, Microsoft, International across Sydney, Tokyo and Seattle.

I am a team player who leads by example, taking an energetic approach to achieving successful business outcomes. A combination of strong business acumen and exceptional interpersonal skills has regularly served me well in creating a highly productive work environment for my team, regardless of culture, language or geography.

As a competitor and keen sports enthusiast, I know how to leverage team strengths to achieve collective goals and create win/win environments. My approach to decision making is analytical: I evaluate the information available and learn from collective past experiences.