AI Frontiers in Trust and Safety: Combatting Multifaceted Harm on Tinder at Scale: Vibhor Kumar

1.1K views · Dec 02, 2024 · 14:36 min

Takeaway

Fine-tuning open-source LLMs on hybrid LLM-mined + human-verified data is the practical playbook for trust-and-safety classification at consumer scale.

Summary

Tinder's Trust & Safety team handles violations ranging from minor (social handles in bios) to severe (hate speech, pig-butchering scams), and generative AI is amplifying both spam volume and deepfake catfishing.
Mitigation relies on hybrid data generation: GPT-4 mines internal data using prompts that avoid alignment refusals, and humans then review the output to catch mislabels. Fine-tuning needs only hundreds to thousands of examples.
Fine-tuned open-source LLMs (Llama 3, Mistral) beat GPT-4 few-shot on downstream T&S classification while remaining controllable and cheap to retrain when bad-actor behavior shifts.
Toolchain: Hugging Face libraries; config-driven frameworks such as Axolotl, Ludwig, and Llama Factory; and managed platforms such as H2O LLM Studio and Predibase.
Output logit distributions give classification confidence, a key advantage of running your own fine-tuned model over calling an API.
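
As an illustration of the last point, here is a minimal sketch of turning output logits into a classification confidence. It assumes the model's logits for the candidate label tokens are already available; the label names, the logit values, and the 0.9 review threshold are hypothetical and are not taken from the talk.

```python
import math

def label_confidence(label_logits):
    """Softmax restricted to the candidate label tokens.

    label_logits maps each label to the raw logit the model assigned
    to that label's token at the answer position.
    """
    top = max(label_logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - top) for label, v in label_logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical logits for a two-label violation classifier.
logits = {"violation": 2.0, "ok": 0.0}
probs = label_confidence(logits)

# Hypothetical policy: send low-confidence predictions to human review.
REVIEW_THRESHOLD = 0.9
needs_review = max(probs.values()) < REVIEW_THRESHOLD

print(probs, needs_review)
```

A hosted API that returns only the generated text gives no such distribution, so thresholding of this kind is one way a self-hosted fine-tuned model could be made easier to control.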

trust-and-safety · fine-tuning · tinder

Original description

Harassment, Hate Speech, Pig Butchering Scams, and Underage users. These are just some of the possible categories of the (very) long tail of harm on Tinder. How can we possibly train, serve, and maintain models for all of these, at global, real-time scale? We build off of pre-trained models and an increasingly mature open-source ecosystem. In this talk, we'll cover how we've dramatically accelerated our modeling pipeline with (1) human-AI hybrid dataset generation for different harm vectors, (2) automated parameter-efficient fine-tuning of open-source large language and multimodal models for violation detection, and (3) serving fine-tuned adapters efficiently in real-time and at scale using LoRAX and cascade classification.
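
The description names cascade classification without giving details. The sketch below shows one common reading of the idea, not necessarily Tinder's setup: a cheap first-stage scorer settles the confident cases, and only the uncertain ones reach a more expensive model. The scorer functions, keywords, and thresholds are all hypothetical stand-ins.

```python
def cascade_classify(text, cheap_score, expensive_score, low=0.1, high=0.9):
    """Return (label, score, stage) using a two-stage cascade.

    cheap_score and expensive_score are callables that map text to a
    violation probability in [0, 1].
    """
    p = cheap_score(text)
    if p <= low:
        return ("ok", p, "cheap")
    if p >= high:
        return ("violation", p, "cheap")
    # Uncertain band: escalate to the slower, more accurate model.
    p = expensive_score(text)
    return ("violation" if p >= 0.5 else "ok", p, "expensive")

# Hypothetical stand-ins; in practice these would be real models.
def cheap_score(text):
    t = text.lower()
    if "guaranteed returns" in t:
        return 0.95
    if "invest" in t:
        return 0.5
    return 0.02

def expensive_score(text):
    return 0.8 if "wallet" in text.lower() else 0.2

for bio in ["Love hiking and coffee",
            "Guaranteed returns, DM me",
            "I invest for fun",
            "Invest via my wallet link"]:
    print(bio, "->", cascade_classify(bio, cheap_score, expensive_score))
```

The design trade-off is that widening the band between `low` and `high` sends more traffic to the expensive stage, which raises cost and latency in exchange for fewer first-stage mistakes.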

Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule

About Vibhor
Vibhor Kumar is a computer scientist, amateur neuroscientist, and armchair philosopher of science. He likes working at the intersection of the theoretical and applied. His work has involved mapping fly brains, catching financial fraud, and generating assets of various types using AI.

He is currently a software engineer in Trust and Safety at Tinder, a hands-on advisor to AI startups including Togethr.ai, a contributor to open-source AI projects, and an angel investor.