What We Learned from Using LLMs in Pinterest — Mukuntha Narayanan, Han Wang, Pinterest

2.1K views · Jul 16, 2025 · 18:12 min
Takeaway

LLMs fine-tuned on rich pin text (including VLM captions and user-engagement annotations) improve Pinterest search relevance by 12-20% over earlier baselines and are brought to production via knowledge distillation.

Summary

  • Pinterest handles 6B+ searches/month across 45 languages and 100 countries; LLMs now score query-pin semantic relevance on a 5-point scale using a cross-encoder with an MLP head.
  • A fine-tuned 8B Llama model gives a 12% lift over multilingual BERT and a 20% lift over Pinterest's in-house SearchSAGE embedding baseline.
  • The best text representation combines pin title/description, VLM-generated image captions, and user-action features (titles of boards the pin was saved to, top-engagement queries); the last two add meaningful gains.
  • Productionization uses knowledge distillation to compress the heavyweight LLM teacher into a servable student model at Pinterest scale.
search, pinterest, fine-tuning
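
To make the scoring setup in the summary more concrete, here is a minimal sketch of how a query and a pin's text fields might be combined into one cross-encoder input and mapped to a 5-way relevance distribution by an MLP head. It is an illustration under stated assumptions, not Pinterest's implementation: the field tags (`[title]`, `[boards]`, ...), the function names, the layer sizes, and especially `toy_encoder` (a random stand-in for the fine-tuned LLM) are all hypothetical.

```python
import math
import random

NUM_LABELS = 5  # one class per point on the 5-point relevance scale


def build_pin_text(title, description, caption, board_titles, top_queries):
    """Join the pin's text fields into one string; empty fields are skipped.

    The bracketed field tags are made up for this sketch.
    """
    parts = [
        ("title", title),
        ("description", description),
        ("caption", caption),                  # VLM-generated image caption
        ("boards", "; ".join(board_titles)),   # boards the pin was saved to
        ("queries", "; ".join(top_queries)),   # queries with the most engagement
    ]
    return " ".join(f"[{name}] {text}" for name, text in parts if text)


def toy_encoder(text, dim=8):
    """Stand-in for the fine-tuned LLM: a deterministic pseudo-embedding."""
    rng = random.Random(text)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_head(dim=8, hidden_dim=4, seed=0):
    """Untrained MLP weights, only so the sketch runs end to end."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-1.0, 1.0) for _ in range(hidden_dim)] for _ in range(NUM_LABELS)]
    return w1, [0.0] * hidden_dim, w2, [0.0] * NUM_LABELS


def mlp_head(embedding, w1, b1, w2, b2):
    """Embedding -> ReLU hidden layer -> 5 logits -> class probabilities."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, embedding)) + b)
              for row, b in zip(w1, b1)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w2, b2)]
    return softmax(logits)


def score(query, pin_text, head, dim=8):
    """Cross-encoder style: query and pin text are encoded together."""
    embedding = toy_encoder(f"[query] {query} {pin_text}", dim)
    return mlp_head(embedding, *head)
```

Calling `score("standing desk", build_pin_text(...), random_head())` returns five probabilities, one per relevance level. The property that makes this a cross-encoder is that the query and the pin text pass through the encoder as a single sequence rather than being embedded separately.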
Original description
Pinterest Search integrates Large Language Models (LLMs) to enhance relevance scoring by combining search queries with rich multimodal content, including visual captions, link-based text, and user curation signals. A semi-supervised learning framework enables scaling to large and multilingual datasets, going beyond English and limited human labels. These LLM-driven models are distilled into efficient architectures for real-time serving, with experimental validation and large-scale deployment demonstrating substantial improvements in search relevance for Pinterest users worldwide.
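
The distillation step mentioned above can be sketched as a student learning from the teacher's soft labels. The example below is a toy under stated assumptions: the source does not specify the loss, so cross-entropy against the teacher's 5-way probabilities is assumed here, the teacher distribution is made up, and the student's logits are optimized directly instead of flowing through a real model's parameters.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(teacher_probs, student_logits):
    """Cross-entropy between the teacher's distribution and the student's prediction."""
    student_probs = softmax(student_logits)
    return -sum(t * math.log(s)
                for t, s in zip(teacher_probs, student_probs) if t > 0)


def distill_step(student_logits, teacher_probs, lr=0.5):
    """One gradient-descent step on the logits.

    For softmax cross-entropy, d(loss)/d(logit_i) = student_prob_i - teacher_prob_i.
    """
    student_probs = softmax(student_logits)
    return [z - lr * (s - t)
            for z, s, t in zip(student_logits, student_probs, teacher_probs)]


# Hypothetical teacher output for one unlabeled query-pin pair.
teacher_probs = [0.02, 0.03, 0.10, 0.25, 0.60]
student_logits = [0.0] * 5  # student starts with a uniform prediction

for _ in range(200):
    student_logits = distill_step(student_logits, teacher_probs)

print([round(p, 2) for p in softmax(student_logits)])
```

Because the teacher can label any query-pin pair, this kind of training does not depend on human labels for each example, which is consistent with the semi-supervised, multilingual scaling the description refers to.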


Timestamps
[00:00] Introduction to Pinterest and its search functionality.

[01:52] Overview of the Pinterest search backend architecture.

[02:29] The search relevance model.

[02:55] Key learnings from using LLMs for search relevance.

[05:04] The value of VLM-generated captions and user actions as content annotations.

[07:16] Productionizing LLMs with knowledge distillation.

[12:14] The utility of relevance-tuned LLM embeddings as general-purpose semantic representations.

[13:55] Q&A session.