Latent Space Paper Club: AIEWF Special Edition (Test of Time, DeepSeek R1/V3) - Vibhu Sapra
Takeaway
Latent Space is splitting Paper Club into a curriculum-based 'Test of Time' track (foundational papers over 6 months) alongside the existing trending-paper weekly format.
Summary
- Vibhu Sapra recaps Latent Space Paper Club: 1.5 years of weekly Wednesday-noon sessions, ~100 attendees average, 300 live for DeepSeek V3; authors from Nvidia, Meta, AI2, Amazon, Together, Writer.
- Announces 'Test of Time Paper Club' V2 — curriculum-based reading group running July-December, 24 weeks, 2-4 papers per session covering foundations (attention, optimizers, BERT, GPT-2), scaling laws (Chinchilla), distillation, inference (FlashAttention, speculative decoding), modalities (Whisper, CLIP, Stable Diffusion).
- First in-person SF section alongside continued remote sessions; existing weekly trending-paper club continues unchanged.
- Goal: cover the foundational 50-100 papers an AI engineer should know, in addition to weekly cutting-edge coverage.
paper-club ai-education community
Original description
Recorded at the AI Engineer World's Fair in San Francisco. Stay up to date on our upcoming events and content by joining our newsletter here: https://www.ai.engineer/newsletter
Timestamps:
00:00:00 Paper Club Year in Review & Future Plans
00:08:00 DeepSeek Paper Discussion
00:09:10 DeepSeek R1 (May 28th Update)
00:12:40 DeepSeek Distillation
00:16:51 Original DeepSeek Model Overview (DeepSeek V3 and R1)
00:21:15 Development of reasoning capabilities through a pure RL process
00:24:46 DeepSeek R10
00:39:05 DeepSeek R1 four-stage training pipeline
00:35:01 Emergence of "reflection moments" and "aha moments"
00:44:15 Distillation Strategy
00:52:34 Community and Call to Action