120k players in a week: Lessons from the first viral CLIP app: Joseph Nelson
Takeaway
CLIP's text/image embedding similarity enables open-ended multimodal games and apps with a fraction of the code traditional CV required.
Summary
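- paint.wtf: an AI Pictionary game built in 48 hours that drew 120k players in its first week, peaking at 7 submissions per second. GPT-3 generated the prompts, users drew them, and CLIP scored each drawing by the cosine similarity between the prompt's text embedding and the drawing's image embedding.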
- Demonstrates a new primitive: open-set understanding instead of fixed class lists. CLIP can score abstract prompts like 'a bumblebee that loves capitalism'.
- Built live in <50 lines of Python using Roboflow's open-source inference server (50k+ pretrained models on Roboflow Universe).
- Raw CLIP similarity scores spanned only ~13–45% across 200k submissions, so the UI rescaled them to 0–100 for users.
- Lessons from letting strangers send you images: submissions need content moderation, and text written inside a drawing works like a prompt-injection attack (top-scoring entries embedded the word 'tractor').
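The scoring and rescaling steps in the summary can be sketched in a few lines of Python. This is a minimal illustration, not the paint.wtf implementation: the vectors are toy stand-ins for real CLIP embeddings, the function names `cosine_similarity` and `display_score` are hypothetical, and the linear, clamped mapping from the observed ~0.13–0.45 range to 0–100 is an assumption, since the summary does not say which rescaling formula was used.

```python
import math
from typing import Sequence

# Approximate bounds of raw similarity reported in the summary (~13% to ~45%).
OBSERVED_MIN = 0.13
OBSERVED_MAX = 0.45


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def display_score(raw: float,
                  lo: float = OBSERVED_MIN,
                  hi: float = OBSERVED_MAX) -> float:
    """Map a raw similarity from [lo, hi] onto [0, 100], clamping outliers.

    Assumption: a simple linear (min-max) rescale; the original app may
    have used a different mapping.
    """
    scaled = (raw - lo) / (hi - lo) * 100.0
    return max(0.0, min(100.0, scaled))


if __name__ == "__main__":
    # Toy 4-dimensional vectors standing in for CLIP embeddings.
    prompt_embedding = [0.2, 0.1, 0.9, 0.3]
    drawing_embeddings = {
        "player_a": [0.1, 0.0, 0.8, 0.4],
        "player_b": [0.9, 0.2, 0.1, 0.0],
    }

    # Rank drawings by similarity to the prompt, best first.
    ranked = sorted(
        drawing_embeddings.items(),
        key=lambda item: cosine_similarity(prompt_embedding, item[1]),
        reverse=True,
    )
    for player, embedding in ranked:
        raw = cosine_similarity(prompt_embedding, embedding)
        print(f"{player}: raw={raw:.3f} shown={display_score(raw):.0f}")
```

In a real app the two embeddings would come from CLIP's text and image encoders. The clamp matters because a submission can fall outside the historically observed range, and the displayed score should still stay within 0–100.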
Tags: clip, multimodal, roboflow
Original description
When OpenAI released CLIP, the Roboflow team built an AI Pictionary game called paint.wtf. Players were given a prompt like “a giraffe in the arctic” and drew their own depictions of it. CLIP judged which drawing's image embedding most closely matched the prompt's text embedding. Over 120,000 players played in the first week, peaking at 7 submissions per second. Fast forward, and multimodal apps are ready. Come learn the trials (strangers on the internet submitting drawings) and successes (infra scaled without outage) of building with foundation models.
Recorded live in San Francisco at the AI Engineer Summit 2023. See the full schedule of talks at https://ai.engineer/summit/schedule & join us at the AI Engineer World's Fair in 2024! Get your tickets today at https://ai.engineer/worlds-fair
About Joseph Nelson
Joseph is Co-founder/CEO at Roboflow, which makes tools that over 250,000 developers use to build better computer vision models, faster. Roboflow is backed by Y Combinator, Craft Ventures, Floodgate, and the founders of OpenAI, among others. He previously co-founded and sold an NLP company that sorted the US Congress's mail, and he worked at Facebook. Joseph learned to code by writing programs for TI-84 calculators.