🔬 Research

Frontier research talks — new architectures, training techniques, theoretical insights, paper deep-dives.

4 videos · agi · world-models · code-generation · meta-fair · rl · environments

The workflow

flowchart LR
    A[Open problem] --> B[Hypothesis<br/>+ experiment design]
    B --> C[Run + ablations]
    C --> D[Compare to<br/>strong baselines]
    D --> E{Holds up?}
    E -->|No| B
    E -->|Yes| F[Write-up +<br/>code release]

This is the cutting edge: work shown here is usually 6-18 months ahead of production.

Key takeaways

Meta's Code World Model predicts program execution traces as an autoregressive sequence so agents can imagine outcomes before running code.

Scaling RL is now a talent and tooling problem; opening up RL environments and infra is how Prime Intellect plans to widen the researcher pool.

AGI's hardest problems (memory, alignment, deception, idioms, hive-mind) map nicely onto sci-fi memes — and graph-based grounding is one tool worth taking seriously.

AGI measurement needs interactive game-based benchmarks with hidden test sets so model intelligence can't be confused with memorized training data or developer-injected priors.

Videos (4)

Code World Model: Building World Models for Computation – Jacob Kahn, FAIR Meta

Meta's Code World Model predicts program execution traces as an autoregressive sequence so agents can imagine outcomes before running code.

11.6K views · Dec 17, 2025
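To make "execution trace" concrete, here is a minimal sketch of recording one for a toy function with Python's standard `sys.settrace` hook. It is an illustration only and does not reflect the trace format or tooling used by Meta's Code World Model; the names `record_trace` and `total` are hypothetical.

```python
import sys


def record_trace(func, *args):
    """Run func(*args) and record one (line offset, locals snapshot) pair per executed line."""
    steps = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only follow the function we were asked to trace.
        if frame.f_code is not code:
            return None
        if event == "line":
            # The offset is relative to the `def` line; locals are captured
            # just before the line executes.
            offset = frame.f_lineno - code.co_firstlineno
            steps.append((offset, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always remove the hook, even on error
    return result, steps


def total(n):
    acc = 0
    for i in range(n):
        acc += i
    return acc


result, steps = record_trace(total, 3)
for offset, local_vars in steps:
    print(offset, local_vars)
```

Each printed row is one step of the program: which line is about to run and what the local variables hold at that point. A model that predicts such a sequence token by token is, loosely, predicting what the code will do without running it.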

RL Environments at Scale – Will Brown, Prime Intellect

Scaling RL is now a talent and tooling problem; opening up RL environments and infra is how Prime Intellect plans to widen the researcher pool.

9.6K views · Dec 09, 2025

Top Ten Challenges to Reach AGI — Stephen Chin, Andreas Kollegger

AGI's hardest problems (memory, alignment, deception, idioms, hive-mind) map nicely onto sci-fi memes — and graph-based grounding is one tool worth taking seriously.

842 views · Jul 22, 2025

Measuring AGI: Interactive Reasoning Benchmarks for ARC-AGI-3 — Greg Kamradt, ARC Prize Foundation

AGI measurement needs interactive game-based benchmarks with hidden test sets so model intelligence can't be confused with memorized training data or developer-injected priors.

491 views · Jul 16, 2025