

Genie: Generative Interactive Environments
Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar et al.
Continuing our world models track with Leo Pinetzki, this session takes a big step from MuZero's task-specific learned models toward something more ambitious: world models as generative engines.
Our paper is Genie: Generative Interactive Environments (Bruce et al., 2024).
Where MuZero learned an internal model to plan within a single game, Genie learns to generate entire interactive environments from scratch. Trained on 30,000 hours of unlabelled Internet videos of 2D platformer games, filtered from a pool of over 200,000 hours of footage, this 11B-parameter model can take a text prompt, a photo, or even a sketch and produce a playable virtual world, frame by frame, responding to user actions.
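To make "playable, frame by frame" concrete, here is a toy Python sketch of the interaction loop the paper describes: a prompt image is encoded into discrete tokens, then at each step the user picks one of a small set of discrete latent actions and a dynamics model predicts the next frame's tokens, which are decoded back to pixels. Genie's code is not public, so all class names and the stub implementations below are ours for illustration only; they mimic the interfaces, not the real models.

```python
# Illustrative sketch of Genie-style frame-by-frame interaction.
# The paper describes a video tokenizer, a latent action model (used during
# training to infer discrete actions from video), and a dynamics model that
# predicts next-frame tokens. Everything below is a hypothetical stand-in.

import numpy as np

NUM_LATENT_ACTIONS = 8  # the paper uses a small discrete latent action vocabulary


class ToyTokenizer:
    """Stand-in for the video tokenizer: maps frames <-> discrete token grids."""

    def encode(self, frame: np.ndarray) -> np.ndarray:
        return (frame.mean(axis=-1) > 0.5).astype(np.int64)  # toy quantisation

    def decode(self, tokens: np.ndarray) -> np.ndarray:
        return np.repeat(tokens[..., None].astype(np.float32), 3, axis=-1)


class ToyDynamics:
    """Stand-in for the dynamics model: next-frame tokens from history + action."""

    def predict(self, token_history, latent_action: int) -> np.ndarray:
        rng = np.random.default_rng(latent_action)
        return rng.integers(0, 2, size=token_history[-1].shape)  # toy prediction


class GenieSession:
    """Interactive rollout: prompt with one image, then step with latent actions."""

    def __init__(self, tokenizer, dynamics, first_frame: np.ndarray):
        self.tokenizer = tokenizer
        self.dynamics = dynamics
        self.token_history = [tokenizer.encode(first_frame)]

    def step(self, latent_action: int) -> np.ndarray:
        assert 0 <= latent_action < NUM_LATENT_ACTIONS
        next_tokens = self.dynamics.predict(self.token_history, latent_action)
        self.token_history.append(next_tokens)
        return self.tokenizer.decode(next_tokens)  # next frame, as pixels


# Prompt with any image; a photo or sketch would be encoded the same way.
session = GenieSession(ToyTokenizer(), ToyDynamics(), np.random.rand(64, 64, 3))
frame = session.step(latent_action=3)  # the user "plays" one action per frame
```

The design point the sketch surfaces is that the controller is a learned, discrete latent action space inferred from video alone, not ground-truth game inputs; the user steers the generated world through those latent actions.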
This is a very different flavour of world model from MuZero, and the contrast is productive: Is a model that generates plausible-looking next frames really "understanding" dynamics, or just doing very good video prediction? What's gained and lost when you move from compact latent models to full pixel-level generation? And how far is this approach from being useful outside of 2D games?
