Researchers solved the Context Window Limit
Context windows cap what you can reliably ask an LLM to reason over—and as inputs grow, “context rot” can make quality drop fast. This video breaks down an MIT paper proposing recursive language models: a way to process arbitrarily long prompts at inference time without changing the core model.
Key takeaways
- Covers why stuffing more tokens into a prompt can degrade retrieval and reasoning, even before hitting the physical limit.
- Walks through the RLM setup: storing the long prompt as a variable in a Python REPL environment and giving the model tools to inspect and search it.
- Explains the “recursive” step—re-querying relevant sections to go deeper without summarization or compression.
- Reviews how the approach is evaluated on long-context tasks (e.g., BrowseComp+, Oolong, code repository understanding) and what tradeoffs show up in cost variance.
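The setup described above can be sketched in a few lines of Python. This is a toy illustration, not the paper's actual implementation: the class and method names (`PromptEnv`, `peek`, `grep`, `subquery`) are hypothetical stand-ins for whatever tools the real system exposes. The point is the shape of the idea: the long prompt lives as data in a REPL-like environment, the model queries it through tools instead of reading it whole, and the recursive step spawns a fresh environment over just a relevant slice rather than summarizing it.

```python
import re

class PromptEnv:
    """Toy stand-in for the RLM idea: the long prompt is stored as a
    variable in a REPL-like environment, and the model interacts with it
    through tools. All names here are illustrative, not the paper's API."""

    def __init__(self, prompt: str):
        self.prompt = prompt
        self.lines = prompt.splitlines()

    def peek(self, start: int, end: int) -> str:
        # Return a small window of lines, so the model never has to
        # hold the entire prompt in its context at once.
        return "\n".join(self.lines[start:end])

    def grep(self, pattern: str) -> list[tuple[int, str]]:
        # Find candidate regions worth recursing into.
        return [(i, ln) for i, ln in enumerate(self.lines) if re.search(pattern, ln)]

    def subquery(self, start: int, end: int) -> "PromptEnv":
        # The "recursive" step: open a fresh environment over just the
        # relevant slice, with no summarization or compression.
        return PromptEnv("\n".join(self.lines[start:end]))

# A long "prompt" of 1001 lines, with the needle on the last one.
doc = "\n".join(f"line {i}: filler" for i in range(1000)) + "\nline 1000: the answer is 42"
env = PromptEnv(doc)
hits = env.grep(r"answer")                       # locate a relevant region
sub = env.subquery(hits[0][0], hits[0][0] + 1)   # recurse into it
print(sub.prompt)
```

In a real RLM, `grep`/`subquery` calls would be issued by the model itself inside the REPL, and the recursion could invoke another model call on the slice; here the slice is just returned directly to keep the sketch self-contained.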