Claude Code’s 1M-token memory can backfire without session discipline

Power users say Claude Code’s million-token context enables longer autonomous workflows—but also invites “context rot” as logs and dead ends pile up. The fix is active session management: rewind, compact with guidance, start fresh, or offload noisy work to subagents.


TL;DR

  • 1M-token context enables long Claude Code runs but raises the risk of context pollution; larger windows shift failure modes rather than eliminating them
  • Context rot: performance can degrade around 300–400k tokens, as tool output, file reads, and dead ends accumulate
  • Session-shaping tools: Continue, /rewind (Esc Esc), /clear, /compact, subagents; each turn is a branching decision point
  • Workflow rule: new task generally means new session; adjacent tasks may reuse context to avoid re-reading files
  • Rewind habit: remove failed reasoning by rewinding after relevant setup, then re-prompt; rewind remains a prompt cache hit
  • Compaction guidance: compacts are lossy and can degrade late; compact proactively and “give it a hint” on summary focus

Claude Code power users are learning that a million-token context window cuts both ways. In a thread on X, Thariq outlined how Claude Code sessions can grow into a real asset for longer, more autonomous work—and just as easily drift into “context pollution” without deliberate session management.

The key point: a larger context window doesn’t remove the need to manage context. It mostly changes the failure modes. Instead of hard-stopping early, long sessions can quietly degrade as more tool output, file reads, and dead-end attempts accumulate.

Context, compaction, and “context rot”

In the thread, Thariq notes that for the “1MM context model,” some level of context rot can show up around ~300–400k tokens, while emphasizing that it depends on the task.

When a session approaches the context limit (a hard cutoff), Claude Code relies on compaction—summarizing the session into a smaller description to continue in a fresh window. Compaction can also be triggered manually.

Every turn is a branching point

A practical framing in the post is that once Claude finishes a turn, the next message is a decision point—because each additional “continue” adds more weight (and more noise) to the working set.

Thariq lists a set of alternatives to simply continuing:

  • Continue in the same session
  • /rewind (Esc Esc) to jump back to a previous message and drop everything after it from context
  • /clear to start a new session with a distilled brief
  • Compact to replace the full transcript with a summary
  • Subagents to delegate work to a fresh context and bring back only the result

The emphasis isn’t that any one option is always best, but that these are tools for shaping what stays in the model’s “line of sight.”
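As an illustrative sketch only (not from the thread and not how Claude Code is implemented), the per-turn choice can be phrased as a heuristic over estimated token usage and task boundaries. All names and thresholds below are assumptions:

```python
# Illustrative heuristic only: the thresholds and action names mirror the
# options listed above, but none of this is Claude Code's actual logic.

def next_step(tokens_used: int, new_task: bool, failed_attempt: bool,
              noisy_exploration: bool, window: int = 1_000_000) -> str:
    """Pick a session-shaping action for the next turn."""
    if new_task:
        return "/clear"      # new task generally means new session
    if failed_attempt:
        return "/rewind"     # drop the dead end, re-prompt with what was learned
    if noisy_exploration:
        return "subagent"    # keep raw tool output out of the parent session
    if tokens_used > 0.3 * window:
        return "/compact"    # ~300-400k is where the thread says rot can appear
    return "continue"
```

The ordering encodes the thread's priorities: task boundaries and dead ends are handled before raw context size ever becomes the deciding factor.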

When to start fresh vs. when to keep going

A rule of thumb offered in the thread: a new task should generally mean a new session. The gray area is adjacent work where reusing context may be efficient—documentation for a feature that was just implemented is given as an example. In that case, keeping the session can avoid re-reading the same files, at the cost of carrying extra context that may not matter for the doc-writing step.

Rewind as a “good habit” for correction

If there’s one behavior Thariq calls out as a signal of strong context management, it’s rewind.

Rather than stacking corrective instructions on top of a failed attempt (“that didn’t work, try X instead”), Thariq suggests rewinding to just after relevant file reads or setup steps, then re-prompting with what was learned. That trims away the dead-end reasoning and output that would otherwise linger in context.

In a follow-up reply, Thariq also says rewind remains a prompt cache hit, addressing a concern about whether rewinding a long debugging conversation would cause a cache miss.

Why compacts go bad—and how to steer them

Compaction is described as lossy: it keeps the session moving, but it requires trusting Claude to decide what matters. Thariq’s explanation for “bad compacts” centers on predictability: if the model can’t infer where the work is headed, it may summarize the wrong things—especially when an autocompact fires after a long detour (like debugging), and the next step pivots to a different thread.

Two notable points from the thread:

  • Because of context rot, the model may be at its least sharp when compacting late in a long session.
  • With a 1M window, there’s more room to compact proactively and give guidance, such as focusing the summary on a specific refactor and dropping irrelevant debugging.

When asked about best practice for /compact, Thariq’s short prescription was: give it a hint.
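In practice, that hint can be a short focus instruction passed along with the command. The wording below is a hypothetical example, not taken from the thread:

```
/compact Focus on the payments refactor: keep the file list, the agreed API
changes, and open TODOs. Drop the database-debugging detour.
```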

Subagents as a way to avoid dragging tool output around

For work that generates lots of intermediate output that won’t be needed later, Thariq recommends subagents. The thread describes subagents as running in their own clean context window, returning only a synthesized result to the parent session.

A simple heuristic is provided: will the work require the raw tool output later, or just the conclusion? If it’s the conclusion, it’s a subagent candidate.
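The subagent pattern itself can be sketched in a few lines: noisy intermediate work happens in an isolated scope, and only a synthesized result crosses back to the caller. This is a toy illustration; none of the names come from Claude Code's actual subagent API:

```python
# Hypothetical sketch of the subagent pattern: do the search-heavy work in
# an isolated "context" and return only the conclusion to the parent.

def run_subagent(task: str, search_corpus: list[str]) -> str:
    """Stands in for a fresh context window doing grep-heavy work."""
    hits = [line for line in search_corpus if task in line]
    # The raw hits stay local; only the synthesis crosses back.
    return f"{len(hits)} match(es) for {task!r}"

corpus = ["def parse_config():", "parse_config is called in main", "unrelated"]
summary = run_subagent("parse_config", corpus)
# The parent session sees only `summary`, never the raw hit list.
```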

This also comes up in response to a request for “surgical” removal of unhelpful tool-call tokens: Thariq notes that removing tokens from a transcript can confuse Claude, and points to subagents as the answer for search-heavy workflows.

Open questions from the thread

The replies read like a checklist of what developers still want from long-context coding tools:

  • A way to see what percentage of the context window is left
  • More control over selectively removing unwanted context (without full compaction)
  • Desktop feature parity questions (for example, /rewind availability)
  • Discussion of LLM cache management and prompt caching mechanics

The underlying theme is consistent: context management is becoming a first-class part of AI-assisted coding workflows, especially as session length stops being the limiting factor.

Original source: https://x.com/trq212/status/2044548257058328723
