A simple “fresh eyes” prompt can make AI reviews tougher

A recent post by Theodore Ts’o explores an “adversarial review” prompt that pushes agentic systems to scrutinize their own work more skeptically. By using separate subagents and a competitive framing, it can surface more issues than typical self-checks.

TL;DR

  • “Look at this again with fresh eyes”: Simple phrase that often triggers more skeptical re-checking of completed work
  • Self-review limitations: Conflicting goals when a system judges its own output can make reviews unreliable
  • Adversarial review: Use a separate subagent to examine the work instead of the original agent
  • Cross-model reviews: Claude for coding paired with Codex for review; different models can improve scrutiny
  • Competitive prompt pattern: Two subagents compete; “five points” to whoever finds the most serious issues
  • Stronger incentives: “Cookie” rewards or explicit targets (e.g., “16 significant problems”) can push deeper review

A post on blog.fsck.com outlines a simple prompt for getting agentic systems to review their own work with something closer to skeptical scrutiny. The author points to the phrase “Look at this again with fresh eyes” as a surprisingly effective way to encourage an agent to re-check completed work.

The post argues that self-review is often unreliable because the system is asked to judge its own output, which creates conflicting incentives. To get around that, it recommends “adversarial review”: a separate subagent examines the work, rather than the original agent checking itself.

The setup gets more interesting when a different model is involved. The post mentions that people coding with Claude often use Codex for review tasks, and suggests that even without multiple providers, a competitive prompt can nudge models to dig harder for problems.

One example prompt asks for two subagents to review the work and awards “five points” to whoever finds the most serious issues. The author also notes that a “cookie” can apparently work just as well, and that an explicit challenge like “I’ll be disappointed if they don’t find at least 16 significant problems” can sometimes push the review further.
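As a rough illustration only, the competitive pattern could be wrapped in a small helper like the one below. The wording and function name are assumptions for the sake of the sketch, not the author’s exact prompt.

```python
# Hypothetical sketch of the competitive review pattern described above.
# The prompt wording here is illustrative, not the author's exact phrasing.

def adversarial_review_prompt(work: str) -> str:
    """Build a prompt asking two subagents to compete at finding issues."""
    return (
        "Spin up two subagents and have each look at the work below "
        "with fresh eyes. Award five points to whichever subagent "
        "finds the most serious issues.\n\n"
        "--- WORK UNDER REVIEW ---\n"
        f"{work}"
    )

print(adversarial_review_prompt("def add(a, b): return a - b"))
```

The completed work is inlined into the prompt so the reviewing subagents see it verbatim; swapping in a stronger incentive (“a cookie”, or an explicit target number of problems) would just change the reward sentence.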

Source: blog.fsck.com
