Graphite Diamond Outshines Noisy Async Coding Agents, Jessie Frazelle Finds

Jessie Frazelle’s Agentic Engineering session tested async coding agents on real production code. Most added noisy, low-value comments; Graphite Diamond stood out by interjecting only on substantive issues, boosting usefulness while still requiring human verification.

TL;DR

  • Most async agents generate low-value noise; Copilot and Sourcery noted for commenting on every PR, producing hallucinated summaries, and sometimes editing reviewer comments.
  • Graphite Diamond: doesn't comment unless it finds substantive issues; flagged real bugs across Rust, C++, Python, and TypeScript (math bugs in CAD, subtle logic errors, DSL variable mismatches); configurable prompts and CI-friendly behavior. https://diamond.graphite.dev/
  • Trust, but verify: suggestions treated as a second pair of eyes, not a replacement; inspect outputs for high-stakes subsystems (geometry solvers, manufacturing instruction generators) and unfamiliar languages.
  • Practical takeaway: async agents are not a panacea; value comes from restraint, meaningful interventions, and careful integration into workflows.
  • Further resources: full session recording on YouTube; the Agentic Engineering series; original post on the Zed blog.

Jessie Frazelle’s recent Agentic Engineering session evaluated a range of async coding agents against real production code, and the standout was Graphite Diamond — not because it talked the most, but because it spoke only when it had something substantive to add.

Most async agents just add noise

The session began with a familiar problem: many async agents generate low-value output that clutters review workflows. Tools like Copilot and Sourcery were noted for commenting on every PR regardless of usefulness, producing auto-summaries that sometimes hallucinate, and occasionally editing original reviewer comments in ways that felt invasive. Verbose analyses or generated flow diagrams often appeared authoritative while missing critical details—raising the question of whether those artifacts could be trusted without careful human verification.

The one that worked

The clear exception was Graphite Diamond. Jessie described it as “more signal than noise”: it doesn't comment unless it has something real. In practice, Diamond flagged real issues across Rust, C++, Python, and TypeScript codebases—catching math bugs in CAD code, subtle logic errors, and variable mismatches in a custom DSL. The team’s reaction was telling: the agent often ran quietly and only revealed itself when it found a legitimate concern. Customizable system prompts and CI-friendly behavior (for example, surfacing product notices in CI rather than spamming PR threads) contributed to its fit within an existing workflow.
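To make "math bugs in CAD code" concrete, here is a hypothetical Rust sketch (not an example from the session; the names and scenario are invented) of the kind of subtle error a quiet reviewer is valuable for catching: a rotation helper that feeds degrees to trig functions that expect radians.

```rust
// Hypothetical illustration: a 2D rotation helper with a subtle unit bug.
// Callers pass the angle in degrees, but `sin`/`cos` expect radians,
// so every rotated point is silently wrong. In a large CAD diff this is
// easy to miss and expensive to debug later.

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: f64,
    y: f64,
}

/// Rotate a point around the origin by `angle_degrees`.
fn rotate(p: Point, angle_degrees: f64) -> Point {
    // BUG: the angle is used as-is; it should be converted first,
    // e.g. `let theta = angle_degrees.to_radians();`
    let theta = angle_degrees;
    Point {
        x: p.x * theta.cos() - p.y * theta.sin(),
        y: p.x * theta.sin() + p.y * theta.cos(),
    }
}

fn main() {
    let p = Point { x: 1.0, y: 0.0 };
    // A 90-degree rotation should land on (0, 1); the unit mix-up
    // produces a very different point instead.
    println!("{:?}", rotate(p, 90.0));
}
```

A review agent that stays silent except for this class of defect fits the "more signal than noise" description far better than one that restates the diff.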

Trust, but verify

Adoption came with caveats. Jessie emphasized a conservative approach: suggestions are treated as a second pair of eyes, not a replacement for human review. This matters particularly in high-stakes areas—geometry solvers, manufacturing instruction generators, and other components where silent math errors can be costly to debug or dangerous if shipped. While trust in the agent increased over time, every suggestion is still inspected, especially in unfamiliar languages or complex subsystems.
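One lightweight way to practice "trust, but verify" on an agent's suggestion in a geometry path is to pin the behavior with a small invariant test before merging. The sketch below is hypothetical (not from the session) and would run with `cargo test` in a library crate: it checks that rotating a point by an angle and then by its negation round-trips within a tolerance, which would immediately expose a unit or sign error.

```rust
// Hypothetical sketch: verify an agent-suggested change to a rotation
// helper by asserting a round-trip invariant instead of trusting the
// suggestion on sight.

fn rotate(x: f64, y: f64, radians: f64) -> (f64, f64) {
    (
        x * radians.cos() - y * radians.sin(),
        x * radians.sin() + y * radians.cos(),
    )
}

#[cfg(test)]
mod tests {
    use super::rotate;

    #[test]
    fn rotation_round_trips() {
        let (x, y) = (3.0_f64, -2.5_f64);
        let theta = 0.7_f64;
        // Rotating forward and then backward should land (numerically)
        // back on the original point.
        let (rx, ry) = rotate(x, y, theta);
        let (bx, by) = rotate(rx, ry, -theta);
        assert!((bx - x).abs() < 1e-9, "x drifted: {bx} vs {x}");
        assert!((by - y).abs() < 1e-9, "y drifted: {by} vs {y}");
    }
}
```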

The right role for AI

The session’s practical conclusion is modest and pragmatic: async agents are not a panacea, and many remain overconfident or clumsy. When designed for restraint and integrated thoughtfully, however, an agent can quietly amplify a team’s ability to catch things that humans sometimes miss. The most valuable agents are those that prioritize meaningful intervention over constant commentary.

Further resources

Original source: https://zed.dev/blog/up-your-async-game-jessie-frazelle
