Cursor Bugbot hits nearly 80% AI review resolution rate

Cursor has just rolled out learned rules for Bugbot, turning real PR feedback into continuously updated review behavior. The company says Bugbot’s suggestions are now addressed nearly 80% of the time before merge, up from 52% at launch.

TL;DR

  • Bugbot resolution rate: Increased from 52% (July 2025 launch) to ~80%, based on 50,310 PRs
  • Metric definition: “Resolution rate” tracks whether a review comment gets addressed before the PR merges, as assessed by an LLM judge
  • Public-repo benchmark: Bugbot 78.13% vs Greptile 63.49%, CodeRabbit 48.96%, Copilot 46.69%
  • Signal vs volume: Resolution rate is rising even as Bugbot reports more bugs, so the gain does not come from simply commenting less
  • Learned rules (beta): Live learning from PR traffic; 110,000+ repos enabled, producing 44,000+ learned rules
  • Rule lifecycle and signals: Rules are learned from reactions, replies, and human-reviewer comments; they can be promoted or disabled automatically and edited in the UI; a backfill over recent PRs is available at https://cursor.com/dashboard/bugbot/repository-rules

Cursor’s Bugbot is showing a notably higher “signal-to-noise” ratio in AI code review, at least by Cursor’s chosen yardstick: whether a comment actually gets addressed before a PR merges. In the company’s latest update, Bugbot’s resolution rate has climbed from 52% at its July 2025 launch to nearly 80% today, based on an analysis of 50,310 PRs.

That improvement matters because it frames a persistent complaint about AI review assistants—too many findings that teams ignore as false positives—as something measurable and, more importantly, optimizable.

Measuring “resolution rate” across AI code review tools

Cursor compared several AI code review products using public repositories only. The method: for each comment produced by a given tool, Cursor checked whether it was addressed by the time the PR merged, using an LLM judge to evaluate the before/after state.
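The metric described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cursor's implementation: the `ReviewComment` shape, the `Judge` signature, and `naive_judge` are all hypothetical, and the sketch assumes the rate is the fraction of a tool's comments that the judge marks as addressed. A real judge would be an LLM call comparing the code before and after; here it is stubbed with a trivial text comparison.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ReviewComment:
    """One comment left by a review tool on a merged PR (hypothetical shape)."""
    tool: str
    body: str
    code_at_comment: str  # relevant code when the comment was posted
    code_at_merge: str    # the same code when the PR merged


# Hypothetical judge signature: returns True if the comment was addressed.
Judge = Callable[[ReviewComment], bool]


def resolution_rate(comments: Iterable[ReviewComment], judge: Judge) -> dict[str, float]:
    """Return, per tool, the fraction of comments the judge marks as addressed."""
    addressed: dict[str, int] = {}
    total: dict[str, int] = {}
    for comment in comments:
        total[comment.tool] = total.get(comment.tool, 0) + 1
        if judge(comment):
            addressed[comment.tool] = addressed.get(comment.tool, 0) + 1
    return {tool: addressed.get(tool, 0) / n for tool, n in total.items()}


def naive_judge(comment: ReviewComment) -> bool:
    """Stand-in for an LLM judge: treats any change to the code as 'addressed'."""
    return comment.code_at_comment != comment.code_at_merge
```

Passing the judge in as a callable keeps the counting logic separate from the judgment itself, which is the part that is hardest to get right and the part the source attributes to an LLM.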

In Cursor’s dataset, Bugbot leads on resolution rate:

  • Cursor Bugbot: 78.13% (50,310 PRs analyzed)
  • Greptile: 63.49% (11,419 PRs)
  • CodeRabbit: 48.96% (33,487 PRs)
  • GitHub Copilot: 46.69% (24,336 PRs)
  • Codex: 45.07% (19,384 PRs)
  • Gemini Code Assist: 30.93% (21,031 PRs)

Cursor also notes Bugbot’s resolution rate is increasing even as it finds more bugs, suggesting the system is not improving merely by staying quiet.

From offline tuning to live learning

Cursor says that until now, Bugbot’s gains came from offline experiments: change the system, test whether the resolution rate improves, and ship the change if it does.

The shift announced here is an attempt to use what Bugbot already has in abundance—real PR traffic. Bugbot reviews hundreds of thousands of PRs per day, and each merge creates a feedback trail that can be translated into iterative improvements.

The mechanism Cursor is rolling out is learned rules (now in beta). Cursor says more than 110,000 repos have enabled learning so far, generating more than 44,000 learned rules.

How learned rules work

Learned rules are framed as additional instructions that customize Bugbot’s behavior—helping it focus on specific issues, business context, and patterns—based on signals from real reviews. Cursor highlights three key signal sources:

  1. Reactions to Bugbot comments, where negative reactions (like downvotes) indicate the finding wasn’t useful.
  2. Replies to Bugbot comments, where developers explain what was wrong or how a suggestion should be improved.
  3. Comments from human reviewers, which can flag issues Bugbot missed.

Bugbot turns these into candidate rules, evaluates them on incoming PRs, and can promote them to active status when evidence accumulates. If an active rule starts generating consistent negative signals, Bugbot can disable it. Rules are also editable or removable directly in the UI.
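The lifecycle described above (candidate, then active, then possibly disabled) can be pictured as a small state machine. The sketch below is a guess at the general shape only: the class and field names are hypothetical, and the thresholds `PROMOTE_AFTER` and `DISABLE_AFTER` are invented for illustration, since Cursor does not say how much evidence is needed to promote or disable a rule.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical thresholds; the source does not state real values.
PROMOTE_AFTER = 3  # supporting signals needed to activate a candidate rule
DISABLE_AFTER = 3  # consecutive negative signals that disable an active rule


class Status(Enum):
    CANDIDATE = "candidate"
    ACTIVE = "active"
    DISABLED = "disabled"


@dataclass
class LearnedRule:
    text: str
    status: Status = Status.CANDIDATE
    positive: int = 0          # supporting signals seen so far
    negative_streak: int = 0   # consecutive negative signals

    def record(self, helpful: bool) -> None:
        """Update the rule with one review signal and apply any transition."""
        if self.status is Status.DISABLED:
            return  # disabled rules no longer collect signals in this sketch
        if helpful:
            self.positive += 1
            self.negative_streak = 0
        else:
            self.negative_streak += 1
        if self.status is Status.CANDIDATE and self.positive >= PROMOTE_AFTER:
            self.status = Status.ACTIVE
        elif self.status is Status.ACTIVE and self.negative_streak >= DISABLE_AFTER:
            self.status = Status.DISABLED
```

Tracking a streak of negative signals, instead of a running total, is one way to model "consistent" negative feedback; a real system could just as plausibly use a rolling window or a ratio.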

Enabling learning and backfilling recent PRs

Learned rules can be managed in the Cursor Dashboard, which also offers the option to run a backfill across recent PRs to seed early rule formation. The feature is covered in Cursor’s Bugbot docs as well.

Source: Bugbot’s Improved Resolution Rate and Live Learning
