In a post on X, Paweł Huryn claimed that a model priced at 2x per token can end up "4x cheaper" when the task is finding bugs rather than answering questions. His numbers come from 60 metered audits of Fable 5 versus Opus 4.8, run against the same three files and billed from Claude Code session traces.
Huryn’s graphic, branded "FABLE 5 • AUDIT ECONOMICS • $122 OF API SPEND, METERED," puts the comparison on public API list prices and says both models were run at matched effort, "xhigh," with validation to the cent against CLI billing totals. The reported all-in spend was $122.27.
On a per-audit basis, Fable 5 came out higher: a median of $2.93 versus $1.17 for Opus 4.8. That is a "2.5x" gap, wider than the 2x sticker price would suggest. Huryn attributes part of the difference to Fable writing more. The chart’s fine print also notes that Fable’s cheapest run was $2.04, while Opus’s most expensive was $1.61, so the two cost ranges do not overlap.
The economics narrow when the unit shifts to findings. Huryn reports a median of 14 distinct findings per Fable report versus 7 for Opus, putting the cost at about $0.21 per finding for Fable and $0.17 for Opus. The post cautions that findings are not interchangeable units, and the counts were parsed from numbered report items.
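The per-finding figures can be reproduced with simple division. The sketch below is only an illustration: it assumes the per-finding cost is the median cost per audit divided by the median findings per report, which matches the reported numbers but is not confirmed as Huryn's method. The variable and function names are made up for this example.

```python
# Median figures as reported in the post.
median_cost = {"Fable 5": 2.93, "Opus 4.8": 1.17}    # USD per audit
median_findings = {"Fable 5": 14, "Opus 4.8": 7}     # distinct findings per report


def cost_per_finding(cost: float, findings: int) -> float:
    """Assumed method: median cost per audit divided by median findings."""
    return cost / findings


per_finding = {
    model: round(cost_per_finding(median_cost[model], median_findings[model]), 2)
    for model in median_cost
}
print(per_finding)  # {'Fable 5': 0.21, 'Opus 4.8': 0.17}
```

Dividing one median by another is a rough shortcut. A per-audit ratio computed from the raw traces could come out differently.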
The strongest divergence appears in a planted bug that spanned two files: a contradiction between a 50K cap and a 75K cap. Fable caught that issue in 20 of 30 audits, while Opus caught it in 2 of 30. Huryn estimates the expected spend to surface it once at $4.40 for Fable and $17.55 for Opus. That is where the "4x cheaper" figure comes from: Fable's expected cost per catch is roughly a quarter of Opus's.
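The expected-spend numbers are consistent with dividing the median cost per audit by the catch rate. The sketch below makes that arithmetic explicit. It assumes independent audits with a fixed catch probability, so the expected number of runs until the first catch is runs divided by hits. This is an illustrative reconstruction with a hypothetical helper name, not necessarily the calculation used in the post.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def expected_spend_per_catch(cost_per_audit: Decimal, hits: int, runs: int) -> Decimal:
    """Expected spend to surface the bug once.

    Assumes each audit is an independent trial with catch rate hits/runs,
    so the expected number of audits until the first catch is runs/hits.
    """
    if hits <= 0:
        raise ValueError("catch rate is zero; expected spend is unbounded")
    return cost_per_audit * runs / hits


# Median cost per audit and catch counts as reported in the post.
fable = expected_spend_per_catch(Decimal("2.93"), hits=20, runs=30)
opus = expected_spend_per_catch(Decimal("1.17"), hits=2, runs=30)

fable_rounded = fable.quantize(CENT, rounding=ROUND_HALF_UP)
opus_rounded = opus.quantize(CENT, rounding=ROUND_HALF_UP)
ratio = opus / fable

print(fable_rounded, opus_rounded)  # 4.40 17.55
print(ratio.quantize(CENT))         # 3.99
```

With only 2 catches in 30 runs, the Opus estimate rests on a very small count, so the ratio is best read as approximate.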
The post also notes that the result is not universal. On another target, a same-file exclamation-mark rule clash, Opus came out ahead 16 to 12. That leaves the comparison looking less like a blanket model ranking and more like a task-specific cost curve, where the meaningful metric is the cost of each bug that actually gets caught.
Source: X post


