OpenRouter launches Fusion API that "nears Fable-level performances", at "half the price"

OpenRouter has just rolled out Fusion API, a server-side “compound model” that fans prompts out to multiple models, then judges and synthesizes a final response. The company says it nears Fable-level performance on DRACO at about half the price, with limited benchmark coverage so far.

OpenRouter launches Fusion API that "nears Fable-level performances", at "half the price"

TL;DR

  • OpenRouter introduced Fusion API, a server-side compound model exposed as openrouter/fusion
  • Benchmarked on Perplexity DRACO: 100 hard research tasks, 10 domains, ~39 weighted criteria, negative weight for wrong answers
  • Claimed gains from panels: outperforming solo models, “beyond-frontier” via frontier panels, budget panels beating frontier at lower cost
  • Charted results: Fusion panels scored 93/100 tasks; several combinations in mid-to-high 60% range
  • Method: parallel prompts with web search + bash tools, judge extracts consensus/contradictions, synthesizer produces final answer
  • Integration: tools array {"type":"openrouter:fusion"}; supports custom participants/synthesizers; works anywhere OpenRouter is supported, including OpenCode

OpenRouter on Saturday introduced Fusion API, a server-side system the company describes as a compound model that can deliver “Fable-level intelligence at half the price.” The launch rests on a small set of benchmark claims, and OpenRouter itself cautions that the system has so far been evaluated on only one deep-research benchmark, which did not cover long-horizon tasks.

According to the company, Fusion was tested on 100 hard research tasks across 10 domains on Perplexity’s DRACO benchmark, with each task graded against roughly 39 weighted criteria and wrong answers carrying negative weight. OpenRouter claims panels of models “consistently outperform” individual models, that frontier panels can push beyond-frontier performance, and that budget panels can beat frontier models at materially lower cost.

The company’s charts place Fusion panels above solo models on a bar chart scoring 93 of 100 tasks, with several fusion combinations clustered in the mid-to-high 60% range. One panel pairing Gemini 3 Flash, Kimi K2.6, and DeepSeek V4 Pro is shown beating solo GPT-5.5 and solo Opus 4.8, while landing within 1% of Claude Fable 5 at roughly half the price, according to OpenRouter’s own figures.

OpenRouter also breaks down the reported lift behind Fusion: roughly three quarters appears to come from synthesis, and about one quarter from diversity. The company says the process fans a prompt out to several models in parallel with web search and bash tools enabled, then uses a judge model to extract consensus points, contradictions, partial coverage, unique insights and blind spots before a synthesizer writes the final response.

The evaluation was not free of complications. OpenRouter noted that when web search was first enabled, some models surfaced the DRACO rubric online, so the company excluded those domains and reran the benchmark; it says the published numbers come from that cleaned setup. In replies, the company also said the benchmarks were run earlier in the week, before Fable was taken down, and that cost comparisons included cache hits.

Fusion is exposed as a single server-side slug, openrouter/fusion, and OpenRouter also says developers can pass {"type": "openrouter:fusion"} in a tools array to let the system decide when to use it. The company says custom participant models and synthesizers can also be supplied, and that the setup works anywhere OpenRouter is supported, including OpenCode.

Source: OpenRouter

Continue the conversation on Slack

Did this article spark your interest? Join our community of experts and enthusiasts to dive deeper, ask questions, and share your ideas.

Join our community