OpenRouter adds Alibaba’s Qwen3.7-Max with prompt caching

OpenRouter has just rolled out Alibaba’s Qwen3.7-Max, positioning it as the flagship Qwen3.7 model for agent-centric coding, productivity, and long-horizon execution. The launch highlights claimed benchmark gains over Qwen3.6 and explicit prompt caching, as users press for more proof.

May 22, 2026

•

Qwen

TL;DR

Qwen3.7-Max live on OpenRouter: Alibaba Qwen’s Qwen3.7 flagship positioned for agent-centric coding and productivity tasks
Claimed benchmark gains vs Qwen3.6: “Big jumps” in coding and agent benchmarks highlighted in the announcement
Explicit prompt caching: Support for repeated context to reduce re-sending prompts; linked guide for Qwen caching usage
Community reaction: cautious optimism; requests for stronger evidence (tool calls, checkpoints, interventions, diff quality, cost per accepted change)
Deployment questions: latency comparisons vs Qwen3.6 raised; caching suggested to help agent loops by reducing repeated scaffolding
Thread-only performance claims: 115 TPS mentioned for Qwen3.7-Max vs 24 TPS for Qwen3.6 Max Preview

OpenRouter posted on X that Qwen3.7-Max from Alibaba Qwen is now live on its platform, describing it as the flagship of the Qwen3.7 series for "agent-centric" work such as coding, office and productivity tasks, and long-horizon autonomous execution. The post also claims "big jumps" in coding and agent benchmarks over Qwen3.6, alongside explicit prompt caching for repeated context.

OpenRouter also linked to a try-it page and a separate guide on using explicit caching with Qwen models.

The response from the thread was mostly cautious enthusiasm. One commenter called "agent-centric" the key phrase, while another argued that claims like these should come with more evidence, including recovered tool calls, checkpoint frequency, human interventions, final diff quality, and cost per accepted repo change.

Source: OpenRouter

Continue the conversation on Slack

Did this article spark your interest? Join our community of experts and enthusiasts to dive deeper, ask questions, and share your ideas.

Join our community

Hugging Face cofounder touts Qwen 27B on MacBook Pro

Hugging Face cofounder Julien Chaumond says running Qwen3.6 27B locally via Llama.cpp in Pi felt “pretty magical,” nearing Claude Opus for real code tasks. Replies quickly honed in on RAM, speed, and battery life trade-offs.

Apr 24, 2026

1 shared tag

Alibaba previews Qwen3.6-Max

Alibaba’s Qwen team has unveiled Qwen3.6-Max-Preview as an early look at its next flagship model. The pitch: stronger agentic coding, improved instruction following, and better “real-world” reliability—alongside hints that more Qwen3.6 models are coming.

Apr 20, 2026

1 shared tag

Continue the conversation on Slack

Related Articles

Hugging Face cofounder touts Qwen 27B on MacBook Pro

Alibaba previews Qwen3.6-Max