Qwen 3.5 Max Preview is starting to show up in leaderboard chatter again, with the Qwen team pointing to fresh placements that—at least on paper—suggest a meaningful step up in reasoning-heavy evaluations.
In a post on X, the official Qwen account said the “Max Preview” variant recently reached #3 in Math, landed in the Top 10 in “Arena Expert”, and placed Top 15 overall. The same post framed the release as a preview build and noted that work is already underway on “optimizing the preview experience,” with “sharper performance” promised later.
What the rankings are (and what people are asking for)
The announcement itself focused on rank positions rather than detailed methodology, but the replies quickly converged on practical questions—especially around where this model actually shows up in day-to-day tooling.
Several developers and users asked whether Qwen 3.5 Max Preview is available in Qwen Chat and Qwen Code, and whether it’s already accessible through an Alibaba API or Alibaba Cloud. Others asked for a blog post and benchmarks to accompany the leaderboard placements.
Coding performance comes up immediately
Even with a strong Math placement, multiple replies pressed on whether the preview model improves “actual coding logic” over prior versions. Another thread asked about performance on complex API integrations and nested function calls, pointing to context switching and multi-step coding workflows as areas where many models still stumble.
One reply also flagged a more operational concern: token generation speed, with a complaint that output throughput in a paid coding plan felt “ridiculously slow.”
Open source strategy remains a recurring question
A notable slice of the replies wasn’t about math or arena scores at all—it was about distribution and licensing. People asked whether Qwen will continue open-sourcing models, and whether there will be an open source Qwen 4. The thread, as posted, doesn’t include answers to those questions.
A cautious read on “Math #3” in a preview
Not everyone treated the overall rank as the headline. One reply argued that #3 in Math is the part to watch, since math-style evaluations can correlate with the structured reasoning that tends to matter in coding, analysis, and multi-step tasks. The same reply cautioned that rankings marked “Preliminary”, along with their confidence intervals, can still shift.
For now, what’s concrete is limited to those reported placements and the fact that Qwen is iterating on the preview.
Source: https://x.com/Alibaba_Qwen/status/2034658901321560549
