Qwen debuts 3.5 Medium models with Flash and 1M context

Qwen has just rolled out its Qwen 3.5 Medium Model Series, featuring open-weight releases and a hosted Flash option built for production. Highlights include a 1M context window, built-in tools, and “medium” sizes aimed at local inference.

TL;DR

  • Introduced Qwen 3.5 Medium Model Series: Qwen3.5-Flash, 35B-A3B, 122B-A10B, 27B
  • Positioning: more capability at lower compute, plus hosted option for production deployments
  • Qwen3.5-35B-A3B claimed to surpass Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B
  • Distribution: open weights on Hugging Face and ModelScope; Qwen3.5-Flash API https://t.co/82ESSpaqAF
  • Qwen3.5-Flash: hosted production variant aligned with 35B-A3B, 1M context by default, official built-in tools

One week after the announcement of the Qwen Plus model, Qwen has introduced the Qwen 3.5 Medium Model Series, a set of new models positioned around a familiar developer-friendly theme: more capability at lower compute, plus a hosted option aimed at production deployments.

The lineup includes Qwen3.5-Flash, Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B. In its announcement, Qwen highlighted that Qwen3.5-35B-A3B surpasses Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B, framing it as evidence that architecture, data quality, and RL can matter as much as scaling parameter counts.

What’s in the Qwen 3.5 Medium series

The emphasis this time is split between two tracks:

  • Open-weight releases via model hubs
  • A hosted “Flash” variant meant to be used directly as an API-backed model

Qwen links to distribution on both Hugging Face and ModelScope, alongside a dedicated Qwen3.5-Flash API endpoint at https://t.co/82ESSpaqAF.

Qwen3.5-Flash: long context and built-in tools

For teams thinking in terms of agents and tool-using workflows, Qwen3.5-Flash is described as the hosted production version aligned with 35B-A3B. Two details stand out in the published bullet points:

  • 1M context length by default
  • Official built-in tools
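
Neither the endpoint details nor the tool names have been documented in the announcement, but hosted Qwen models have historically been reachable through OpenAI-compatible chat APIs. As a minimal sketch, assuming such compatibility, a request opting into a built-in tool might look like this (the URL, the `qwen3.5-flash` model id, and the `web_search` tool type are all placeholders, not confirmed values):

```python
import json

# Assumed OpenAI-compatible endpoint -- not confirmed for Qwen3.5-Flash.
API_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build a chat-completion payload that opts into a built-in tool."""
    return {
        "model": "qwen3.5-flash",  # assumed model id
        "messages": [{"role": "user", "content": prompt}],
        # Built-in tools are advertised but not yet documented;
        # "web_search" here is a hypothetical placeholder.
        "tools": [{"type": "web_search"}],
        "max_tokens": 1024,
    }

payload = build_request("What changed in the Qwen 3.5 Medium series?")
print(json.dumps(payload, indent=2))
```

With a 1M-token default context, the practical difference is that the `messages` array can carry entire codebases or long transcripts without chunking, which is what makes the tool-using agent framing plausible for this tier.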

Qwen also provided links to run the models in Qwen Chat, including Flash (https://t.co/UkTL3JZxIK), 27B (https://t.co/haKxG4lETy), 35B-A3B (https://t.co/Oc1lYSTbwh), and 122B-A10B (https://t.co/hBMODXmh1o).

Why developers are paying attention: “medium” models for local inference

A notable thread in the replies centers on practicality: sizes that fit into local workflows without abandoning ambitious tasks. One response from Unsloth AI claims Qwen3.5-35B-A3B can run locally via GGUFs on a Mac or other machine with 24GB of RAM, pointing to https://t.co/5G4DYTCGtL. Others asked whether GGUFs could be shipped alongside the main release to shorten the gap between official weights and widely usable local quantizations.
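
The 24GB figure is plausible from back-of-envelope math. A 4-bit GGUF quantization averages roughly 4.8 to 4.9 bits per weight (an assumed average; real quant mixes vary per tensor), so 35B total parameters come to about 21 GB of weights, leaving a few GB for the KV cache, while the A3B design touches only ~3B active parameters per token:

```python
def gguf_size_gb(total_params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in decimal gigabytes."""
    return total_params_billions * 1e9 * bits_per_weight / 8 / 1e9

# ~4.85 bits/weight is an assumed average for a 4-bit GGUF mix.
weights_gb = gguf_size_gb(35, 4.85)
print(f"~{weights_gb:.1f} GB of weights")  # about 21 GB, inside a 24 GB budget
```

The mixture-of-experts split matters here: memory must hold all 35B parameters, but per-token compute scales with the 3B active ones, which is why a model of this total size can still feel responsive on consumer hardware.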

There were also repeated questions about smaller “tiny” variants (1B/3B/7B), along with general curiosity about how these models compare in agentic coding and tool-heavy settings—areas where Qwen explicitly says the 122B-A10B and 27B are “narrowing the gap” with frontier models, “especially in more complex agent scenarios.”

Source: https://x.com/Alibaba_Qwen/status/2026339351530188939

Continue the conversation on Slack

Did this article spark your interest? Join our community of experts and enthusiasts to dive deeper, ask questions, and share your ideas.
