Qwen3.5-397B-A17B is now out as the first open-weight release in the Qwen3.5 series, positioning the lineup around native multimodal capabilities and a systems-level focus on agent-style workloads. Alongside the open-weights drop, the broader family also includes Qwen3.5-Plus, described as a native vision-language model built on a hybrid linear-attention + sparse MoE design aimed at inference efficiency. It ships with an extended 1M-token context window for long-context multimodal reasoning and agent workflows.
What’s shipping in Qwen3.5-397B-A17B
Qwen's announcement centers on a model with 397B total parameters in an A17B configuration (by Qwen's naming convention, roughly 17B parameters activated per token), with the release framed explicitly around real-world agent behavior. The company highlights:
- Native multimodal support
- Training “for real-world agents”
- Hybrid linear attention + sparse MoE
- Large-scale RL environment scaling
- 201 languages & dialects
- Apache 2.0 licensing
- A reported 8.6x–19.0x decoding-throughput improvement over Qwen3-Max
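The "sparse MoE" bullet above refers to routing each token through only a few of many expert subnetworks. Qwen has not published its router details, so the following is a generic top-k routing sketch, not the model's actual implementation; the function and variable names are illustrative.

```python
import numpy as np

def moe_forward(x, gate_W, expert_Ws, k=2):
    """Top-k sparse MoE layer (sketch): x is a (d,) token vector, each
    expert is a (d, d) weight matrix. Only the k highest-scoring experts
    are evaluated, so active parameters stay far below total parameters."""
    logits = gate_W @ x                      # (E,) router scores, one per expert
    top = np.argsort(logits)[-k:]            # indices of the k chosen experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                             # gate weights renormalized over the top-k
    # Combine only the selected experts' outputs, weighted by the gate.
    return sum(wi * (expert_Ws[i] @ x) for wi, i in zip(w, top))
```

With, say, 8 experts and k=2, only a quarter of the expert parameters touch any given token, which is the mechanism behind a 397B-total / ~17B-active split.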
Separate follow-up posts also point to slides or figures covering efficiency, infrastructure, and both LM and VLM performance.
Why the architecture choice matters for agentic coding
The technical throughline here is efficiency on long-running generations. Observers in the thread note the intuition behind the combo: linear attention targets the quadratic costs that show up as contexts grow, while sparse MoE increases total capacity without activating all parameters every token—useful when a model is expected to sustain long rollouts across multi-step tasks.
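The quadratic-vs-linear contrast can be made concrete with a minimal NumPy sketch. This is the generic kernelized-attention idea (reassociating the matrix product so the sequence-length-squared score matrix never materializes), not Qwen's specific linear-attention variant; the feature map `phi` here is an arbitrary illustrative choice.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: the (n x n) score matrix makes cost quadratic in n.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Kernelized attention: associativity lets us form phi(K).T @ V once
    # as a (d x d) summary, so cost grows linearly in sequence length n.
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                     # (d, d), independent of n
    Z = Qp @ Kp.sum(axis=0)           # per-query normalizer, shape (n,)
    return (Qp @ KV) / Z[:, None]
```

The (d, d) summary also acts as a constant-size recurrent state during decoding, which is why this family of designs is attractive for the long rollouts agent workloads produce.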
That theme is echoed by multiple replies emphasizing throughput as the practical constraint for multi-agent coding pipelines, where latency and sustained decoding costs can dominate overall task time.
The ecosystem is already moving: Unsloth AI says it has published GGUF builds for local runs here: https://t.co/j5vIkGID5y
Open questions: requirements and agent evaluation
Notably, the replies quickly converge on practical deployment questions—especially minimum and recommended system requirements—plus requests for more complete agent evaluation artifacts, like tool-call traces or end-to-end task success suites across languages and multi-step workflows.
Original source: https://x.com/i/status/2023331062433153103
