OpenAI is rolling out its newest small-model lineup with GPT-5.4 mini and nano, aiming squarely at high-volume, low-latency production workloads—the sort that power coding assistants, subagent systems, and multimodal apps that need to respond quickly without leaning on the largest (and slowest) model tier.
Both models inherit much of GPT-5.4’s “Thinking” family strengths, but with an emphasis on speed, efficiency, and tool reliability. OpenAI positions them for scenarios where latency is part of the product: fast debugging loops, rapid tool calls, and computer-use flows that involve interpreting screenshots and acting on them.
GPT-5.4 mini: a faster mini model that tracks closer to full GPT-5.4
GPT-5.4 mini is described as more than 2× faster than GPT-5 mini while improving across coding, reasoning, multimodal understanding, and tool use. On headline evaluations, mini narrows the gap with the larger GPT-5.4 model:
- SWE-Bench Pro (Public): 54.4% (vs. 57.7% for GPT-5.4; 45.7% for GPT-5 mini)
- OSWorld-Verified: 72.1% (vs. 75.0% for GPT-5.4; 42.0% for GPT-5 mini)
- Toolathlon: 42.9% (vs. 54.6% for GPT-5.4; 26.9% for GPT-5 mini)
OpenAI frames this as a performance-per-latency play: the model is intended to stay responsive while still handling professional-grade coding tasks like targeted code edits, codebase navigation, front-end generation, and debugging loops.
A better fit for subagent architectures
A key theme is composing systems with mixed model sizes. In Codex, a larger model such as GPT-5.4 can take on planning and final judgment, delegating narrower subtasks to GPT-5.4 mini subagents—for example, searching a codebase, reviewing a large file, or processing supporting documents. OpenAI also links to its documentation on subagents in Codex: developers.openai.com/codex/subagents/.
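The delegation pattern described above can be sketched as a simple routing rule: route planning and final judgment to the large model, and narrower supporting subtasks to the mini model. This is a minimal illustration, not Codex's actual implementation; the model IDs and task categories are assumptions drawn from the examples in this article.

```python
# Illustrative routing for a mixed-size subagent setup: a larger model
# handles planning and final judgment, while narrower subtasks are
# delegated to the cheaper, faster mini model.

PLANNER_MODEL = "gpt-5.4"        # planning, final review
SUBAGENT_MODEL = "gpt-5.4-mini"  # parallel supporting work

# Subtask kinds a planner might hand off, per the examples above.
DELEGATED_KINDS = {"codebase_search", "file_review", "doc_processing"}

def pick_model(task_kind: str) -> str:
    """Send delegable subtasks to the mini model, all else to the planner."""
    return SUBAGENT_MODEL if task_kind in DELEGATED_KINDS else PLANNER_MODEL

for kind in ["plan", "codebase_search", "file_review", "final_judgment"]:
    print(f"{kind} -> {pick_model(kind)}")
```

In practice the planner would fan these subtasks out in parallel and fold the results back into its final answer; the win is that only the judgment-heavy steps pay the latency and cost of the large model.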
GPT-5.4 nano: the cheapest GPT-5.4 variant for simpler supporting work
GPT-5.4 nano is the smallest and lowest-cost GPT-5.4 option, positioned for classification, data extraction, ranking, and simpler coding subagent tasks—the “supporting” work that can run in parallel without requiring deep reasoning.
Even at this tier, OpenAI reports major benchmark gains versus GPT-5 mini. On SWE-Bench Pro (Public), nano scores 52.4%, ahead of GPT-5 mini’s 45.7%.
Better but costlier
GPT-5.4 mini is available today across the API, Codex, and ChatGPT. In the API it supports text and image inputs, tool use, function calling, web search, file search, computer use, and skills, with a 400k context window. Pricing is $0.75 per 1M input tokens and $4.50 per 1M output tokens, 2.25× GPT-5 mini's price and a meaningful step up in cost for teams running at scale.
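At the stated rates, the cost of running at scale is easy to estimate. A quick sketch using the prices above; the monthly token volumes are illustrative assumptions, not figures from the announcement.

```python
# Cost estimate at the stated GPT-5.4 mini prices
# ($0.75 per 1M input tokens, $4.50 per 1M output tokens).

INPUT_PRICE = 0.75 / 1_000_000   # dollars per input token
OUTPUT_PRICE = 4.50 / 1_000_000  # dollars per output token

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Total dollar cost for a month's token volume at the stated rates."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# e.g. a team pushing 2B input + 300M output tokens per month:
cost = monthly_cost(2_000_000_000, 300_000_000)
print(f"${cost:,.2f}")  # prints $2,850.00
```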
In Codex, GPT-5.4 mini is available in the app, CLI, IDE extension, and web, and it uses 30% of the GPT-5.4 quota.
In ChatGPT, GPT-5.4 mini is available to Free and Go users via the “Thinking” feature in the + menu; for other users it appears as a rate-limit fallback for GPT-5.4 Thinking.
GPT-5.4 nano is API-only, priced at $0.20 per 1M input tokens and $1.25 per 1M output tokens, 3× GPT-5 nano's price. For details on safeguards, OpenAI points to a system card addendum on its Deployment Safety Hub: appendix-gpt-5.4-mini.
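For the supporting work the article assigns to nano (classification, extraction, ranking), the stated prices imply a substantial per-call saving over mini. A rough sketch at those rates; the per-task token counts are illustrative assumptions.

```python
# Per-task cost of GPT-5.4 mini vs nano at the stated prices.

PRICES = {  # dollars per 1M (input, output) tokens, from the announcement
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def task_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single call at the stated per-1M-token rates."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A small extraction call: 4k tokens in, 500 tokens out.
mini = task_cost("gpt-5.4-mini", 4_000, 500)
nano = task_cost("gpt-5.4-nano", 4_000, 500)
print(f"mini: ${mini:.5f}  nano: ${nano:.5f}  ratio: {mini / nano:.1f}x")
```

For this workload shape, nano comes in at roughly a quarter of mini's per-call cost, which is the economics behind running such tasks in parallel at the nano tier.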
