Antigravity adds Gemini 3.5 Flash Low to cut tokens 45%

Antigravity has just rolled out Gemini 3.5 Flash (Low), aiming to use about 45% fewer tokens than the Medium setting while still topping Gemini 3 Flash (High) on SWE tasks. Product lead Varun Mohan also says Gemini quotas were reset for all plans after user feedback.

May 26, 2026

Gemini LLM Skills

TL;DR

New mode: Gemini 3.5 Flash (Low) added after feedback about high token consumption for simple tasks
Token efficiency: Internal tests show ~45% fewer tokens than Gemini 3.5 Flash (Medium)
SWE performance: Claimed to generally outperform Gemini 3 Flash (High) on SWE tasks
Quota update: Gemini quota reset across all plans, intended to provide more room for the next week
Implementation details: Effort level adjusted only; no changes to system prompt or context compaction
User feedback: Reports of reset issues; requests for higher image limits, tiered quotas, caching, plan mode, browser-agent improvements

Antigravity product lead Varun Mohan posted on X that the company is adding a new “Gemini 3.5 Flash (Low)” mode after hearing concerns that the product “consumes many tokens for simple tasks.” Mohan claimed internal testing shows the mode uses “around 45% fewer tokens” than Gemini 3.5 Flash (Medium) and “generally outperforms Gemini 3 Flash (High)” on SWE tasks.
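To put the claimed saving in concrete terms, the short sketch below applies a 45% reduction to a hypothetical token count. The 10,000-token figure and the function name are illustrative assumptions, not numbers or APIs from Antigravity; only the "around 45%" figure comes from Mohan's post, and actual savings will vary by task.

```python
def estimated_low_tokens(medium_tokens: int, reduction: float = 0.45) -> int:
    """Estimate token usage on Low from usage on Medium.

    `reduction` is the fractional saving; 0.45 reflects the "around 45%"
    figure quoted in the post and is an approximation, not a guarantee.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(medium_tokens * (1.0 - reduction))


# Hypothetical task that consumed 10,000 tokens on the Medium setting.
medium_tokens = 10_000
low_tokens = estimated_low_tokens(medium_tokens)
print(f"Medium: {medium_tokens} tokens, Low (estimated): {low_tokens} tokens")
```

Under these assumptions, a task that used 10,000 tokens on Medium would use roughly 5,500 on Low.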

In the same thread, Mohan also stated that Gemini quota had been reset across paid plans, later clarifying that the reset applies to “all plans.” He wrote that “everyone needs to build,” suggesting the quota refresh was meant to give users more room to work for the next week.

Mohan later addressed questions about why the change was made, saying the team had been “using the model for a while internally” but had a “blind spot in measuring token usage for a set of simpler tasks.” He added that the company had optimized for “making the product fast at solving complex tasks” and would “improve going forward.”

He also pushed back on speculation that the update cut corners on prompts or context handling. According to Mohan, the release “purely modifies the effort level for the model,” and “does not cut corners on the system prompt or context compaction.” For simpler work, he said, the goal is to “optimize cost,” while more complex tasks should use a “higher effort level.”

The replies suggest the announcement landed unevenly. Some users reported quota-reset problems, while others asked for higher image-model limits, separate quotas by model tier, and better handling of caching, plan mode, and browser-agent behavior. A few commenters also questioned whether “low” mode would keep its SWE performance edge once the token savings take effect.

Source: X post by Varun Mohan
