Antigravity adds Gemini 3.5 Flash Low to cut tokens 45%

Antigravity has rolled out Gemini 3.5 Flash (Low), a mode the company says uses about 45% fewer tokens than the Medium setting while still outperforming Gemini 3 Flash (High) on software engineering (SWE) tasks. Product lead Varun Mohan also says Gemini quotas were reset for all plans after user feedback.

TL;DR

  • New mode: Gemini 3.5 Flash (Low) added after feedback about high token consumption for simple tasks
  • Token efficiency: Internal tests show ~45% fewer tokens than Gemini 3.5 Flash (Medium)
  • SWE performance: Claimed to generally outperform Gemini 3 Flash (High) on SWE tasks
  • Quota update: Gemini quota reset across all plans, intended to provide more room for the next week
  • Implementation details: Effort level adjusted only; no changes to system prompt or context compaction
  • User feedback: Reports of reset issues; requests for higher image limits, tiered quotas, caching, plan mode, browser-agent improvements

Antigravity product lead Varun Mohan posted on X that the company is adding a new “Gemini 3.5 Flash (Low)” mode after hearing concerns that the product “consumes many tokens for simple tasks.” Mohan claimed internal testing shows the mode uses “around 45% fewer tokens” than Gemini 3.5 Flash (Medium) and “generally outperforms Gemini 3 Flash (High)” on SWE tasks.
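To put the claimed figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the 20,000-token baseline is a made-up number, the function names are hypothetical, and the only value taken from Mohan's post is the "around 45%" reduction.

```python
# Illustrative only: the baseline token count below is hypothetical,
# not a figure published by Antigravity or Google.
CLAIMED_REDUCTION = 0.45  # "around 45% fewer tokens", per Mohan's post


def tokens_after_reduction(baseline_tokens: int, reduction: float = CLAIMED_REDUCTION) -> int:
    """Return the token count left after applying a fractional reduction."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


def tokens_saved(baseline_tokens: int, reduction: float = CLAIMED_REDUCTION) -> int:
    """Return how many tokens the reduction removes from the baseline."""
    return baseline_tokens - tokens_after_reduction(baseline_tokens, reduction)


if __name__ == "__main__":
    medium_tokens = 20_000  # hypothetical usage for one task on Medium
    low_tokens = tokens_after_reduction(medium_tokens)
    print(f"Medium: {medium_tokens} tokens, Low (claimed): about {low_tokens} tokens")
    print(f"Tokens saved: about {tokens_saved(medium_tokens)}")
```

Under those assumptions, a task that used 20,000 tokens on Medium would use roughly 11,000 on Low. Actual savings would depend on the task, and the 45% figure comes from internal testing rather than an independent benchmark.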

In the same thread, Mohan also stated that Gemini quota had been reset across paid plans, later clarifying that the reset applies to “all plans.” He wrote that “everyone needs to build,” suggesting the quota refresh was meant to give users more room to work for the next week.

Mohan later addressed questions about why the change was made, saying the team had been “using the model for a while internally” but had a “blind spot in measuring token usage for a set of simpler tasks.” He added that the company had optimized for “making the product fast at solving complex tasks” and would “improve going forward.”

He also pushed back on speculation that the update cut corners on prompts or context handling. According to Mohan, the release “purely modifies the effort level for the model,” and “does not cut corners on the system prompt or context compaction.” For simpler work, he said, the goal is to “optimize cost,” while more complex tasks should use a “higher effort level.”

The replies suggest the announcement landed unevenly. Some users reported quota-reset problems, while others asked for higher image-model limits, separate quotas by model tier, and better handling of caching, plan mode, and browser-agent behavior. A few commenters also questioned whether “low” mode would keep its SWE performance edge once the token savings take effect.

Source: X post by Varun Mohan
