Microsoft plans GitHub Copilot token billing, tighter limits

Leaked docs say Microsoft is preparing a shift from request quotas to token-based billing for GitHub Copilot. The changes could also pause some new individual signups, tighten rate limits, and cut model access on lower-cost plans.

TL;DR

  • Billing shift planned: Move from request-based allowances to token-based billing tied to compute usage; timing unclear
  • Rising operating costs: Microsoft says the week-over-week cost of running Copilot has nearly doubled since January
  • Signup changes: Temporary pause planned for new signups to Student, Copilot Pro, and Copilot Pro+ tiers
  • Limits and trials: Further rate-limit tightening across individual plans (and some Business/Enterprise); paid individual trials suspended to fight abuse
  • Model access reductions: Anthropic Opus removed from $10 Copilot Pro; additional removals expected on Pro+
  • Recent/ongoing model changes: Claude Opus 4.6 Fast retired from Pro+; Opus 4.6/4.5 to be removed as Copilot shifts to Claude Opus 4.7

Microsoft is preparing a major set of changes to GitHub Copilot, including a move away from today’s request-based allowances toward token-based billing, alongside tighter rate limits and model access changes, according to leaked internal documents viewed by Where’s Your Ed At. Microsoft has also confirmed some of the details in a GitHub blog post about changes to Copilot plans for individuals.

The documents describe a cost-driven push: Microsoft says the week-over-week cost of running Copilot has nearly doubled since January, increasing the urgency behind changes that would more directly tie customer usage to compute spend.

From “requests” to token-based billing

Copilot’s individual plans currently use a “requests” system. GitHub’s documentation defines requests as interactions with Copilot, with plan allowances that include 300 requests/month for Copilot Pro ($10/month) and 1,500 requests/month for Copilot Pro+ ($39/month). Different models effectively “cost” different numbers of requests via multipliers, so heavier models consume more of a user’s allocation per interaction.

Under token-based billing, users would be charged based on tokens consumed (and the compute associated with them), rather than drawing down a flat “requests” bucket. The documents say it’s unclear when token-based billing will begin.
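To make the contrast concrete, here is a minimal sketch of the two accounting styles. The per-token rate, the multiplier, and the token counts below are hypothetical placeholders chosen for illustration; neither the leaked documents nor GitHub have published token prices, and the real scheme may differ (for example, by pricing input and output tokens separately).

```python
# Illustrative only: the rate, multiplier and token counts are hypothetical,
# not figures from the leaked documents or from GitHub.

def request_charge(multiplier: float) -> float:
    """Request-based model: one interaction draws down a fixed number of
    requests (the model's multiplier), regardless of how large it is."""
    return multiplier


def token_charge(tokens: int, usd_per_million_tokens: float) -> float:
    """Token-based model: the charge scales with tokens consumed."""
    return tokens * usd_per_million_tokens / 1_000_000


HYPOTHETICAL_RATE = 5.0  # USD per million tokens (assumed for illustration)

small_session = 2_000    # tokens in a short interaction (assumed)
large_session = 400_000  # tokens in a long agentic interaction (assumed)

# Under requests, both sessions cost the same one-request draw-down.
same_cost = request_charge(1.0) == request_charge(1.0)

# Under tokens, the large session costs 200 times as much as the small one.
small_usd = token_charge(small_session, HYPOTHETICAL_RATE)
large_usd = token_charge(large_session, HYPOTHETICAL_RATE)
```

The point of the sketch is the shape of the change, not the numbers: a flat per-interaction draw-down hides how much compute a heavy session uses, while a per-token charge passes that difference through to the user.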

New signups reportedly set to pause for some individual tiers

The leaked documents say Microsoft intends to temporarily pause new signups for Copilot’s Student offering and its paid individual tiers, Copilot Pro and Copilot Pro+.

More rate-limit tightening, fewer models on lower-cost plans

The documents also describe further rate-limit tightening across individual plans, plus some Copilot Business and Enterprise plans, and say Microsoft plans to suspend trials of paid individual plans to “fight abuse.”

On the model side, Microsoft intends to remove Anthropic’s Opus family from the $10/month Copilot Pro plan altogether, per the documents.

GitHub has already retired Claude Opus 4.6 Fast from Copilot Pro+, describing that change as part of service reliability and model-offer “streamlining.” Additional removals are expected: the documents say Opus 4.6 and Opus 4.5 will be removed from Pro+ “in the coming weeks” as Copilot transitions to Claude Opus 4.7.

GitHub has also been using premium request multipliers to reflect model cost. As one example in the documents, GPT-5.4 Mini carries a 0.33x multiplier, while the now-retired Opus 4.6 Fast had a 30x multiplier. For Opus 4.7, GitHub is offering a 7.5x request multiplier until April 30; the multiplier that will apply after that date has not been clarified.
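As a rough illustration of what those multipliers mean in practice, the sketch below divides a plan's monthly allowance by a model's multiplier. It uses only the allowances and multipliers quoted in this article, and it assumes each interaction deducts exactly the multiplier from the allowance; the function name is made up for this example, and it ignores rate limits and which models each plan actually offers.

```python
import math

# Monthly request allowances quoted in the article.
ALLOWANCES = {"Copilot Pro": 300, "Copilot Pro+": 1500}

# Request multipliers quoted in the article.
MULTIPLIERS = {
    "GPT-5.4 Mini": 0.33,
    "Claude Opus 4.7": 7.5,        # promotional rate until April 30
    "Claude Opus 4.6 Fast": 30.0,  # now retired
}


def interactions_available(plan: str, model: str) -> int:
    """Whole interactions with `model` before the plan's allowance runs out,
    assuming each interaction deducts exactly the model's multiplier."""
    return math.floor(ALLOWANCES[plan] / MULTIPLIERS[model])


mini_on_pro = interactions_available("Copilot Pro", "GPT-5.4 Mini")
opus47_on_pro_plus = interactions_available("Copilot Pro+", "Claude Opus 4.7")
opus46_fast_on_pro_plus = interactions_available("Copilot Pro+", "Claude Opus 4.6 Fast")
```

Under these assumptions, a Pro+ subscriber's 1,500 requests cover 200 Opus 4.7 interactions at the 7.5x rate, and would have covered only 50 with Opus 4.6 Fast at 30x, while a Pro subscriber's 300 requests stretch to roughly 909 interactions with GPT-5.4 Mini.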
