Models
General models
- Claude-3.7 Sonnet
- Claude-3.5 Sonnet
- Qwen-3 Coder 480B
- Gemini 2.5 Pro
- OpenAI-o3
- Grok-4
- GPT-4.1
- GPT-5-mini
- OpenAI-o4-mini
- OpenAI-o3-mini
- DeepSeek V3
- DeepSeek R1
Frontier models
- GPT-5
- Claude-4 Sonnet (1M Context Window)
Research models
- Claude-4 Opus
- Claude-4.1 Opus
- OpenAI-o3-pro
How do rate limits work?
Limits reset at the end of each billing cycle, and you can view your usage here.
- Free tier limits are restrictive and not designed for everyday use.
- Business plans have ~2.5x the limit for research and frontier models compared to Developer plans.
- Max plans have ~8x the limit for research and frontier models compared to Developer plans.
- Credit overages are adjusted regularly based on pooled usage. You’ll always be able to use at least your subscription amount, plus a generous additional allowance.
Background Agent
Background agent limits are stricter: they do not allow credit overages, because background agents are designed to scale horizontally and automate large volumes of code changes.
- Free tier limits are restrictive and not designed for everyday use.
- Business plans have ~2x the limit for background agents compared to Developer plans.
- Max plans have ~5x the limit for background agents compared to Developer plans.
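To make the plan ratios concrete, here is a minimal Python sketch of the arithmetic. It only encodes the approximate multipliers stated above for research and frontier models and for background agents. The function names, the category keys, and the `developer_baseline` value are hypothetical; the actual Developer limits are not given here, so the baseline is whatever number you choose to plug in.

```python
# Approximate multipliers relative to the Developer plan, as listed above.
# The "~" in the text means these are rough ratios, not exact quotas.
MULTIPLIERS = {
    "research_frontier": {"developer": 1.0, "business": 2.5, "max": 8.0},
    "background_agent": {"developer": 1.0, "business": 2.0, "max": 5.0},
}


def approx_limit(plan: str, category: str, developer_baseline: float) -> float:
    """Scale a hypothetical Developer baseline by the plan's approximate multiplier."""
    return developer_baseline * MULTIPLIERS[category][plan]


def allows_overage(category: str) -> bool:
    """Background agents do not allow credit overages; other usage may."""
    return category != "background_agent"


if __name__ == "__main__":
    baseline = 100.0  # hypothetical Developer limit, in arbitrary units
    for category, plans in MULTIPLIERS.items():
        for plan in plans:
            print(category, plan, approx_limit(plan, category, baseline))
```

With a baseline of 100 units, a Max plan works out to roughly 800 units for research and frontier models and roughly 500 units for background agents, with no overage available for the latter.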
What if I hit a limit?
You’ll be notified explicitly when a rate limit is hit and when it will reset for that model. You can:
- Use another model
- Wait for the rate limit to reset
- Upgrade to a higher tier
The next best model will be used automatically (e.g., Opus falls back to Sonnet) to avoid disruption. The choice is based on the given context, overall acceptance rates for each model, and speed.
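The automatic fallback described above could work along the following lines. This is an illustrative Python sketch only, not the product's actual selection logic: the `Candidate` fields, the weights, the scoring formula, and the model names and numbers in the usage example are all assumptions made up for the illustration. It skips models that are rate limited or cannot fit the current context, then ranks the rest by a weighted mix of acceptance rate and speed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str               # hypothetical model label
    acceptance_rate: float  # assumed: fraction of suggestions accepted, 0..1
    speed: float            # assumed: relative speed score, 0..1
    context_window: int     # assumed: maximum tokens the model accepts
    rate_limited: bool      # True if this model's limit has been hit


def pick_fallback(
    candidates: List[Candidate],
    context_tokens: int,
    w_accept: float = 0.7,  # hypothetical weight on acceptance rate
    w_speed: float = 0.3,   # hypothetical weight on speed
) -> Optional[Candidate]:
    """Return the highest-scoring usable model, or None if none qualifies."""
    usable = [
        c for c in candidates
        if not c.rate_limited and c.context_window >= context_tokens
    ]
    if not usable:
        return None
    return max(usable, key=lambda c: w_accept * c.acceptance_rate + w_speed * c.speed)


if __name__ == "__main__":
    # Placeholder values for illustration; not measurements of any real model.
    pool = [
        Candidate("opus-like", 0.9, 0.3, 200_000, rate_limited=True),
        Candidate("sonnet-like", 0.8, 0.6, 200_000, rate_limited=False),
        Candidate("small-like", 0.6, 0.9, 100_000, rate_limited=False),
    ]
    choice = pick_fallback(pool, context_tokens=50_000)
    print(choice.name if choice else "no model available")
```

In this example the rate-limited "opus-like" entry is skipped and "sonnet-like" is chosen, mirroring the Opus-to-Sonnet case mentioned above. Returning `None` when nothing qualifies leaves the caller to decide whether to wait for the reset or prompt for an upgrade.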