Cursor Max Mode: Extend Context for Large Codebases

By Learn Cursor team · Updated June 25, 2026

Max Mode expands the active context window to the largest size a given model supports — up to 1 million tokens for supported models. Turn it on per-request from the model selector in the chat or agent panel. It bills at token-based rates, so one Max Mode request costs significantly more than a normal one.

On this page

What does Max Mode actually do?
How do I turn on Max Mode?
When should I use Max Mode?
How does Max Mode relate to extended thinking?
How is Max Mode billed?
Where does Max Mode fit among the model categories?

What does Max Mode actually do?

Each model has a context window: the number of tokens it can read in one pass. In normal mode, Cursor uses a working window that fits most tasks efficiently. Max Mode raises the ceiling to the model's technical maximum. Claude Sonnet 4.6, for example, goes from 200k to 1M tokens; other models rise to their own maximum, which varies by model.

A bigger context window means the agent can hold more of your codebase in view simultaneously — useful when a change touches many files, when you're debugging a sprawling call chain, or when you need the agent to reason about the whole repo without missing dependencies.
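To get a rough feel for whether a body of code fits a standard window or needs the larger one, you can estimate tokens from character counts. The sketch below is an illustration only and not part of Cursor: it assumes roughly 4 characters per token (a common rule of thumb; real tokenizers vary by model and content), the helper names are hypothetical, and the two window sizes are the Claude Sonnet 4.6 figures quoted above.

```python
from pathlib import Path

# Assumption: ~4 characters per token. Real tokenizers vary by model and content.
CHARS_PER_TOKEN = 4

STANDARD_WINDOW = 200_000    # standard window quoted above for Claude Sonnet 4.6
MAX_MODE_WINDOW = 1_000_000  # Max Mode window quoted above for Claude Sonnet 4.6


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def estimate_repo_tokens(root: str, suffixes: tuple = (".py", ".ts", ".md")) -> int:
    """Sum rough token estimates over matching files under root (hypothetical helper)."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += estimate_tokens(path.read_text(encoding="utf-8", errors="ignore"))
    return total


def window_needed(tokens: int) -> str:
    """Classify a token estimate against the two window sizes."""
    if tokens <= STANDARD_WINDOW:
        return "standard"
    if tokens <= MAX_MODE_WINDOW:
        return "max"
    return "too large"
```

For example, `window_needed(estimate_repo_tokens("."))` gives a first guess for the current directory. The agent does not necessarily load every file, so treat the result as an upper bound on what a task might need rather than a prediction.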

Max Mode costs more per request

Because Max Mode uses far more tokens per call, a single agent turn can consume substantially more of your usage budget. Enable it for genuinely large tasks; for most everyday coding, the standard window is sufficient.

How do I turn on Max Mode?

1. Open the model selector in your chat or agent panel (click the model name at the top of the panel).
2. Find the model you want to use and toggle Max Mode on.
3. The toggle applies per-session: you can turn it on for a large refactor and off again for quick questions.

Max Mode is available for all state-of-the-art models in Cursor. The exact context ceiling depends on the model; the model selector shows the Max Mode window size next to each option.

When should I use Max Mode?

Task	Max Mode needed?
Fix a single bug in one file	No — standard window handles it
Refactor a function and its callers across 10+ files	Yes — benefits from seeing all call sites at once
Add a feature to a large monorepo with shared types	Yes — prevents missed dependencies
Debug a request that spans API, service and DB layers	Yes — agent needs the full chain in view
Write a new component with clear isolated scope	No — standard context is more efficient
Generate documentation for the whole repo	Yes — complete picture reduces gaps

Use Max Mode for tasks where missing context causes wrong assumptions, not just for any large file.

How does Max Mode relate to extended thinking?

Some models support thinking mode — an extended reasoning pass before producing output. For Sonnet 4.6 and Opus 4.6, Cursor exposes this as a separate thinking variant in the model selector. Thinking and Max Mode are independent: you can use Max Mode without thinking enabled, and vice versa.

Two independent axes

Max Mode: Increases the context window — how much code the model can see at once.
Thinking / Extended thinking: Increases the reasoning depth — how much the model deliberates before answering.
Max Mode + Thinking: Large context and deep reasoning together. Highest cost; best for the hardest architectural problems.
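One way to picture the independence of the two settings is as two separate boolean flags. The sketch below is a hypothetical model for illustration; `RequestSettings` is not a Cursor API, and the labels are invented for this example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RequestSettings:
    """Hypothetical model of the two toggles; not a Cursor API."""
    max_mode: bool  # larger context window
    thinking: bool  # extended reasoning pass

    def describe(self) -> str:
        context = "max context" if self.max_mode else "standard context"
        reasoning = "extended reasoning" if self.thinking else "standard reasoning"
        return f"{context}, {reasoning}"


# The two settings are independent, so all four combinations are valid.
combinations = [RequestSettings(m, t) for m, t in product([False, True], repeat=2)]
for settings in combinations:
    print(settings.describe())
```

Because neither flag constrains the other, the loop prints four distinct combinations, from standard context with standard reasoning up to max context with extended reasoning.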

How is Max Mode billed?

Max Mode uses the standard token-based rates — there is no separate 'Max' surcharge. The cost is high simply because each request uses many more tokens. Sonnet 4.6 in Max Mode bills at the same per-token rate as normal mode, including when context exceeds 200k; there is no long-context multiplier for Sonnet 4.6.
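With a flat per-token rate, cost grows in direct proportion to the tokens read. The sketch below illustrates that relationship only: the rate is a placeholder, not Cursor's or any provider's actual price, the token counts are made-up examples, and output tokens are ignored for simplicity.

```python
# Placeholder rate for illustration only; not an actual price.
HYPOTHETICAL_RATE_PER_MILLION_INPUT_TOKENS = 3.00  # dollars


def input_cost(tokens: int,
               rate_per_million: float = HYPOTHETICAL_RATE_PER_MILLION_INPUT_TOKENS) -> float:
    """Cost of reading `tokens` input tokens at a flat per-token rate."""
    return tokens / 1_000_000 * rate_per_million


normal = input_cost(150_000)    # example request inside the standard window
max_mode = input_cost(800_000)  # example request using most of a 1M window

# Same rate, more tokens: the ratio of costs equals the ratio of tokens.
print(f"normal: ${normal:.2f}, max mode: ${max_mode:.2f}")
print(f"cost ratio: {max_mode / normal:.1f}x")
```

The rate never changes between the two calls; only the token count does, which is why a Max Mode request costs more without any surcharge.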

Check your spend cap before long Max Mode sessions

If you have a monthly spend cap set, a complex Max Mode agent run can exhaust it faster than expected. Check Settings → Usage for your current consumption. Verify current pricing at cursor.com/docs/models-and-pricing.

Where does Max Mode fit among the model categories?

It helps to think of Cursor's model options in three buckets. Standard models answer quickly at a normal context window and are your everyday driver. Thinking models add an extended reasoning pass before they answer. Max is not a model; it is a per-request toggle that pushes whichever model you picked up to its largest context window. So Max Mode sits on top of a standard or thinking model rather than replacing it.

Standard: Normal context window. Fast, cheapest per request. Use for the bulk of edits and questions.
Thinking: Extended reasoning before output. Better on tricky logic and planning. Costs more because of the extra deliberation.
Max: A toggle, not a model. Pushes context to the model's ceiling. Reserve for genuinely large-scope tasks.

What a Max Mode chat actually costs in request terms

Cursor's usage is metered in model spend, but a handy mental model is that one normal request maps to roughly 8 cents of usage. A single Max Mode chat reads far more tokens per turn, and an agent chat is many turns, so the same conversation can map to around 30 requests of equivalent spend (roughly $2.40 at 8 cents each). That is the gap people notice when one Max Mode session draws down their allowance the way about 30 normal requests would.

Rough request-mapping math

One normal request: ≈ 8 cents of usage.
One Max Mode chat: Can map to ~30 requests of equivalent spend.
Why: Far more tokens read per turn, multiplied across the turns in an agent chat.

An approximation to build intuition, not an official rate card. Verify live pricing at cursor.com/docs/models-and-pricing.
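The mapping above is simple arithmetic, sketched here in Python. The two constants are the rough figures from this section, not official rates; the helper names and the $20 budget in the usage note are hypothetical.

```python
NORMAL_REQUEST_CENTS = 8          # rough figure from the text above
MAX_CHAT_REQUEST_EQUIVALENT = 30  # rough figure from the text above


def max_chat_cost_dollars(request_equivalent: int = MAX_CHAT_REQUEST_EQUIVALENT,
                          cents_per_request: int = NORMAL_REQUEST_CENTS) -> float:
    """Convert a request-equivalent count into approximate dollars."""
    return request_equivalent * cents_per_request / 100


def chats_within_budget(budget_dollars: float) -> int:
    """How many such Max Mode chats fit within a budget (hypothetical helper)."""
    cost_cents = MAX_CHAT_REQUEST_EQUIVALENT * NORMAL_REQUEST_CENTS
    return int(round(budget_dollars * 100)) // cost_cents
```

Under these rough numbers, `max_chat_cost_dollars()` comes to 2.4 (30 × 8 cents), and `chats_within_budget(20)` gives 8 for a hypothetical $20 budget.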

Bigger context is not always better

More context can actually degrade accuracy: the model has more to wade through and can lose the thread among less-relevant code. So Max Mode is selective by design — reach for it on tasks that genuinely need the whole picture, and often pair it with plan mode so the large context is spent on producing a solid plan rather than a sprawling, hard-to-review edit.

Frequently asked questions

Why did one Max Mode chat cost 30 requests?

Because Max Mode reads far more tokens per turn, and an agent chat is many turns. If a normal request is roughly 8 cents of usage, a single Max Mode conversation can read enough tokens across its turns to map to about 30 normal requests of equivalent spend. Nothing is broken — the meter is just counting the much larger volume of tokens the bigger context window pulled in.

Does Max Mode make the agent more accurate or just give it more context?

Primarily more context. The model's intelligence is the same; what changes is how much of your codebase it can see in one pass. On large codebases, that often improves accuracy by reducing the chance the agent misses a dependency or type definition, although context that is not relevant to the task can also make the model lose the thread.

Can I use Max Mode with every model in Cursor?

Max Mode is available for all state-of-the-art models Cursor supports, though the maximum window size varies by model. The model selector shows Max Mode availability and the window size for each option.

Is Max Mode on by default?

No — it is off by default to keep costs predictable. You toggle it per-session in the model selector. Once toggled on, it stays on for that session until you turn it off.

Will Max Mode use my full usage quota faster?

Yes. A single complex Max Mode request can use the token equivalent of many standard requests. If you are close to a usage cap or spend limit, enable Max Mode deliberately for the tasks that need it.

Sources & last verified

Cursor ships frequently. Facts verified against primary sources on June 25, 2026.
