DeepSeek V4 in Cursor: Complete Setup + the Composer Caveat Fix (2026)

Quick answer. In Cursor, open Settings → Models → Add Model, set the name to deepseek-v4-pro (or deepseek-v4-flash), point the OpenAI Base URL to https://api.deepseek.com with no /v1 suffix, paste your DeepSeek key, and click Verify. Use Chat for reasoning visibility — Cursor’s Composer panel hides DeepSeek’s reasoning_content field.

DeepSeek shipped V4 Pro and V4 Flash on April 24, 2026, and both are now first-class options for Cursor users who want a cheap, reasoning-capable alternative to Composer 2, Claude Opus 4.7, or GPT-5.5. The wiring itself is five fields in a settings panel. The footguns are everything around it: the legacy deepseek-chat alias retires on July 24, 2026, Cursor’s Composer panel doesn’t render the model’s thinking trace, and the cheaper Flash tier covers most coding sessions if you size it correctly. This guide walks the setup, the decision tree, and the workarounds — in that order.

Why use DeepSeek V4 in Cursor at all?

The pitch is cost-per-reasoning-token, not raw IQ. DeepSeek V4 Pro is a 1.6T-parameter Mixture-of-Experts model with 49B active and a 1M-token context. On the public API, list pricing is $1.74 per million cache-miss input tokens and $3.48 per million output tokens, with cache hits at a fraction of that. Through May 31, 2026, DeepSeek is running a 75% promo: $0.435 input and $0.870 output per million. V4 Flash, the 284B/13B-active sibling, lists at $0.126 input and $0.252 output per million — an order of magnitude cheaper than the premium proprietary tier.

Compared to your other Cursor options:

  • Composer 2 — Cursor’s in-house coding model, post-trained on top of Kimi K2.5. Tightly integrated with the agent loop, but you pay Cursor’s per-request rates and there’s no transparency into the underlying reasoning trace.
  • Claude Opus 4.7 and GPT-5.5 — premium agentic tier, strongest tool-use, but multiples of V4’s per-token cost.
  • DeepSeek V4 — cheaper than both, with explicit thinking-mode reasoning, a 1M context that holds most monorepos, and an OpenAI-compatible API that drops straight into Cursor’s custom-model field.

The trade-off you are accepting: Cursor’s native chat and agent panels will use V4, but Tab autocomplete stays on Cursor’s proprietary fast model, and Background Agents do not currently route to DeepSeek. If you live in Tab and Background Agents, V4 doesn’t change much for you. If you live in Chat and the Composer agent, it changes the bill.

How do you configure DeepSeek V4 as a custom model in Cursor?

Cursor accepts any OpenAI-compatible endpoint as a custom model. DeepSeek’s API is OpenAI-compatible, so the wiring is mechanical. From a clean install:

  1. Grab a DeepSeek API key. Sign in at platform.deepseek.com, create an API key, and fund the wallet (a $5 top-up covers more than a week of agent-heavy coding on Flash). Keys are sk-prefixed and project-scoped.
  2. Open Cursor Settings. Press Cmd + , on macOS or Ctrl + , on Windows/Linux, then navigate to Models.
  3. Enable the base URL override. In the Models settings, scroll to the OpenAI API key section and toggle Override OpenAI Base URL. You don’t need an actual OpenAI key; the same field will hold your DeepSeek key in step 5.
  4. Set the OpenAI Base URL to exactly https://api.deepseek.com. Do not append /v1 — Cursor adds the /chat/completions path itself. The single most common setup failure is a duplicate /v1/v1 path.
  5. Paste your DeepSeek key in the OpenAI API key field. Cursor stores it locally and forwards it as a Bearer token.
  6. Add a custom model. In the Model Names section, click + Add model and enter exactly deepseek-v4-pro for the reasoning tier or deepseek-v4-flash for the cheap tier. Both names are case-sensitive and must match the DeepSeek API model IDs exactly. Do not use deepseek-v4 — that alias does not exist.
  7. Click Verify. Cursor fires a tiny completion request. A green check means the key, base URL, and model name all line up. A 404 means the base URL is wrong (usually a trailing /v1). A 401 means the key is wrong. A model-not-found error means the model name is wrong.
  8. Select the model in the chat dropdown. The new entry appears alongside the built-in options. Switch to it before opening a thread.

If you want to use both V4 Pro and V4 Flash side-by-side, add them as two separate entries and pick per-thread. They share the same key and base URL.
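
If Verify fails and you want to rule Cursor itself out, hit the same endpoint directly with the same three values. Here is a minimal sanity check with the openai Python SDK, assuming the model IDs above (fill in your own key):

    # Standalone check of key, base URL, and model name: the same three
    # values Cursor's Verify button exercises.
    from openai import OpenAI

    client = OpenAI(
        api_key="sk-...",                     # your DeepSeek key
        base_url="https://api.deepseek.com",  # no /v1 suffix, exactly as in Cursor
    )

    resp = client.chat.completions.create(
        model="deepseek-v4-pro",              # or deepseek-v4-flash; must match the Cursor entry
        messages=[{"role": "user", "content": "Reply with the single word: ok"}],
        max_tokens=8,
    )
    print(resp.choices[0].message.content)
    # 404 -> base URL problem, 401 -> key problem, model-not-found -> model name problem.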

Should you use V4 Pro or V4 Flash?

Default to V4 Flash. Reach for V4 Pro when the task is reasoning-bound, not throughput-bound.

Dimension                            | V4 Pro                                                                | V4 Flash
Architecture                         | 1.6T MoE, 49B active                                                  | 284B MoE, 13B active
Context                              | 1M tokens                                                             | 1M tokens
List input price                     | $1.74 / 1M (cache miss)                                               | $0.126 / 1M
List output price                    | $3.48 / 1M                                                            | $0.252 / 1M
Promo input (through May 31, 2026)   | $0.435 / 1M                                                           | Already discounted
Best for                             | Architecture review, hairy bugs, multi-file refactors with reasoning  | Inline edits, scaffolding, docstrings, tests, daily coding loop

Cost-per-task examples. A typical Composer agent run that consumes ~20K input and ~3K output tokens costs about $0.011 on V4 Pro at promo pricing, or $0.003 on Flash. A full-day agent-heavy session of ~150 such turns costs roughly $2 on Pro or under $0.50 on Flash. Public reports from teams running Cline + Flash inside Cursor put a five-day-a-week coding habit at under $1 a day in API spend.
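
The arithmetic behind those figures, as a quick sketch you can rerun with your own token counts (prices are the promo Pro rates and Flash list rates quoted above):

    # Back-of-envelope cost per Composer turn at the per-million-token prices above.
    PRICES = {
        "deepseek-v4-pro (promo)": {"in": 0.435, "out": 0.870},  # $ per 1M tokens
        "deepseek-v4-flash":       {"in": 0.126, "out": 0.252},
    }

    def turn_cost(model, input_tokens, output_tokens):
        p = PRICES[model]
        return (input_tokens * p["in"] + output_tokens * p["out"]) / 1_000_000

    for model in PRICES:
        per_turn = turn_cost(model, 20_000, 3_000)  # the ~20K-in / ~3K-out turn above
        print(f"{model}: ${per_turn:.4f}/turn, ${per_turn * 150:.2f} for a 150-turn day")
    # deepseek-v4-pro (promo): $0.0113/turn, $1.70 for a 150-turn day
    # deepseek-v4-flash: $0.0033/turn, $0.49 for a 150-turn day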

For most loops — rename, generate test, refactor function — Flash is indistinguishable from Pro and ten times cheaper. Switch up to Pro when you’re stuck on a real bug and you want the model to actually think out loud.

Why doesn't the model's reasoning show up in Composer?

This is the caveat the existing guides bury and the one that wastes the most time. DeepSeek V4’s thinking mode returns two fields on each streamed chunk: content (the user-facing answer) and reasoning_content (the model’s chain of thought). Cursor’s Chat panel renders both. Cursor’s Composer (agent) panel only renders content.

From a user perspective, the symptom is: in Chat, you see a collapsible reasoning block followed by the answer. In Composer, you see the answer with no reasoning trace — and on long tool-call chains, the request can fail outright with a 400 from DeepSeek. The deeper cause is that DeepSeek’s thinking-mode tool-call protocol requires the full reasoning_content chain to be replayed on each subsequent request. Cursor strips that field when packaging the next agent turn, so DeepSeek rejects the request.
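
What the two fields look like on the wire, sketched with the openai SDK in streaming mode (client setup as in the sanity check earlier; field behavior as described above):

    # Each streamed chunk carries two deltas: reasoning_content (the thinking
    # trace, rendered only in Chat) and content (the answer, rendered in both
    # Chat and Composer).
    from openai import OpenAI

    client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

    stream = client.chat.completions.create(
        model="deepseek-v4-pro",
        messages=[{"role": "user", "content": "Why does this test flake?"}],
        stream=True,
    )

    thinking, answer = [], []
    for chunk in stream:
        delta = chunk.choices[0].delta
        if getattr(delta, "reasoning_content", None):  # hidden by Composer
            thinking.append(delta.reasoning_content)
        if delta.content:                              # shown in both panels
            answer.append(delta.content)

    print("reasoning_content:", "".join(thinking)[:200])
    print("content:", "".join(answer))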

Three pragmatic workarounds, in order of effort:

  1. Use Chat for reasoning, Composer for execution. When you want to see the model think — debugging, design review, anything where the chain-of-thought is the deliverable — do it in Chat. Switch to Composer only when you want hands on the file system. This is the zero-effort answer and it’s the right default for most users.
  2. Disable thinking mode for Composer. If you have V4 Pro selected and you’re only seeing 400s on tool-heavy agent loops, switch the model entry to V4 Flash without thinking, or to V4 Pro with thinking disabled at the API layer. You lose the reasoning gain but you get a stable agent.
  3. Route through a proxy that injects reasoning_content back into outgoing requests. The community has shipped one (yxlao/deepseek-cursor-proxy on GitHub). Run it locally, point Cursor at http://localhost:<port> instead of api.deepseek.com, and the proxy caches the reasoning trace per session and replays it on the next tool-call turn. Composer then accepts the full agent loop, and the proxy also surfaces the thinking tokens as collapsible Markdown details in Cursor. Highest effort, most complete fix.
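
The core of that third workaround fits in a few dozen lines. Here is a simplified, non-streaming sketch of the re-injection idea, assuming the replay protocol described above; the community proxy also handles streamed chunks and proper session tracking:

    # Minimal reasoning_content re-injection proxy: point Cursor's base URL at
    # http://localhost:8787 and it forwards to DeepSeek, caching each turn's
    # reasoning trace and re-attaching it to the assistant message that Cursor
    # strips on the next tool-call turn. Naive cache keying, for illustration only.
    from flask import Flask, jsonify, request
    import requests

    app = Flask(__name__)
    UPSTREAM = "https://api.deepseek.com/chat/completions"
    reasoning_cache = {}  # assistant content -> reasoning_content

    @app.post("/chat/completions")
    def proxy():
        body = request.get_json()

        # Re-inject the cached trace that Cursor dropped when packaging this turn.
        for msg in body.get("messages", []):
            if msg.get("role") == "assistant" and "reasoning_content" not in msg:
                cached = reasoning_cache.get(msg.get("content") or "")
                if cached:
                    msg["reasoning_content"] = cached

        resp = requests.post(
            UPSTREAM,
            json=body,
            headers={"Authorization": request.headers.get("Authorization", "")},
            timeout=600,
        )
        data = resp.json()

        # Cache this turn's trace for the follow-up tool-call request.
        for choice in data.get("choices", []):
            msg = choice.get("message", {})
            if msg.get("reasoning_content"):
                reasoning_cache[msg.get("content") or ""] = msg["reasoning_content"]

        return jsonify(data), resp.status_code

    if __name__ == "__main__":
        app.run(port=8787)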

Cursor has acknowledged the issue on the forum; a first-class fix on their side would remove the workaround entirely. Until then, knowing which panel renders which field is the single best Cursor + V4 trick.

Companion guide

For DeepSeek V4 depth — Pro vs Flash, pricing, benchmarks, local setup — see our DeepSeek V4 complete guide for 2026.

What's the July 24, 2026 deadline?

DeepSeek is retiring the legacy deepseek-chat and deepseek-reasoner model aliases at 15:59 UTC on July 24, 2026. During the grace period both names still resolve — they’re transparently routed to V4 Flash today — but after the cutoff, any request using the old aliases returns a hard error and your Cursor sessions stop working.

If you are coming back to a setup you wired up before April 24, 2026, the migration is a one-line change in the Cursor model entry:

Before (retires July 24, 2026)   | After
deepseek-chat                    | deepseek-v4-flash (non-thinking)
deepseek-reasoner                | deepseek-v4-flash (thinking) or deepseek-v4-pro

Delete the old custom-model entry in Cursor, add the new one, click Verify. Cost-wise you’re moving onto Flash pricing automatically; if you want the bigger model you swap to deepseek-v4-pro instead. Do this once, before the deadline, and never think about it again.

How does V4 in Cursor compare to Composer 2?

Composer 2 is Cursor’s in-house coding model, post-trained on top of Kimi K2.5 with continued pretraining and large-scale reinforcement learning on real Cursor agent traces. Cursor confirmed the Kimi K2.5 base in March 2026 after a developer intercepted the model ID in API traffic. On CursorBench, Composer 2 scores 61.3 — competitive with the strongest frontier models on Cursor’s own benchmark.

The practical comparison:

  • Composer 2 is RL-tuned to Cursor’s exact agent harness. Tool calls feel snappy, multi-step plans rarely derail, and you don’t deal with reasoning_content edge cases.
  • DeepSeek V4 gives you an explicit reasoning trace (in Chat), runs an order of magnitude cheaper than the premium tier, and lets you bring your own key with a real audit trail of every prompt and token.

If you want the smoothest in-Cursor agent experience and don’t care about the bill, stay on Composer 2. If you want to see the model think, pay per token instead of per request, and have one provider you can also call from outside Cursor (CI scripts, custom agents, local tooling), wire up V4. They’re not exclusive — many teams keep Composer 2 as default and switch to deepseek-v4-pro for hard bugs.

Working with multi-provider AI workflows

If your team is wiring DeepSeek V4 into Cursor, Cline, Aider, or a custom agent harness, the hard part isn’t the wiring — it’s the inference economics, prompt caching, and reasoning-trace plumbing across multiple providers without burning budget. Codersera places vetted engineers experienced with DeepSeek, Anthropic, and OpenAI inference stacks, agent orchestration, and the operational glue that keeps a multi-provider setup cheap and observable. If you’re hiring, we can shortlist developers who’ve already shipped this exact integration.

FAQ

Does Cursor Tab autocomplete use DeepSeek V4?

No. Even with V4 selected as your chat and agent model, Tab autocomplete continues to use Cursor’s proprietary fast model. The custom-model setting only routes Chat and the Composer agent.

Can I use DeepSeek V4 with Cursor Background Agents?

Not as of May 2026. Background Agents do not currently support custom DeepSeek models; they run on Cursor’s built-in models. Track the Cursor forum for updates.

Why do I get a 400 error on long tool-call chains?

DeepSeek V4’s thinking mode requires the full reasoning_content chain to be replayed on each subsequent request. Cursor’s Composer drops the field on follow-up turns, so DeepSeek rejects the call. Use Chat instead, disable thinking mode, or route through a proxy that re-injects reasoning_content.

Is the context window really 1M tokens?

The DeepSeek API supports 1,048,576-token contexts on V4 Pro and V4 Flash. Cursor itself enforces lower per-request context caps depending on plan and panel — in practice you’ll hit Cursor’s ceiling well before DeepSeek’s.

Do I need to add /v1 to the base URL?

No. Set the OpenAI Base URL to exactly https://api.deepseek.com. Cursor (and DeepSeek’s OpenAI-compatible router) handle the /chat/completions path internally. Adding /v1 produces a 404 on Verify.

Will deepseek-chat keep working until July 24, 2026?

Yes, but it transparently routes to V4 Flash already. You’re effectively on V4 today whether the model entry says deepseek-chat or not. After 15:59 UTC on July 24, 2026, requests using the legacy alias return an error. Migrate to deepseek-v4-flash or deepseek-v4-pro before then.

How much does a typical day of V4 Flash coding cost?

For an agent-heavy day in Cursor — on the order of 100–150 Composer turns, mostly under 30K tokens each — expect well under $1 in DeepSeek API spend on V4 Flash, or roughly $2 on V4 Pro at promo pricing. Cache hits drop that further if you reuse system prompts across turns.