Mistral Medium 3.5 + Le Chat Work Mode (May 2026)

Quick answer. Mistral Medium 3.5 is a 128B-parameter dense model released April 30, 2026 that folds chat, reasoning, and coding into a single set of weights with a 256k context window. It also powers Work Mode in Le Chat (currently in Preview) — an agent that executes multi-step workflows across email, calendar, documents, Jira, GitHub, and Slack with explicit approval before any sensitive action. API pricing is $1.50 / $7.50 per million input/output tokens, Le Chat Pro is $14.99/month, Team is $24.99/user/month, and Enterprise is custom. The interesting bit isn't the benchmark numbers (77.6% SWE-Bench Verified is respectable, not category-defining) — it's that Mistral is the only frontier lab shipping open weights for a model this capable, with EU-native hosting and no CLOUD Act exposure.

Mistral spent most of 2025 looking like it was being lapped by the American frontier labs. Mistral Large 3 in December was solid but didn't dent the conversation. Then in late April 2026 they shipped two things at once: Mistral Medium 3.5, a 128B dense model that consolidates their reasoning/chat/coding stack into one set of weights, and Work Mode in Le Chat (currently in Preview, rolling out across Free / Pro / Team) — their answer to ChatGPT's agentic tooling and Claude's MCP push. This piece walks through what's actually new, how the pricing breaks down, and where Mistral makes sense versus the bigger names.

What is Mistral Medium 3.5 and why does it matter?

Mistral Medium 3.5 is a 128-billion-parameter dense transformer released on April 30, 2026, with a 256k context window, vision input, and configurable reasoning effort that lets the model toggle between instant-reply and deeper test-time-compute modes. Crucially, all 128B parameters fire on every token — this is not a mixture-of-experts model where only a subset activates per forward pass.

The naming is a bit confusing if you've been following the lineage. "Medium" sits below Mistral Large 3 in the brand hierarchy, but architecturally it's a different beast: Large 3 is a sparse 41B-active / 675B-total MoE, while Medium 3.5 is a smaller-but-dense 128B model. In practice that means Medium 3.5 is easier to self-host (4-GPU setup with reasonable quantisation), more predictable in latency, and according to Mistral's own benchmarks more capable on agentic and coding tasks. Mistral officially published just two benchmark numbers at launch: 77.6% on SWE-Bench Verified (real GitHub issue patches) and 91.4% on τ³-Telecom (agentic tool use in specialised environments). MMLU, GSM8K, and GPQA scores floating around the discourse are community-reported, not Mistral-published — treat them accordingly.

What makes Medium 3.5 strategically interesting is consolidation. It replaces three previous Mistral models in the production stack:

Mistral Medium 3.1 — the previous general-purpose model in Le Chat
Magistral — their dedicated reasoning model
Devstral 2 — the coding model powering the Vibe agent

One model now does all three jobs. That's the same architectural bet Anthropic made with Claude 4.x (one model with thinking-on-demand) and OpenAI made with the GPT-5 family. Mistral arriving at the same conclusion is more interesting than the benchmark numbers themselves: the industry is converging on "one flagship that scales effort up or down" rather than separate SKUs per workload.

How does Medium 3.5 differ from Mistral Large 3?

This catches a lot of people the first time they look at Mistral's lineup. The short version: Large 3 has more total knowledge, Medium 3.5 is smarter per token activated.

	Mistral Large 3	Mistral Medium 3.5
Architecture	Sparse MoE	Dense
Total params	675B	128B
Active per token	41B	128B (all)
Context	262k	256k
Released	Dec 1, 2025	Apr 30, 2026
Reasoning mode	Separate Magistral SKU	Built-in, toggleable
Coding agent	Devstral 2	Built-in (replaces Devstral 2)
Input price	$0.50 / 1M tokens	$1.50 / 1M tokens
Output price	$1.50 / 1M tokens	$7.50 / 1M tokens
Open weights	Yes (Apache 2.0)	Yes (modified MIT)

The pricing inversion is the part everyone misses. Medium 3.5 costs roughly 4.5× more per token than Large 3 despite being the smaller-numbered SKU, because dense 128B is cheaper to train but harder to serve than a 41B-active MoE — every token has to do real work against all the weights. Mistral is signalling that Medium 3.5 is the new default for serious agentic work, with Large 3 sticking around for high-volume cheaper-per-token tasks.

What is Work Mode in Le Chat?

Work Mode is Mistral's name for what the rest of the industry has been calling "agents." It's currently in Preview (per Mistral's docs), rolling out in stages across the Free, Pro, and Team plans rather than generally available. In Le Chat, it turns the chatbox into something closer to a project assistant: a single prompt fans out into a multi-step plan, executes that plan across connected tools, and surfaces the results back to you with every tool call visible.

What it actually connects to as of the May 2026 rollout:

Email — read, draft, summarise threads
Calendar — check availability, prep for meetings, find slots
Documents — Google Drive, OneDrive, internal knowledge sources
Jira — pull tickets, create issues, update status
GitHub — open PRs, review diffs, query issues
Slack — messages, threads, channel context

Two design choices stand out. First, every tool call and reasoning step is visible — you see the agent's plan, what it's calling, what came back, and what it decided next. Second, sensitive actions require explicit approval: drafting an email is automatic, sending it is a confirm-click. Creating a Jira ticket is automatic, deleting one isn't. This is the same envelope Anthropic landed on with the agentic Claude releases and arguably the right default — the catastrophic agent failure modes are all "it took an irreversible action you didn't sanction."

Concrete workflow examples Mistral demonstrated at launch:

Monday catch-up: "What happened over the weekend?" pulls email, Slack, and Jira updates into a single brief.
Meeting prep: given a calendar invite, the agent pulls attendee context, recent emails, news on the attendees' company, and relevant docs.
Bug triage: paste a Jira ticket, the agent reads the linked PR, runs through the code, suggests root cause, drafts a comment.

None of this is unique to Mistral — ChatGPT, Claude, and Gemini all do versions of it. What's new is having a credible European option for teams that can't legally route work data through American servers.

How much does Le Chat cost across tiers?

Mistral consolidated their pricing in the all-new Le Chat launch. As of May 2026:

Tier	Price	What you get
Free	$0	~25 messages/day, basic models, no Work Mode
Pro	$14.99/month	All models (including Medium 3.5 + Large 3), Work Mode, reasoning mode, vision, image generation, larger context
Team	$24.99/user/month (or $19.99 annual)	Everything in Pro, shared libraries, admin controls, centralised billing, seat management
Enterprise	Custom (reportedly in the tens of thousands/month per third-party coverage; Mistral doesn't publish a number)	Data isolation, custom deployment (on-prem, private cloud, EU-hosted serverless), audit logging, SSO, dedicated support, optional fine-tuning

Important gotcha: Le Chat Pro does not include any API credits. The Mistral API and Le Chat are billed separately. If your team is using both the chat interface and the API in product code, you're paying for both lines.

For comparison: ChatGPT Plus is $20/month, Claude Pro is $20/month, Gemini Advanced is $19.99/month. Le Chat Pro at $14.99 is the cheapest of the four tier-one consumer plans — modestly so, but consistent with Mistral's positioning as the "sensible European choice."

What does the API actually cost?

For Mistral Medium 3.5 via the API:

Input: $1.50 per million tokens
Output: $7.50 per million tokens

That puts it in the same rough band as Claude Sonnet 4.7 ($3 / $15) and slightly above GPT-5.5-mini, but well below Claude Opus 4.7 ($15 / $75) or GPT-5.5 Pro tier. The 5× input-to-output ratio is unusual — most frontier models settle around 4× or 5× — but it makes Medium 3.5 cheap for long-context retrieval workloads and more expensive for high-output generation. Plan accordingly.

Mistral Large 3 stays cheaper at $0.50 / $1.50, which means for high-volume tasks where Medium 3.5's extra capability isn't load-bearing (classification, summarisation, simple extraction), Large 3 is the better economic choice. Mistral has been explicit that the two are meant to coexist in your stack rather than one replacing the other.

How does Mistral stack up against Claude, GPT, and Gemini?

For raw capability on English-language benchmarks, Mistral Medium 3.5 sits just below Claude Sonnet 4.7 and GPT-5.5 on most evaluations — close enough that for many practical tasks you wouldn't notice the difference, far enough that for the hardest reasoning problems you'd reach for Claude Opus 4.7 or GPT-5.5 Pro.

Where Mistral genuinely wins:

Open weights. Medium 3.5 ships under a modified MIT license. You can download it, run it on your own infrastructure, fine-tune it, audit it. No frontier US lab does this for their flagship. If "we cannot legally let our data leave our perimeter" is a real constraint, the conversation is short.
European data residency. Mistral is Paris-based and runs EU-native infrastructure. There's no CLOUD Act exposure by default — US intelligence agencies cannot compel data disclosure the way they can with American clouds. For GDPR-heavy industries (financial services, healthcare, public sector) this is a hard requirement, not a nice-to-have.
Multilingual at scale. French, German, Spanish, Italian, Dutch, Portuguese all perform measurably better on Mistral than on the American models, even after the latter's localisation pushes. If your product audience is primarily continental European, this is worth A/B testing.
Self-hosting economics. A 128B dense model is hostable on 4×H100 with reasonable quantisation. For high-volume internal workloads, self-hosted inference can land at a fraction of the per-token API rate — especially if you already have GPU capacity.

Where Mistral still lags:

Frontier reasoning. On the hardest math, planning, and multi-step reasoning problems, Claude Opus and GPT-5.5 Pro still have measurable margins.
Ecosystem. ChatGPT plugins, Claude Skills/MCP, Gemini's Workspace integration — the third-party ecosystem around the American labs is bigger and moving faster.
Coding agent maturity. Vibe (Mistral's coding CLI) is solid but the Cursor / Claude Code / GitHub Copilot ecosystem is more mature for day-to-day developer work. We cover this in detail in the AI coding agents pillar guide.

When should you pick Mistral?

Honest take from the Codersera side:

Pick Mistral if you're building for a European market, have GDPR or data-residency requirements that disqualify US clouds by default, need open weights for compliance or auditability, or want a credible self-host path for high-volume workloads. Also pick Mistral if your workload is primarily French/German/Spanish — the multilingual quality difference is real.
Skip Mistral if you're optimising for raw capability on the hardest reasoning tasks, you live inside the GitHub Copilot / Cursor / Claude Code ecosystem and your team's muscle memory is built around it, or you need the broadest third-party integration surface.
Hybrid is fine. Plenty of teams use Claude for the hardest tasks, Mistral for European-data work, GPT for fastest iteration on consumer-facing features. The frontier model space has settled into a portfolio of complementary strengths, not a one-winner race.

For the broader open-weight landscape and how Mistral fits relative to Llama 4, Qwen 3.5, DeepSeek V4, and the rest, see the open-source LLMs landscape pillar.

What is Mistral's broader strategy?

Read the April 2026 "European AI playbook" they published and the pattern is clear: Mistral isn't trying to out-scale OpenAI. They're trying to be the sovereign alternative — the lab that European governments, regulated industries, and sovereignty-conscious enterprises pick when American AI is politically or legally awkward.

The evidence is consistent. January 2026: framework agreement with France's Ministry of the Armed Forces to run Mistral models on French-controlled infrastructure. Spring 2026: deals with France and Germany for public-administration AI. The €15B European Investment Fund vehicle is partly an investment thesis around exactly this positioning. "Not American" is genuinely worth billions of dollars in the current geopolitical environment, and Mistral has built the operational story to back it up.

The Medium 3.5 + Work Mode + Vibe combination is the consumer- and developer-facing surface of that strategy. They need to be capable enough that picking Mistral isn't a sacrifice. The May 2026 release suggests they're close enough that the sovereignty premium — for the teams that need it — is now real.

FAQ

Is Mistral Medium 3.5 open source?

Open weights, not open source in the strict OSI sense. Medium 3.5 ships under a modified MIT license that allows free use, modification, and self-hosting up to a revenue threshold; high-revenue enterprises route through Mistral's paid channel. For the vast majority of users and teams the practical effect is "yes, you can download it, run it, fine-tune it."

Can I self-host Mistral Medium 3.5?

Yes. Weights are on Hugging Face, available via Ollama, and packaged for NVIDIA NIM. A 4×H100 setup with appropriate quantisation handles it comfortably; smaller setups work with more aggressive quantisation at some quality cost.

How does Work Mode compare to ChatGPT's agent mode or Claude's tool use?

Conceptually similar: multi-step planning, tool calls, visible reasoning, approval gates. The main differentiators are data residency (EU-hosted by default) and the connector list (heavier focus on the European enterprise stack alongside the usual GitHub/Slack/Jira). Capability-wise it's competitive, not category-leading.

Does Le Chat Pro include API credits?

No. Le Chat subscriptions and Mistral API usage are billed completely separately. If your team is using both, budget for both lines.

What's the difference between Medium 3.5 and Mistral Large 3?

Medium 3.5 is a dense 128B model with all parameters active per token; Large 3 is a sparse MoE with 41B active / 675B total. Medium 3.5 is more capable on agentic and reasoning tasks but ~4.5× more expensive per token. Large 3 is the better choice for high-volume cheaper-per-token workloads; Medium 3.5 for harder reasoning, coding, and agentic work.

Mistral has the strongest data-residency story of the major frontier labs by default: EU-native infrastructure, French parent company, no CLOUD Act exposure, optional on-prem and private-cloud deployments. For regulated workloads (financial services, healthcare, public sector) this materially simplifies the legal review compared to routing through US-headquartered providers.

Can Work Mode write code and open PRs autonomously?

Yes, via the GitHub connector. As with all agentic operations, sensitive actions (creating PRs, pushing to remote branches) require explicit approval. The remote coding agents in Vibe are the more powerful surface for long-running coding work — Work Mode in Le Chat handles the lighter PR-and-comment workflow.