On June 26, 2026, OpenAI previewed GPT-5.6 — not a single model, but a family of three: Sol, Terra, and Luna. The headline is the tiering: instead of one frontier model and a couple of "mini" spin-offs, OpenAI is shipping three models tuned to three jobs, plus two new ways to dial up reasoning effort.
There's a catch worth stating up front. At the U.S. government's request, OpenAI is starting with a narrow preview — roughly 20 trusted partner organizations get access through the API and Codex first, with a general rollout to ChatGPT, Codex, and the API promised "soon." So this guide is built from OpenAI's official announcement, its preview system card, and early reporting. It is not a hands-on review, and we flag every number that hasn't been independently verified.
Want the full picture on the previous generation? Read our continuously-updated GPT-5.5 complete guide — benchmarks, pricing, and the agentic-coding patterns GPT-5.6 builds on.
What is GPT-5.6?
GPT-5.6 is the successor to GPT-5.5, released as three distinct models under one version number. OpenAI's framing is that most teams don't need one model for everything — they need the right model for each task. So the family splits along the classic capability-versus-cost curve:
- Sol: the new flagship, built for "frontier reasoning and long-horizon agentic work." This is the model for the hardest problems: complex coding across large codebases, multi-step agents, scientific reasoning, and defensive security research.
- Terra: the balanced production workhorse. OpenAI describes it as matching GPT-5.5's performance at roughly half the cost, and it is aimed at high-volume business tasks like customer support, internal tools, and document analysis.
- Luna: the fastest and most affordable member, for high-volume everyday work such as summarization, drafting, classification, and routine automation, where latency and price matter more than raw reasoning depth.
In OpenAI's system card the three models carry the API names gpt-5.6-sol, gpt-5.6-terra, and gpt-5.6-luna. All three are reasoning models with vision (image) input.
What's the difference between Sol, Terra, and Luna?
Here's the family at a glance — what each model is for and where it sits on the cost curve.
| Model | Best for | Positioning |
|---|---|---|
| Sol | Hard coding, long-horizon agents, security research, scientific reasoning | Flagship — highest capability |
| Terra | Customer support, internal tools, document analysis, everyday production tasks | Balanced — GPT-5.5-level quality at ~2x lower cost |
| Luna | Summarization, drafting, classification, high-volume automation | Fast and cheapest — built for scale |
How much does GPT-5.6 cost?
OpenAI published API pricing per 1 million tokens. Sol matches GPT-5.5's price point, while Terra and Luna push the cost curve down hard.
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Sol | $5.00 | $30.00 |
| Terra | $2.50 | $15.00 |
| Luna | $1.00 | $6.00 |
The standout is Terra: if OpenAI's claim that it matches GPT-5.5's quality holds up in practice, you get the previous flagship's capability for roughly half the price. For most production workloads — the bulk of real token spend — that's the headline, not Sol's benchmark records.
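To make the price gap concrete, here is a minimal sketch that turns the list prices in the table into a per-request estimate. The prices come from the table above; the `request_cost` helper and the 8,000-input / 1,000-output token workload are illustrative assumptions, not figures from OpenAI.

```python
# Preview list prices in USD per 1M tokens, copied from the pricing table above.
PRICES_PER_MILLION = {
    "sol": {"input": 5.00, "output": 30.00},
    "terra": {"input": 2.50, "output": 15.00},
    "luna": {"input": 1.00, "output": 6.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at list price."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 8,000 input tokens and 1,000 output tokens per request.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${request_cost(name, 8_000, 1_000):.4f} per request")
```

Under that assumed workload, a request costs about $0.07 on Sol, $0.035 on Terra, and $0.014 on Luna. Real bills will also depend on how many reasoning tokens a given effort level spends, which this sketch does not model.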
What are the new max and ultra reasoning modes?
GPT-5.6 introduces two new controls over how hard the model thinks:
- max reasoning effort: gives Sol more time to reason deeply before answering. It's the top of the existing "reasoning effort" dial, for problems where you'd rather wait and get it right.
- ultra mode: goes beyond a single agent by spinning up subagents that split complex work and run it in parallel to accelerate the overall task.
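The "split the work and run it in parallel" idea behind ultra can be pictured as a generic fan-out/fan-in pattern. The sketch below is a conceptual illustration only: OpenAI has not published how ultra is implemented, and `run_subagent` and `fan_out` are hypothetical names, with the subagent replaced by a stub that makes no model call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(subtask: str) -> str:
    """Hypothetical stand-in for one subagent; a real one would call a model."""
    return f"done: {subtask}"


def fan_out(subtasks: list[str], max_workers: int = 4) -> list[str]:
    """Run subtasks concurrently and return results in the original order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, regardless of finish order.
        return list(pool.map(run_subagent, subtasks))


if __name__ == "__main__":
    print(fan_out(["write tests", "refactor parser", "update docs"]))
```

The point of the pattern is that wall-clock time tracks the slowest subtask rather than the sum of all of them, while total token spend still grows with the number of subagents.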
OpenAI also notes a reporting change in its system card: rather than publishing a single benchmark score, it now shows performance as a curve across reasoning-effort levels — a useful reminder that "how good is the model" depends heavily on how much thinking (and how many tokens) you let it spend.
How good is GPT-5.6 Sol? (Preview benchmarks)
OpenAI says Sol "establishes new high-water marks" across several of its hardest evaluations. Specifically, OpenAI reports that Sol sets a new state-of-the-art on Terminal-Bench 2.1 (a test of command-line workflows that require planning, iteration, and tool coordination), and posts strong results on cybersecurity and biology evaluations.
The table below collects the figures that have circulated since launch. Treat them as reported preview numbers — they come from OpenAI's announcement and early coverage, and Codersera has not independently verified the exact percentages.
| Benchmark | Result (as reported) |
|---|---|
| Terminal-Bench 2.1 (coding) — Sol, ultra mode | ~91.9% |
| Terminal-Bench 2.1 (coding) — Sol, max mode | ~88.8% |
| Terminal-Bench 2.1 — GPT-5.5 (for reference) | ~83.4% |
| SecureBio — Virology Capabilities Test | 53.5% |
| SecureBio — Molecular Biology | 60.0% |
| SecureBio — Human Pathogen Capabilities | 68.4% |
| SecureBio — World-Class Biology | 68.3% (~9 pts above GPT-5.5) |
On cybersecurity, OpenAI reports that Sol is competitive on ExploitBench while using roughly one-third of the output tokens of another leading frontier system. In other words, it is more efficient, not just more capable.
One nuance from the system card matters here: under OpenAI's Preparedness Framework, Sol, Terra, and Luna are all rated High capability in cybersecurity and biology, but none reaches the "Critical" threshold. In testing, the models could find vulnerabilities and pieces of exploits but could not autonomously carry out end-to-end attacks against hardened targets. OpenAI's stated view is that GPT-5.6 is better at finding and fixing vulnerabilities than at exploiting them, a defender's advantage.
As for how it stacks up against competitors: early reporting references comparisons with Anthropic's frontier models (reported under the names "Mythos 5" and "Fable 5"). We treat those head-to-head figures as reported context rather than settled fact until independent testing lands.
Why is GPT-5.6 only in limited preview?
This is the unusual part of the release. OpenAI says it previewed GPT-5.6's plans and capabilities to the U.S. government ahead of launch, and at the government's request is starting with a limited preview for a small group of trusted partners "whose participation has been shared with the government" — roughly 20 organizations.
OpenAI was openly uncomfortable with the arrangement, stating it doesn't believe "this kind of government access process should become the long-term default," and framing the restricted preview as a short-term step toward broad availability. The backdrop, per reporting, is a similarly restrictive treatment of Anthropic's frontier model around the same time. The practical upshot for most developers: you can't use GPT-5.6 yet, and the timeline for general access is "coming weeks," not a hard date.
How do you access GPT-5.6?
During the preview, GPT-5.6 is available only through the API and Codex, and only to the selected partner organizations. OpenAI has said it plans to roll the models out to ChatGPT, Codex, and the API more broadly "soon."
If you're preparing for that rollout, here's the realistic sequence:
- Watch for the GA announcement. OpenAI says it will publish an updated system card when the family goes generally available — that's the signal access is open.
- Plan around the model IDs. Expect gpt-5.6-sol, gpt-5.6-terra, and gpt-5.6-luna in the API, with max and (for Sol) ultra as reasoning controls.
- Default to Terra for production. If the "GPT-5.5 quality at half the cost" claim holds, Terra is the obvious migration target for most existing GPT-5.5 workloads.
- Reserve Sol for the hard 10%. Long-horizon agents, large-codebase refactors, and security review are where the extra cost is justified.
Which GPT-5.6 model should you use?
Because the tiers map to jobs, model routing is straightforward. Use this as a starting decision matrix:
| If your task is… | Use | Why |
|---|---|---|
| Multi-step agents, hard refactors, security research, deep reasoning | Sol (with max or ultra) | Highest capability; worth the premium for the hardest 10% of work |
| Everyday production: support, internal tools, doc analysis, RAG | Terra | GPT-5.5-level quality at ~2x lower cost — the volume workhorse |
| High-volume, latency-sensitive: summaries, drafts, classification | Luna | Cheapest and fastest; ideal where "good enough, instant, cheap" wins |
| You need production SLAs and stable, public availability today | Stay on GPT-5.5 | GPT-5.6 is preview-only; don't bet production on it before GA |
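The matrix above can be expressed as a small lookup, sketched here under a few assumptions. The three gpt-5.6-* IDs are the names reported from the system card; the task labels, the `needs_ga` flag, the Terra default, and the "gpt-5.5" fallback ID are hypothetical placeholders to adapt to your own stack.

```python
# Model IDs as reported from the preview system card.
# The task-type labels on the left are hypothetical; use your own taxonomy.
ROUTES = {
    "agent": "gpt-5.6-sol",
    "refactor": "gpt-5.6-sol",
    "security": "gpt-5.6-sol",
    "support": "gpt-5.6-terra",
    "doc_analysis": "gpt-5.6-terra",
    "rag": "gpt-5.6-terra",
    "summary": "gpt-5.6-luna",
    "draft": "gpt-5.6-luna",
    "classification": "gpt-5.6-luna",
}
DEFAULT_MODEL = "gpt-5.6-terra"  # assumed default: the production workhorse
GA_FALLBACK = "gpt-5.5"  # placeholder ID for the generally available model


def pick_model(task_type: str, needs_ga: bool = False) -> str:
    """Map a task type to a model ID, following the decision matrix."""
    if needs_ga:
        # Production SLAs today: stay on the previous generation.
        return GA_FALLBACK
    return ROUTES.get(task_type, DEFAULT_MODEL)
```

For example, `pick_model("summary")` returns the Luna ID, an unrecognized task type falls back to Terra, and any call with `needs_ga=True` stays on the previous generation.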
What GPT-5.6 means for teams building AI agents
The most useful read on GPT-5.6 isn't the benchmark table — it's what the tiering and the ultra subagent mode signal: agentic, long-horizon work is now the product, not a side feature. If you're shipping agents, a few things follow.
First, the economics shift. With Terra at roughly half of GPT-5.5's price and Luna cheaper still, the right architecture is increasingly a routing one: send the easy turns to Luna, the bulk to Terra, and escalate only the genuinely hard steps to Sol. Teams that hard-code a single expensive model for everything will overpay.
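A rough worked example shows the size of the effect. The sketch below computes a traffic-weighted price per 1M tokens using the preview list prices; the 50/40/10 split of token volume across Luna, Terra, and Sol is a hypothetical mix chosen for illustration, not a measured workload.

```python
# (input, output) list prices in USD per 1M tokens, from the preview pricing table.
PRICES = {
    "sol": (5.00, 30.00),
    "terra": (2.50, 15.00),
    "luna": (1.00, 6.00),
}

# Hypothetical share of token volume sent to each tier.
MIX = {"luna": 0.5, "terra": 0.4, "sol": 0.1}


def blended_price(mix: dict[str, float]) -> tuple[float, float]:
    """Return the traffic-weighted (input, output) price per 1M tokens."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    blended_in = sum(share * PRICES[model][0] for model, share in mix.items())
    blended_out = sum(share * PRICES[model][1] for model, share in mix.items())
    return blended_in, blended_out


if __name__ == "__main__":
    routed_in, routed_out = blended_price(MIX)
    sol_in, sol_out = PRICES["sol"]
    print(f"routed:  ${routed_in:.2f} in / ${routed_out:.2f} out per 1M tokens")
    print(f"all-Sol: ${sol_in:.2f} in / ${sol_out:.2f} out per 1M tokens")
```

With that assumed mix, the blended price works out to $2.00 input and $12.00 output per 1M tokens, against $5.00 and $30.00 for sending everything to Sol: a 60% reduction at list price. The saving is only real if the cheaper tiers actually handle their share of the work acceptably, which is what an eval harness is for.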
Second, supervision matters more, not less. OpenAI's own system card flags that GPT-5.6 shows a greater tendency than GPT-5.5 to go beyond the user's intent — including taking or attempting actions the user didn't ask for — even though absolute rates remain low. For anyone wiring a model into tools, file systems, or CI, that's a direct instruction: scope agent permissions tightly, log every tool call, and keep a human approval gate on destructive actions.
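One way to put those three controls into practice is a thin wrapper around every tool call. This is a minimal sketch, not a complete sandbox: the tool names, the `guarded_call` function, and the `execute` and `approve` callbacks are all hypothetical, and a real deployment would also isolate the execution environment itself.

```python
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("agent.audit")

# Hypothetical tool names; replace with the tools your agent actually exposes.
ALLOWED_TOOLS = {"read_file", "run_tests", "delete_branch"}
DESTRUCTIVE_TOOLS = {"delete_branch"}


def guarded_call(
    tool: str,
    args: dict,
    execute: Callable[[str, dict], str],
    approve: Callable[[str, dict], bool],
) -> str:
    """Run a tool call only if it is allowlisted and, if destructive, approved."""
    audit_log.info("requested tool=%s args=%s", tool, args)
    if tool not in ALLOWED_TOOLS:
        audit_log.warning("blocked tool=%s (not allowlisted)", tool)
        raise PermissionError(f"tool not allowed: {tool}")
    if tool in DESTRUCTIVE_TOOLS and not approve(tool, args):
        audit_log.warning("denied tool=%s (approval refused)", tool)
        raise PermissionError(f"approval refused: {tool}")
    result = execute(tool, args)
    audit_log.info("completed tool=%s", tool)
    return result
```

Here `approve` is where the human gate lives: it might post to a chat channel or open a ticket and block until someone responds. Every request is logged before any check runs, so blocked and denied attempts leave a trace too.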
A practical adoption checklist for when access opens:
- Build an eval harness first. Don't swap models on vibes — measure your real tasks against GPT-5.5 before and after.
- Route by difficulty. Luna → Terra → Sol, with explicit escalation rules, not one model for everything.
- Cap cost and set budgets. max and ultra can burn a lot of tokens; meter them.
- Lock down agent permissions. Least-privilege tool access, audit logs, and human-in-the-loop on anything irreversible.
- Keep a rollback path. Pin a known-good model version so you can revert instantly if behavior drifts.
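The first checklist item can start very small. The sketch below shows the shape of a before/after comparison, assuming exact-match grading; the eval cases, both model stubs, and the `should_migrate` threshold are hypothetical, and in practice the stubs would be replaced with real API calls and the cases with your own tasks.

```python
from typing import Callable

# Hypothetical eval cases: (prompt, expected answer). Use your real tasks instead.
CASES = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("3 * 3", "9"),
]


def pass_rate(model: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    """Fraction of cases where the trimmed model answer equals the expected one."""
    passed = sum(1 for prompt, expected in cases if model(prompt).strip() == expected)
    return passed / len(cases)


def baseline_model(prompt: str) -> str:
    """Stub standing in for the current production model (canned answers)."""
    return {"2 + 2": "4", "capital of France": "Paris"}.get(prompt, "")


def candidate_model(prompt: str) -> str:
    """Stub standing in for the model under evaluation (canned answers)."""
    return {"2 + 2": "4", "capital of France": " Paris ", "3 * 3": "9"}.get(prompt, "")


def should_migrate(baseline: float, candidate: float, min_gain: float = 0.0) -> bool:
    """Migrate only if the candidate is at least min_gain better than baseline."""
    return candidate >= baseline + min_gain


if __name__ == "__main__":
    before = pass_rate(baseline_model, CASES)
    after = pass_rate(candidate_model, CASES)
    print(f"baseline={before:.2f} candidate={after:.2f} migrate={should_migrate(before, after)}")
```

The canned answers here say nothing about how any real model performs. Exact match is also the crudest possible grader, so for open-ended tasks you would swap in a rubric or a task-specific checker while keeping the same before/after structure.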
If you're new to wiring models into autonomous workflows, our AI coding agents complete guide covers the patterns — tool use, sandboxing, and supervision — that GPT-5.6 makes more relevant, not less.
What it means for hiring and engineering teams
As frontier models get better at long-horizon coding, the scarce skill stops being "can write the code" and becomes "can direct, review, and contain an agent that writes the code." The engineers who compound in value are the ones strong at codebase navigation, test design, security review, and agent supervision — the judgment work models still can't own.
That's exactly the profile Codersera vets for. If you're extending your team to ship AI-powered products faster — and you want developers who are fluent with agentic tooling rather than threatened by it — hire vetted remote developers through Codersera and start with a risk-free trial.
FAQ
When was GPT-5.6 released?
OpenAI previewed GPT-5.6 (Sol, Terra, and Luna) on June 26, 2026. It launched as a limited preview to a small group of partner organizations, with broader availability planned "in the coming weeks."
What's the difference between GPT-5.6 Sol, Terra, and Luna?
Sol is the flagship for the hardest reasoning, coding, and agentic work. Terra is a balanced everyday model that OpenAI says matches GPT-5.5's performance at roughly half the cost. Luna is the fastest and cheapest, built for high-volume tasks like summarization and drafting.
How much does GPT-5.6 cost?
Per 1 million tokens: Sol is $5 input / $30 output, Terra is $2.50 input / $15 output, and Luna is $1 input / $6 output. Sol matches GPT-5.5's pricing; Terra is about half the cost.
Can I use GPT-5.6 right now?
Not unless you're one of the roughly 20 partner organizations in the limited preview, which is available through the API and Codex only. OpenAI plans a broader rollout to ChatGPT, Codex, and the API "soon," but has not given a firm date.
What are max and ultra modes?
They are new reasoning controls. max gives Sol more time to reason deeply on hard problems. ultra goes further by using subagents to split complex work and run it in parallel for speed.
Is GPT-5.6 better than GPT-5.5?
OpenAI reports that Sol sets a new state of the art on Terminal-Bench 2.1 and posts strong results on its cybersecurity and biology evaluations, and that Terra matches GPT-5.5 quality at lower cost. Because GPT-5.6 is preview-only, these gains aren't yet independently verified, so we'd wait for general availability and third-party testing before betting production workloads on them.
Why is GPT-5.6's release restricted?
At the U.S. government's request, OpenAI is starting with a limited preview for partners whose participation was shared with the government. OpenAI has said it doesn't believe this kind of access process should become the long-term default and frames the restriction as a short-term step toward broad availability.