MiniMax M3 (M3.0): Release Date, Status, and What's Real vs Rumored (2026)
MiniMax M3 is not released as of May 2026. Here's what's actually shipping (M2.7), where the 'M3.0 released' claim came from, and how to verify it.
A collection of 9 posts
Is Qwen 3.7 released? As of May 2026 it isn't — no weights, API, or benchmarks. Here's what's real, what's only rumored, and what to run today.
SubQ claims to be the first fully subquadratic LLM with a 12M-token context window. Here's what's verified, what isn't, and why the architecture matters.
ZAYA1-8B is an Apache-2.0 MoE reasoning model with 760M active params, pretrained 100% on AMD MI300X GPUs with zero NVIDIA in the loop.
Kimi K2.6 vs DeepSeek V4 vs GLM-5.1 for coding in 2026 — sourced benchmarks, real cost-per-task, self-host and license comparison, plus a clear pick-X-if decision block.
A single tiered table for DeepSeek V4 Pro vs Flash VRAM at every quantization (FP8, FP4+FP8, INT4, Q2) — triangulated across official and community sources.
What OmniCoder 9B is, its lineage and license, vendor-reported benchmarks, the full GGUF quant table, and step-by-step Ollama and llama.cpp setup.
Run Qwen 3.6 locally: 27B dense vs 35B-A3B MoE explained, VRAM tables per quant, and copy-paste Ollama, llama.cpp, vLLM, and MLX commands.
Ant Group's inclusionAI shipped Ring-2.6-1T, a trillion-parameter open-weights reasoning MoE. What it is, the vendor benchmarks, how it stacks up against Kimi K2.6 and DeepSeek V4, and whether you can run it.