Open Source LLMs - Codersera Blogs

MiniMax M3

How to Run MiniMax M3 Locally: Hardware, Quants & Setup (2026)

A practical, honest guide to running MiniMax M3 (428B MoE) locally: VRAM/RAM math, quant options, and the Ollama, vLLM, and LM Studio paths.

20 Jul 2026 · 6 min read

AI

DiffusionGemma 26B-A4B: Google’s First Open Text-Diffusion Model

DiffusionGemma 26B-A4B is Google’s first open-weight text-diffusion LLM — a 25.2B MoE built on Gemma 4 that generates text in parallel for up to 4x faster output.

03 Jul 2026 · 4 min read

AI

Cohere North Mini Code 1.0: Open 30B Coding Model Guide

Cohere North Mini Code 1.0 is an open-weight 30B MoE coding model (3B active, 256K context, Apache 2.0) built for agentic software engineering. Specs, benchmarks, access.

03 Jul 2026 · 4 min read

Qwen

Qwen 3.7 vs Kimi K2.7: Best Open Agentic Coder in 2026?

Qwen3.7-Max is the closed flagship with higher vendor benchmarks and a 1M context; Kimi K2.7 Code is the open-weights, cheaper agentic specialist. We compare access, benchmarks, pricing, local feasibility, and which to use for autonomous coding.

30 Jun 2026 · 12 min read

Ornith

Ornith 1.0 vs Claude Opus 4.8 for Coding (2026)

Ornith 1.0 is a free, MIT-licensed, self-hostable coding model. Opus 4.8 is the closed frontier flagship. A benchmark-grounded, harness-honest comparison of where each wins on agentic coding in 2026.

30 Jun 2026 · 12 min read

Qwen

Qwen 3.6 27B as a Local Claude Code Replacement

A realistic, no-hype guide to running Qwen 3.6 27B locally as a Claude Code alternative: benchmarks vs Opus, the hardware and quant you need, how to wire it in, where it holds up, and what a hybrid setup actually saves you.

30 Jun 2026 · 14 min read

DeepSeek V4

Qwen 3.7 vs DeepSeek V4: Best Open Coding Model in 2026?

DeepSeek V4 and Qwen 3.7 post near-identical coding benchmarks, but only one is actually open. A specifics-first comparison of architecture, local-run feasibility, API pricing, and license for developers choosing a coding model in 2026.

30 Jun 2026 · 12 min read

Open Source LLMs

Ornith 1.0 vs GLM 5.2: Best Open Coding Model in 2026?

Two new MIT open-weights coding models shipped a day apart in June 2026. We compare architecture, coding benchmarks, local hardware, and API pricing for Ornith 1.0 vs GLM 5.2 — with an honest, no-hype verdict on which to pick.

30 Jun 2026 · 15 min read

GLM

GLM 5.2 Just Launched: 1M Context, Coding-First, Open Weights Next Week (Day-One Brief)

Zhipu Z.ai shipped GLM 5.2 today on every GLM Coding Plan tier with a usable 1M-token context window. Standalone API, the Z.ai chatbot, and the MIT open weights are arriving next week. No benchmarks yet — here's what's confirmed, what's not, and how it fits next to GLM-5.1.

13 Jun 2026 · 9 min read

Open Source LLMs

Mellum2: JetBrains' 12B MoE Code Model, Explained for Developers

JetBrains released Mellum2, a 12B Mixture-of-Experts model that activates just 2.5B parameters per token and ships under Apache 2.0. Here's what it is, where it fits in an AI stack, and how to put it to work.

06 Jun 2026 · 7 min read

Gemma 4

Gemma 4 vs Qwen 3.5: Open LLM Comparison (2026)

A practical, size-tier-by-tier comparison of Google's Gemma 4 and Alibaba's Qwen 3.5 — benchmarks, coding, reasoning, multilingual, and how to run each locally in 2026.

03 Jun 2026 · 8 min read

Qwen

How to Run Qwen 3.7 Locally: The Honest 2026 Answer

As of May 20, 2026, Qwen 3.7 weights are not on Hugging Face. Here are the honest ways to use it today, and exactly what to run locally instead.

20 May 2026 · 8 min read

Qwen

Qwen 3.7 vs Qwen 3.6: What's Actually Different (May 2026)

Qwen 3.6 is shipping with open weights today. Qwen 3.7-Max was announced May 20 with previews live but no weights yet. A grounded side-by-side.

20 May 2026 · 11 min read

MiniMax

MiniMax M3: First Open-Weights Reasoning + Agent Model (2026)

MiniMax M3 is not released as of May 2026. Here's what's actually shipping (M2.7), where the 'M3.0 released' claim came from, and how to verify it.

19 May 2026 · 7 min read

Qwen

Qwen 3.7: Release Date, Status, and What's Real vs Rumored (2026)

Is Qwen 3.7 released? As of May 2026 it isn't — no weights, API, or benchmarks. Here's what's real, what's only rumored, and what to run today.

19 May 2026 · 15 min read

SubQ

SubQ Explained: The First 12M-Token Subquadratic LLM (2026)

SubQ claims to be the first fully subquadratic LLM with a 12M-token context window. Here's what's verified, what isn't, and why the architecture matters.

18 May 2026 · 9 min read

ZAYA1

ZAYA1-8B: The 8B Reasoning Model Trained Entirely on AMD (2026)

ZAYA1-8B is an Apache-2.0 MoE reasoning model with 760M active params, pretrained 100% on AMD MI300X GPUs with zero NVIDIA in the loop.

18 May 2026 · 9 min read

Open Source LLMs

Kimi K2.6 vs DeepSeek V4 vs GLM-5.1: The Open-Weights Coding Verdict (2026)

Kimi K2.6 vs DeepSeek V4 vs GLM-5.1 for coding in 2026 — sourced benchmarks, real cost-per-task, self-host and license comparison, plus a clear pick-X-if decision block.

18 May 2026 · 11 min read

DeepSeek

DeepSeek V4 VRAM & GPU Requirements: Pro vs Flash, Every Quantization (2026)

A single tiered table for DeepSeek V4 Pro vs Flash VRAM at every quantization (FP8, FP4+FP8, INT4, Q2) — triangulated across official and community sources.

18 May 2026 · 8 min read

OmniCoder 9B

OmniCoder 9B: Benchmarks, GGUF Quants, and Local Setup Guide (2026)

What OmniCoder 9B is, its lineage and license, vendor-reported benchmarks, the full GGUF quant table, and step-by-step Ollama and llama.cpp setup.

18 May 2026 · 11 min read

Qwen

How to Run Qwen 3.6 Locally: 27B Dense vs 35B MoE (2026 Guide)

Run Qwen 3.6 locally: 27B dense vs 35B-A3B MoE explained, VRAM tables per quant, and copy-paste Ollama, llama.cpp, vLLM, and MLX commands.

18 May 2026 · 10 min read

Ring 2.6

Ring-2.6-1T: Ant Group's Open Trillion-Parameter Reasoning Model (Benchmarks, How It Compares, Can You Run It)

Ant Group's inclusionAI shipped Ring-2.6-1T, a trillion-parameter open-weights reasoning MoE. What it is, the vendor benchmarks, how it stacks up against Kimi K2.6 and DeepSeek V4, and whether you can run it.

15 May 2026 · 11 min read