Tag

LLM

A collection of 92 posts

AI

How to Run GLM-5.2 Locally — Hardware, Quants, and Setup

A practical walkthrough for self-hosting GLM-5.2 (744B MoE, 40B active) on llama.cpp. Quant tables, four hardware paths, exact install commands, verification, and a fallback to the Z.ai cloud API if your rig falls short.

· 8 min read
AI

VibeThinker-3B: The Complete Guide (2026)

VibeThinker-3B is WeiboAI's MIT-licensed 3B reasoning model built on Qwen2.5-Coder-3B. We unpack the viral 'Opus 4.5 performance' claim with the actual HF benchmarks.

· 8 min read
AI

GLM-5.2 complete guide (2026)

Z.ai's GLM-5.2 is the leading open-weights LLM on the Artificial Analysis Intelligence Index v4.1. 744B params (40B active), 1M-token context, MIT-licensed weights. Architecture, benchmarks, pricing, and a 3-path local-inference playbook.

· 13 min read
AI

Kimi K2.7 vs DeepSeek V4: The Open-Weights Coding Showdown (2026)

Two open-weights heavyweights from China go head-to-head for the agentic-coding throne. K2.7 leads on MCP tool-use depth; V4 leads on raw per-token economics and independently verified benchmarks. We break down cost, agentic strength, self-host paths, and pick a winner per workload.

· 6 min read
AI

GLM 5.2 vs DeepSeek V4: The Open-Weights Coding Showdown (2026)

Two open-weights heavyweights from China go head-to-head for the agentic-coding throne. GLM 5.2 leads on context window; DeepSeek V4 leads on token economics. We break down cost, agentic strength, self-host paths, and pick a winner per workload.

· 10 min read
Kimi

Kimi K2.7 vs GPT-5.5 vs Claude Opus 4.8: Coding & Agentic Comparison (2026)

How Moonshot's open-weight Kimi K2.7 Code stacks up against Claude Opus 4.8, GPT-5.5, and DeepSeek V4 for agentic coding — on price, context, and the benchmarks that exist. K2.7's scores are Moonshot-reported only, so the verdict is subject to change once independent results land.

· 8 min read
AI Models

Kimi K2.6 vs GPT-5.5 vs Claude Opus 4.8 (2026)

A practical 2026 comparison of Kimi K2.6, GPT-5.5, and Claude Opus 4.8 on coding benchmarks, reasoning, pricing, and self-host economics — plus which to pick by use case.

· 7 min read
AI

Claude Opus 4.8 Launch Guide: Benchmarks & Pricing 2026

Anthropic launched Claude Opus 4.8 on May 28, 2026: SWE-bench Pro 69.2%, GDPval Elo 1890 (+121 over GPT-5.5), Fast mode 3x cheaper than 4.7, and dynamic workflows for hundreds of parallel subagents. Pricing is unchanged at $5/$25 per 1M tokens. Full launch breakdown.

· 12 min read