Tag

AI

A collection of 336 posts

AI

Grok 4.3: xAI's Cheap Frontier Model (May 2026 Guide)

xAI's Grok 4.3 lands with a 1M token context window, native video input, and aggressive pricing at $1.25 input / $2.50 output per million tokens. Here is what changed from Grok 4.20, how it benchmarks against Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro, and when it is the right tool to reach for.

· 7 min read
AI

Anthropic Mythos: Complete Guide (2026)

Anthropic Mythos is the frontier preview model unveiled April 7, 2026: stronger than Opus 4.7 on math and security, withheld from public release, shipped only via Project Glasswing to ~50 defensive-security partners.

· 11 min read
AI

Claude Mythos vs Opus 4.7 vs GPT-5.5 (2026)

Claude Mythos, Opus 4.7, and GPT-5.5 shipped within three weeks of each other in April 2026. We break down which frontier model wins on coding, reasoning, vision, cost, and which one your team should actually pick.

· 10 min read
AI

Qwen 3.7 Max: Alibaba's May 2026 Flagship Guide

Alibaba's Qwen 3.7 Max launched May 20, 2026 with a 1M-token context, native extended-thinking mode, and benchmark wins on SWE-Pro and Terminal-Bench. Here's how it compares to Claude Opus 4.7, GPT-5.5, Gemini 3.5 Flash and DeepSeek V4, what it costs on DashScope, and when to pick it.

· 10 min read
AI

Gemini 3.5 Flash + Gemini Spark: Google I/O 2026

Google dropped Gemini 3.5 Flash and Gemini Spark at I/O 2026. A frontier-grade Flash model that outruns 3.1 Pro, and a persistent personal agent built on top of it. Here's what shipped, what's rumored, and where it fits next to Claude Opus 4.7 and GPT-5.5.

· 9 min read
AI

AI Model Releases — May 2026 Roundup

A practitioner's roundup of every AI model release that mattered in May 2026 — Anthropic Mythos, Gemini 3.5 Flash, Qwen 3.7 Max, Mistral Medium 3.5, ERNIE 5.1, and Subquadratic's 12M-token SubQ. Benchmarks, pricing, availability, and what to actually use.

· 14 min read
AI

Baidu ERNIE 5.1: Chinese LLM Cracks Global Top 5

Baidu's ERNIE 5.1, released May 8 2026, became the first Chinese LLM in the global Search Arena top 5. Here's what it does, how it compares to DeepSeek V4 and Qwen, and when teams outside China should actually use it.

· 8 min read
AI

Mistral Medium 3.5 + Le Chat Work Mode (May 2026)

Mistral Medium 3.5 is a 128B dense model with built-in reasoning, coding, and agentic capabilities. Le Chat Work Mode turns it into a multi-tool agent. Here's what's new, what it costs, and when to actually pick Mistral over Claude or GPT.

· 9 min read
AI

Manus AI in 2026: Meta Block, Desktop App, What's Next

Manus AI in May 2026: the Meta $2B acquisition was blocked by China's state planner on April 27, the Desktop app's My Computer feature puts the agent on your local machine, and pricing, access, and the autonomous-agent landscape sit where they sit.

· 10 min read
AI

SubQ: Miami's 12M Token Context Window (May 2026)

Subquadratic, a Miami startup, launched SubQ on May 5, 2026 — the first frontier LLM with a 12-million-token context window, built on a new Subquadratic Selective Attention (SSA) architecture. Here's what's verified, what's plausible, and what to actually do with it.

· 9 min read
2026

How to Use the DeepSeek V4 API: Developer Guide (2026)

Quick answer. The DeepSeek V4 API is OpenAI-compatible. Point any OpenAI SDK at https://api.deepseek.com, set your key, and call deepseek-v4-pro (top reasoning/agentic) or deepseek-v4-flash (cheap, fast). Minimal request: curl https://api.deepseek.com/chat/completions -H "Authorization: Bearer $DEEPSEEK_API_KEY" -H "Content-Type:

· 10 min read
2026

DeepSeek V4 vs Claude vs GPT-5: Which AI Coding Model Should Developers Use in 2026?

Quick answer. For pure SWE-bench Pro top score and 1M-context agentic coding, pick Claude Opus 4.7. For longest-horizon swarm runs, pick Kimi K2.6 — open-weight and roughly 8x cheaper. For broad reasoning + Codex/CLI tooling, GPT-5.5. For commodity-priced inference at frontier-adjacent quality, DeepSeek V4 Pro. Choose per workload,

· 13 min read