Run YOLOv12 (and YOLO26) on macOS: 2026 Install Guide
Last updated April 2026 — refreshed for current model/tool versions.
This guide walks through installing and running YOLOv12 on macOS in 2026 — the attention-centric detector released for NeurIPS 2025 — and shows the cleaner Ultralytics path through YOLO26 (released January 14, 2026), which most production teams should now prefer. You get exact commands for Apple Silicon (M1–M4), CPU-only Intel Macs, and the new MLX-native path that beats PyTorch MPS by 1.1×–2.6× on M4 Pro.
What changed in 2026 (read this before you copy-paste old YOLOv12 commands):
- YOLO26 is the new flagship. Ultralytics released YOLO26 on January 14, 2026 with native end-to-end inference: no NMS, no DFL, simpler export. The nano variant runs ~31% faster on CPU than YOLO11n (38.9 ms vs 56.1 ms ONNX) and Ultralytics quotes up to 43% faster CPU inference across the family.
- YOLOv12 is now research-tier, not production-tier. Ultralytics' own docs explicitly recommend YOLO11 or YOLO26 for production workloads; YOLOv12's attention blocks cause higher memory use, training instability, and slower CPU throughput.
- The old YOLOv12 install command is wrong. There is no github.com/ultralytics/yolov12.git repo. The actual YOLOv12 reference repo is github.com/sunsmarterjie/yolov12 (the NeurIPS 2025 paper authors). The 2025 version of this guide had the URL wrong.
- MLX is now the fastest YOLO path on Apple Silicon. The community YOLO26-MLX port runs natively on M-series GPUs (no PyTorch MPS), reaching 124.9 FPS on M4 Pro for the nano variant.
- FlashAttention is NVIDIA-only. YOLOv12's reference repo lists flash-attn as a dependency, and flash-attn requires an NVIDIA Turing/Ampere/Ada/Hopper GPU. On macOS you must install YOLOv12 without flash-attn or the install fails.
TL;DR — which YOLO should I run on my Mac in 2026?
| Use case | Recommended model | Why |
|---|---|---|
| Production app, real-time inference | YOLO26 (via Ultralytics) | NMS-free, up to 43% faster on CPU, clean CoreML/ONNX export |
| Research / paper reproduction | YOLOv12 (sunsmarterjie repo) | Original attention-centric architecture, NeurIPS 2025 weights |
| Apple Silicon, max throughput, no PyTorch | YOLO26-MLX | 1.1×–2.6× faster than PyTorch MPS on M4 Pro |
| Stable workhorse, broadest compatibility | YOLO11 | Mature, fewer surprises, still in active support |
| Edge / mobile / iOS deployment | YOLO26n via CoreML | DFL removal makes CoreML export straightforward |
If you're shipping vision into a product right now, jump to the YOLO26 install. If you specifically need the attention-centric YOLOv12 architecture (e.g. you're benchmarking against the paper), follow the YOLOv12 install.
System requirements and prerequisites
- macOS: 13 (Ventura) or newer. macOS 14 (Sonoma) or 15 (Sequoia) recommended for stable PyTorch MPS support. macOS 26 (Tahoe) currently has known PyTorch MPS regressions tracked in pytorch/pytorch#167679 — if you're on Tahoe and MPS won't initialise, fall back to CPU or use YOLO26-MLX.
- Hardware: Apple Silicon (M1, M2, M3, or M4) is strongly preferred. Intel Macs work but you're CPU-only — expect ~5–10× slower inference than M-series GPUs.
- Python: 3.10 or 3.11 (3.12+ also works for Ultralytics; the YOLOv12 reference repo pins 3.11).
- PyTorch: 2.8 or newer for clean MPS support. PyTorch 2.10 (released April 2026) and 2.11 add further MPS operator coverage.
- Disk: ~3 GB free for the env + COCO-pretrained weights (n through x).
- Xcode Command Line Tools: required for native builds. Install with xcode-select --install.
- Homebrew: optional but useful for managing Python and OpenCV. Install from brew.sh.
Verify Python and pip:
python3 --version # expect 3.10.x or 3.11.x
pip3 --version
Install path A — YOLO26 via Ultralytics (recommended)
This is the path most readers should pick. Ultralytics ships YOLO26 through the same ultralytics Python package that hosts YOLO11, YOLOv8, and (as a community release) YOLOv12. The pip package is the official distribution.
1. Create a clean environment
python3 -m venv ~/venvs/yolo26
source ~/venvs/yolo26/bin/activate
pip install -U pip wheel
Conda also works:
conda create -n yolo26 python=3.11 -y
conda activate yolo26
2. Install PyTorch with MPS support
Apple Silicon Macs get GPU acceleration via PyTorch's MPS backend. The default pip install torch already includes MPS — no special wheel needed since PyTorch 2.0.
pip install torch torchvision
Verify MPS is reachable:
python3 -c "import torch; print('MPS available:', torch.backends.mps.is_available())"
Expected output: MPS available: True. If you see False, you're either on Intel hardware (CPU-only is fine) or hitting the macOS 26 regression; use CPU for now.
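The True/False check above can be folded into a small helper so that scripts choose a device at run time instead of hard-coding it. This is a minimal sketch: pick_device is a hypothetical helper name, not part of PyTorch or Ultralytics, and it treats a missing torch install as CPU-only.

```python
def pick_device() -> str:
    """Return "mps" when PyTorch reports a usable Metal backend, else "cpu".

    pick_device is a hypothetical helper name used for illustration.
    """
    try:
        import torch  # imported lazily so the helper also runs without PyTorch
    except ImportError:
        return "cpu"

    # is_available() is False on Intel Macs and whenever the MPS backend
    # cannot initialise on the current macOS / PyTorch combination.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print("Using device:", pick_device())
```

The returned string can be passed wherever this guide uses a device argument, for example device=pick_device() in a Python predict call.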
3. Install Ultralytics
pip install -U ultralytics
This pulls in OpenCV, NumPy, Pillow, PyYAML, and the rest. No separate brew install opencv is needed.
4. Run a sanity-check prediction
yolo predict model=yolo26n.pt source='https://ultralytics.com/images/bus.jpg' device=mps
Or in Python:
from ultralytics import YOLO
model = YOLO("yolo26n.pt") # auto-downloads on first run
results = model("path/to/image.jpg", device="mps")
results[0].save("out.jpg")
Replace device="mps" with device="cpu" on Intel Macs.
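Saving an annotated image is usually only the first step; most scripts also want the detections as data. The sketch below counts detections per class from one result object. It assumes the result exposes boxes.cls, boxes.conf, and names the way Ultralytics result objects do; count_detections and its min_conf default are illustrative choices, not part of the Ultralytics API.

```python
from collections import Counter


def count_detections(result, min_conf: float = 0.25) -> Counter:
    """Count detections per class name in a single result object.

    Assumes result.boxes.cls and result.boxes.conf are tensor-like (they
    support .tolist()) and result.names maps integer class ids to names.
    """
    class_ids = result.boxes.cls.tolist()
    scores = result.boxes.conf.tolist()

    counts = Counter()
    for class_id, score in zip(class_ids, scores):
        # Skip low-confidence boxes; the threshold is an illustrative default.
        if score >= min_conf:
            counts[result.names[int(class_id)]] += 1
    return counts
```

With the model loaded as in the snippet above, count_detections(results[0]) returns a Counter keyed by class name.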
5. Common YOLO26 commands on macOS
# Image
yolo predict model=yolo26n.pt source=image.jpg device=mps
# Video file
yolo predict model=yolo26s.pt source=clip.mp4 device=mps
# Webcam (FaceTime / iSight)
yolo predict model=yolo26n.pt source=0 device=mps show=True
# Save labels + confidences
yolo predict model=yolo26n.pt source=image.jpg save_txt=True save_conf=True
# Batch over a directory
yolo predict model=yolo26n.pt source=./photos/ device=mps
6. Export to CoreML for iOS / on-device shipping
YOLO26's NMS-free design and DFL removal make CoreML export materially cleaner than YOLOv12. One line:
yolo export model=yolo26n.pt format=coreml
You'll get a .mlpackage ready to drop into Xcode. ONNX (format=onnx) and TFLite (format=tflite) work the same way.
Install path B — YOLOv12 (research / paper reproduction)
If you specifically need the YOLOv12 architecture from the NeurIPS 2025 paper (area attention, R-ELAN, the original weights), use the official reference repo. The Ultralytics package also exposes YOLOv12 through the same YOLO("yolo12n.pt") interface if you don't need the research codebase.
Option B1 — through Ultralytics (simpler)
pip install -U ultralytics
yolo predict model=yolo12n.pt source=image.jpg device=mps
Available variants: yolo12n.pt, yolo12s.pt, yolo12m.pt, yolo12l.pt, yolo12x.pt for detection. Segmentation/pose/classification/OBB exist as YAML configs only; no pretrained weights are published for them.
Option B2 — through the reference repo (paper reproduction)
The original repo lives at github.com/sunsmarterjie/yolov12. Important: their published install command lists flash-attn in the conda environment, which requires an NVIDIA GPU (Turing/Ampere/Ada/Hopper) and will not build on macOS. Strip it out:
# macOS-friendly variant of the reference install
conda create -n yolov12 python=3.11 -y
conda activate yolov12
pip install supervision
git clone https://github.com/sunsmarterjie/yolov12.git
cd yolov12
# Edit requirements.txt and remove the line `flash-attn==...` before this step
pip install -r requirements.txt
pip install -e .
Test:
from ultralytics import YOLO
model = YOLO("yolov12n.pt")
results = model("ultralytics/assets/bus.jpg", device="mps")
results[0].save("out.jpg")
Without FlashAttention, attention falls back to PyTorch's built-in scaled-dot-product attention (SDPA). Inference still works; throughput drops vs. NVIDIA. Roboflow maintains an SDPA-only fork if you prefer not to edit the requirements file.
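The manual requirements.txt edit in the install steps above can also be scripted, which helps when the environment is rebuilt often. This is a minimal sketch, assuming the dependency appears on its own line under either spelling (flash-attn or flash_attn); strip_flash_attn and patch_requirements are hypothetical helper names, and any line mentioning the package, including a comment, is dropped.

```python
import re
from pathlib import Path

# Matches the package under either spelling, including wheel filenames or
# URLs that contain it.
_FLASH_ATTN = re.compile(r"flash[-_]attn", re.IGNORECASE)


def strip_flash_attn(text: str) -> str:
    """Return requirements text with every flash-attn line removed."""
    kept = [line for line in text.splitlines() if not _FLASH_ATTN.search(line)]
    return "\n".join(kept) + "\n"


def patch_requirements(path: str = "requirements.txt") -> None:
    """Rewrite the file in place, keeping a .bak copy of the original."""
    req = Path(path)
    original = req.read_text()
    req.with_name(req.name + ".bak").write_text(original)
    req.write_text(strip_flash_attn(original))
```

Call patch_requirements() from inside the cloned yolov12 directory before running pip install -r requirements.txt; the original file stays available as requirements.txt.bak.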
Install path C — YOLO26-MLX (Apple Silicon, max throughput)
For Apple Silicon users who want every last frame per second, the community YOLO26-MLX port skips PyTorch entirely and runs against Apple's MLX framework. webAI's benchmarks on M4 Pro show 1.1×–2.6× faster inference than PyTorch MPS, with accuracy within 0.5% of the official PyTorch weights.
python3 -m venv ~/venvs/yolo26-mlx
source ~/venvs/yolo26-mlx/bin/activate
pip install -U pip mlx
git clone https://github.com/webai-com/yolo26-mlx.git # check repo URL on the webAI blog
cd yolo26-mlx
pip install -r requirements.txt
# Convert PyTorch weights to MLX
python tools/convert.py --weights yolo26n.pt --out yolo26n.mlx
# Run inference
python infer.py --weights yolo26n.mlx --source image.jpg
This path is best when you're (a) on Apple Silicon, (b) doing real-time work where 1.5× matters, and (c) comfortable installing from a community repo. For everyone else, stay on Ultralytics' PyTorch path.
Performance — concrete 2026 numbers
All numbers below are from Ultralytics' published COCO benchmarks (CPU = ONNX Runtime, GPU = TensorRT 10 on a T4) and webAI's M4 Pro MLX measurements. Apple Silicon MPS numbers vary heavily by chip — treat them as ballpark, not gospel.
| Model | mAP val 50-95 | CPU ONNX (ms) | T4 TensorRT (ms) | Params (M) |
|---|---|---|---|---|
| YOLO26-N | 40.9 | 38.9 | 1.7 | ~2.4 |
| YOLO26-S | 48.6 | 87.2 | — | — |
| YOLO26-M | 53.1 | 220.0 | 4.7 | — |
| YOLO26-L | 55.0 | 286.2 | 6.2 | — |
| YOLO26-X | 57.5 | 525.8 | 11.8 | ~55.7 |
| YOLO12-N | 40.6 | — | 1.64 | — |
| YOLO12-S | 48.0 | — | 2.61 | — |
| YOLO12-M | 52.5 | — | 4.86 | — |
| YOLO11-N | 39.5 | 56.1 | 1.5 | 2.6 |
Headline takeaways for macOS users:
- YOLO26-N is +1.4 mAP and ~31% faster on CPU than YOLO11-N. Across the family Ultralytics quotes up to 43% faster CPU inference.
- YOLO26 also matches or beats YOLOv12 on mAP at every size listed above (N, S, and M), and it is lighter to deploy because there is no NMS post-processing and no DFL. The table has no CPU ONNX figures for YOLOv12, so the CPU-throughput comparison rests on Ultralytics' statement that YOLOv12's attention blocks slow CPU inference.
- On M4 Pro with MLX, YOLO26-MLX ranges from 124.9 FPS for nano down to 10.7 FPS for x-large.
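The percentages above are easy to misread, because "31% faster" describes a drop in latency rather than a rise in throughput. The short calculation below reproduces the headline figures from the table and converts the MLX frame rates to per-frame latency; the function names are illustrative only.

```python
def latency_reduction_pct(baseline_ms: float, new_ms: float) -> float:
    """Percentage drop in per-image latency relative to the baseline."""
    return (baseline_ms - new_ms) / baseline_ms * 100


def throughput_gain(baseline_ms: float, new_ms: float) -> float:
    """How many times more images per second the new model processes."""
    return baseline_ms / new_ms


def fps_to_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000 / fps


# CPU ONNX latencies from the table: YOLO11-N 56.1 ms, YOLO26-N 38.9 ms.
print(f"{latency_reduction_pct(56.1, 38.9):.1f}% lower latency")  # 30.7% lower latency
print(f"{throughput_gain(56.1, 38.9):.2f}x throughput")           # 1.44x throughput

# M4 Pro MLX figures quoted above: 124.9 FPS (nano), 10.7 FPS (x-large).
print(f"{fps_to_ms(124.9):.1f} ms per frame")                     # 8.0 ms per frame
print(f"{fps_to_ms(10.7):.1f} ms per frame")                      # 93.5 ms per frame
```

So the nano model's ~31% latency reduction corresponds to roughly 1.44 times the images per second on the same CPU.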
How to choose: a 30-second decision tree
- Are you shipping into a product? → YOLO26 via Ultralytics. Done.
- Are you reproducing a research paper that names YOLOv12 specifically? → YOLOv12 via the sunsmarterjie repo (with flash-attn stripped on macOS).
- Is the workload latency-critical on Apple Silicon and you have time to integrate a non-PyTorch stack? → YOLO26-MLX.
- Do you need maximum stability and the broadest community knowledge base? → YOLO11 still works fine and is the safest pick.
- Are you exporting to CoreML for iOS / iPadOS? → YOLO26. NMS-free design + DFL removal = clean CoreML export. YOLOv12's NMS step adds friction.
Common pitfalls and troubleshooting
- flash-attn wheel build fails on macOS. FlashAttention only supports NVIDIA GPUs. Remove it from requirements.txt when installing the YOLOv12 reference repo, or use Roboflow's SDPA fork.
- RuntimeError: MPS backend is not available on macOS 26 (Tahoe). Tracked in pytorch/pytorch#167679. Either downgrade to macOS 15, fall back to device="cpu", or move to YOLO26-MLX, which is unaffected.
- The old github.com/ultralytics/yolov12.git URL 404s. That repo never existed; the original 2025 guide had the URL wrong. Use github.com/sunsmarterjie/yolov12 for the research code, or just pip install ultralytics for the packaged version.
- OpenCV crashes on M-series Macs with arm64 vs x86_64 mismatch. Make sure your Python is native arm64: python3 -c "import platform; print(platform.machine())" should print arm64. If it prints x86_64, you're running under Rosetta; reinstall Python via the official installer or arm64 Homebrew.
- Webcam source=0 hangs. macOS requires explicit camera permission. Run from Terminal and grant access in System Settings → Privacy & Security → Camera. If you're inside a venv, the prompt may go to the parent app (Terminal/iTerm).
- Slow CPU inference on Intel Macs. Expected. Use a smaller variant (n or s), reduce imgsz to 320 or 416, and export to ONNX with yolo export format=onnx; ONNX Runtime is meaningfully faster than the PyTorch CPU path.
- Out-of-memory training YOLOv12 on M1/M2 (8 GB unified memory). YOLOv12's attention blocks are memory-hungry. Drop batch size to 4 or 2, keep imgsz at 640 or below, and consider YOLO26 instead; it's lighter to train.
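Several of the items above come down to the same few facts about the interpreter: its version, whether it is a native arm64 build, and whether MPS is reachable. The sketch below gathers them in one report; preflight is a hypothetical helper name, and the MPS field is left as None when PyTorch is not installed.

```python
import platform
import sys


def preflight() -> dict:
    """Collect the interpreter facts the troubleshooting items rely on."""
    # "arm64" for a native Apple Silicon build; "x86_64" on Intel Macs or
    # when Python runs under Rosetta.
    machine = platform.machine()
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "system": platform.system(),  # "Darwin" on macOS
        "machine": machine,
        "native_arm64": machine == "arm64",
        "mps_available": None,  # stays None when torch is not installed
    }
    try:
        import torch

        report["mps_available"] = bool(torch.backends.mps.is_available())
    except (ImportError, AttributeError):
        pass
    return report


for key, value in preflight().items():
    print(f"{key}: {value}")
```

Running this inside the active environment before filing a bug report narrows most macOS install problems to one of the cases listed above.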
FAQ
Is YOLOv12 still worth using in 2026?
For research and benchmarking, yes — the attention-centric architecture is genuinely interesting and the NeurIPS 2025 weights are the definitive reference. For production, no — YOLO26 wins on accuracy, CPU speed, export simplicity, and quantization stability, and YOLO11 remains the safer mainstream choice. Ultralytics' own docs say so.
Does YOLO26 run on Apple Silicon GPUs?
Yes, through PyTorch's MPS backend (device="mps") on macOS 13+. For maximum throughput, the community YOLO26-MLX port runs natively on the Metal/MLX stack and is 1.1×–2.6× faster than PyTorch MPS on M4 Pro.
Why was NMS removed from YOLO26?
Non-Maximum Suppression is a post-processing step that's hard to export consistently across runtimes (CoreML, TFLite, ONNX, TensorRT) and adds latency. YOLO26 produces deduplicated predictions directly from the network — pioneered by YOLOv10 and refined in YOLO26 — which simplifies edge deployment and makes mobile export materially cleaner.
What's the lightest YOLO26 variant for an iPhone app?
YOLO26-N (~2.4 M parameters, 40.9 mAP). Export to CoreML with yolo export model=yolo26n.pt format=coreml and drop the .mlpackage into Xcode. The DFL removal in YOLO26 means the exported model has fewer custom ops than YOLOv12.
Can I use YOLOv12 segmentation/pose/classification weights?
Detection weights (yolo12n/s/m/l/x) are public; the segmentation, pose, classification, and OBB heads exist as YAML configs only — no pretrained weights were published with the NeurIPS 2025 release. If you need those tasks pre-trained, switch to YOLO26 or YOLO11, which ship all five task heads.
How do I train YOLO26 on a custom dataset on my Mac?
Same Ultralytics CLI as YOLO11. Example: yolo train model=yolo26n.pt data=path/to/data.yaml epochs=100 imgsz=640 device=mps batch=8. On 8 GB Apple Silicon, drop batch to 4. For real datasets, training on a Mac is fine for prototyping but a single A100 hour is usually cheaper than days on an M-series.
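The same run can be started from Python. The sketch below mirrors the CLI example and the batch-size advice in this answer; train_settings and train are hypothetical helper names, the 8 GB cut-off is a starting point taken from the text rather than a measured limit, and path/to/data.yaml is the same placeholder as in the CLI command.

```python
def train_settings(unified_memory_gb: int, device: str = "mps") -> dict:
    """Build keyword arguments for model.train() from the guidance above."""
    return {
        "data": "path/to/data.yaml",  # placeholder, replace with a real dataset file
        "epochs": 100,
        "imgsz": 640,
        "device": device,
        # Batch 8 by default, 4 on machines with 8 GB of unified memory or less.
        "batch": 4 if unified_memory_gb <= 8 else 8,
    }


def train(unified_memory_gb: int, device: str = "mps"):
    """Start training; needs the ultralytics package and a real data.yaml."""
    from ultralytics import YOLO  # lazy import so train_settings works without it

    model = YOLO("yolo26n.pt")
    return model.train(**train_settings(unified_memory_gb, device))
```

For example, train(8) starts a run with batch=4, and train(16, device="cpu") keeps batch=8 but runs on the CPU.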
What license are these models under?
Ultralytics models (YOLO11, YOLOv8, YOLO12, YOLO26) ship under AGPL-3.0, which has copyleft implications for SaaS use. The sunsmarterjie YOLOv12 reference repo is also AGPL-3.0. Ultralytics offers a separate Enterprise license for closed-source commercial use.
The 2025 version of this guide referenced git clone https://github.com/ultralytics/yolov12.git — was that ever real?
No. That URL was a content error in the original 2025 post. The real research repo is github.com/sunsmarterjie/yolov12, and the official packaged distribution is the ultralytics pip package. Both have been corrected in this refresh.
References & further reading
- Ultralytics YOLO26 — official docs
- Ultralytics YOLO12 — official docs
- sunsmarterjie/yolov12 — NeurIPS 2025 reference implementation
- YOLO26 vs YOLO11 — Ultralytics benchmark comparison
- Roboflow: YOLO26 release analysis
- webAI: Running YOLO26 natively on Apple Silicon with MLX
- arXiv 2509.25164 — YOLO26 architectural enhancements paper
- Apple Developer: Accelerated PyTorch training on Mac (MPS)
- Ultralytics Quickstart — install & first run