Last updated April 2026 — refreshed for current model/tool versions and Void's paused development status.
Void AI paired with Ollama gives you a fully local, private AI coding environment on Ubuntu — no subscriptions, no cloud, no data leaving your machine. This guide walks through a complete setup, covers the best models to run in 2026, and explains what Void's paused development means for your workflow.
If you want a broader look at the local-AI tooling ecosystem before diving into this setup, the OpenClaw + Ollama setup guide for running local AI agents is a strong companion read covering agent orchestration on the same Ollama backend.
What changed in 2026 — read this before anything else
- Void development is paused. The Void team announced in late 2025 that they have paused work on the Void IDE to explore "novel coding ideas." The README states they may not resume Void as an IDE. Existing builds still function, but no new releases have shipped since Beta Patch #7 (v1.4.1, June 5, 2025). Treat Void as stable-but-frozen, not actively maintained.
- Ollama is at v0.22.0 (April 28, 2026). The install script and systemd service setup are unchanged, but the model library now includes Devstral, Qwen3-Coder, Kimi-K2.5, Gemma 4, and GLM-5. Ollama now runs as a full systemd service on Linux — ollama daemon start is no longer the recommended startup method.
- Llama 3.1 is outdated for coding. The original post recommended Llama 3.1 8B. In 2026, Devstral Small (24B) and Qwen3 14B are the recommended coding models; Llama 3.1 has been superseded by Llama 4, but Devstral and Qwen3-Coder consistently outperform Llama variants for agentic coding tasks.
- Void auto-detects Ollama on port 11434 — no manual configuration is needed in most cases. The flagged fact in the previous post was accurate and remains so.
- Continue.dev is the most actively maintained alternative for VS Code users who want a Void-like experience from a project that is still shipping updates.

Want the full picture? Read our continuously-updated Cursor IDE complete guide — setup, model choices, agent workflows, privacy options, and how it compares to alternatives.
TL;DR
| Question | Answer |
|---|---|
| Does Void still work on Ubuntu? | Yes — v1.4.1 installs and runs. No new features since June 2025. |
| Is Void free? | Yes. MIT-licensed, no subscriptions, no usage limits. |
| Best coding model to pair with Void + Ollama? | Devstral Small 24B (needs 16GB RAM); Qwen3 14B for 12GB setups. |
| Does Void send code to the cloud? | No — all inference runs locally via Ollama. |
| Should I start a new project with Void in 2026? | With caution. Consider Continue.dev for a more actively-maintained option. |
What is Void?

Void is an open-source AI code editor forked from VS Code. It was designed as a direct, privacy-preserving alternative to Cursor: full VS Code extension compatibility combined with built-in AI chat, inline edits, and local model support via Ollama and other backends.
Key features
- Local AI agents: Use AI models on your codebase without sending data to the cloud.
- VS Code compatibility: Extensions, themes, and keybindings transfer without changes.
- Inline edits and chat: Ctrl+K for inline edits, Ctrl+L for chat with file context.
- Privacy-first: No data leaves your machine unless you configure a cloud provider.
- MCP support: Added in v1.4.1 (June 2025) — connect to external tools via the Model Context Protocol.
- Multiple backends: Ollama, llama.cpp, LM Studio, and cloud providers (OpenAI, Anthropic, Gemini) are all supported.
Important: Void development is paused
As of late 2025, the Void team paused active development. The official README states: "We've paused work on the Void IDE to explore a few novel coding ideas… we might not resume Void as an IDE." The team is responsive via email (hello@voideditor.com) and Discord but is not reviewing GitHub Issues or PRs. The last release was Beta Patch #7 (v1.4.1), published June 5, 2025.
This does not mean Void is broken or unusable — but it does mean you should weigh the lack of active maintenance when choosing it for production work. See the Alternatives section for actively-maintained options.
What is Ollama?
Ollama is an open-source runtime for running large language models on your local machine. As of April 2026, Ollama is at v0.22.0 and supports dozens of models including Devstral, Qwen3-Coder, Kimi-K2.5, Gemma 4, Llama 4, GLM-5, and DeepSeek R1. It installs as a systemd service on Linux, provides a REST API on localhost:11434, and handles GPU acceleration automatically for NVIDIA and AMD GPUs.
Key features
- Systemd-managed service: Starts automatically on boot; managed with systemctl.
- REST API: Compatible with OpenAI's API format — most tools that support OpenAI can point at Ollama instead (see the curl sketch after this list).
- GPU auto-detection: Detects NVIDIA CUDA and AMD ROCm without manual configuration.
- Model library: ollama pull <model> downloads and runs any model from the official library.
- Multi-platform: Linux, macOS, and Windows (stable as of 2025).
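Because the API is OpenAI-compatible, any HTTP client can talk to it directly. A minimal sketch, assuming you have already pulled the devstral model (any pulled model name works):

```bash
# Chat completion via Ollama's OpenAI-compatible endpoint on the default port
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "devstral",
        "messages": [{"role": "user", "content": "Explain a mutex in one sentence."}]
      }'
```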
Void + Ollama vs. Cursor: Feature comparison

| Feature | Cursor | Void + Ollama |
|---|---|---|
| Open source | No | Yes (MIT) |
| Local model hosting | No (cloud-based) | Yes (via Ollama) |
| Data privacy | Limited | Full — local only |
| VS Code extension support | Yes | Yes (full compatibility) |
| Model choice | Fixed (Claude, GPT, Gemini) | Any Ollama-supported model |
| Cost | $20–$200/month (Pro to Ultra) | Free — no subscriptions |
| Platform support | macOS, Windows, Linux | Linux, macOS, Windows |
| Active development | Yes | Paused (last release June 2025) |
| MCP support | Yes | Yes (added v1.4.1) |
Cursor's Pro plan costs $20/month ($16/month billed annually). Void + Ollama costs nothing after hardware. For teams with sensitive codebases or air-gapped environments, the privacy guarantee of a fully local setup is the deciding factor.
System requirements
Minimum
- Ubuntu 20.04 LTS or newer (22.04 and 24.04 recommended)
- 4 GB RAM (8 GB to run a useful coding model)
- 10 GB free disk space (more for large models)
- Sudo privileges
- Internet access for initial downloads
Recommended for practical coding assistance
- 16 GB RAM to run Devstral Small 24B or Qwen3 14B comfortably
- NVIDIA GPU with 8+ GB VRAM (RTX 3060 and above) or AMD RX 6700 XT and above for GPU-accelerated inference
- SSD for faster model loading (models range from 4 GB to 20+ GB)
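Not sure which tier your machine falls into? A quick way to check RAM, VRAM, and disk space before choosing a model (the nvidia-smi line only applies to NVIDIA GPUs):

```bash
free -h                                                   # total and available RAM
df -h ~                                                   # free disk space for model downloads
nvidia-smi --query-gpu=name,memory.total --format=csv     # GPU model and VRAM (NVIDIA only)
```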
Step-by-step installation
1. Update your Ubuntu system
```bash
sudo apt update && sudo apt upgrade -y
```
2. Install Ollama
The official one-liner installs Ollama, creates a dedicated ollama user, and registers a systemd service:
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
After installation, the Ollama service starts automatically. Verify it is running:
```bash
systemctl status ollama
```
You should see Active: active (running). The REST API is now available at http://localhost:11434. Open that URL in a browser; you should see "Ollama is running".
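If you prefer the terminal over a browser, the same check works with curl (both endpoints below are part of Ollama's standard API):

```bash
curl http://localhost:11434              # returns "Ollama is running"
curl http://localhost:11434/api/version  # returns the installed version as JSON
```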
If you have a firewall configured, allow the port:
```bash
sudo ufw allow 11434/tcp
```
Note: The old ollama daemon start command is no longer needed or recommended. Ollama now runs as a managed systemd service. Use sudo systemctl start ollama, sudo systemctl stop ollama, or sudo systemctl restart ollama instead.
3. Pull a coding model
The original post recommended Llama 3.1 8B. In 2026, purpose-built coding models outperform Llama for code tasks. Choose based on your hardware:
| Model | Size | RAM needed | SWE-bench Verified | Best for |
|---|---|---|---|---|
| Devstral Small 24B | ~14 GB (Q4) | 16 GB | 46.8% | Agentic coding, multi-file edits |
| Qwen3 14B | ~9 GB (Q4) | 12 GB | ~38% | Mid-range: coding + reasoning |
| Qwen3 8B | ~5 GB (Q4) | 8 GB | ~30% | Low-RAM machines |
| DeepSeek R1 14B | ~9 GB (Q4) | 12 GB | ~35% | Complex debugging, reasoning |
Pull the recommended model for most setups:
```bash
ollama pull devstral
```
For 12 GB RAM machines:
```bash
ollama pull qwen3:14b
```
For the minimum 8 GB setup:
```bash
ollama pull qwen3:8b
```
Confirm the model downloaded successfully:
```bash
ollama list
```
Run a quick test to confirm inference works:
```bash
ollama run devstral "Write a Python function that merges two sorted lists."
```
4. Install Void IDE
Download the latest .deb package from the Void GitHub Releases page. As of this writing, the latest version is v1.4.1 (released June 5, 2025). Check the releases page for any newer build before downloading.
```bash
cd ~/Downloads
# Replace the filename with the version shown on the releases page
sudo apt install ./void_1.4.1_amd64.deb
```
If apt install reports missing dependencies, run:
```bash
sudo apt --fix-broken install
```
5. Launch Void and connect to Ollama
Start Void from your applications menu or from a terminal:
```bash
void
```
On first launch, Void scans for a running Ollama instance on localhost:11434 and detects it automatically — no manual URL entry is needed if you are using the default port. Available models appear in Void's model selection panel.
To switch models or add a new one:
- Open Void's settings panel.
- Navigate to the AI / Model section.
- Select or type the model name as shown in ollama list (e.g., devstral or qwen3:14b).
Using Void's AI features
Void's core AI interactions all route through your local Ollama backend:
- Autocomplete: Press Tab to accept inline completions as you type. Works best with a fast model like Qwen3 8B if latency is a concern.
- Inline edits (Ctrl+K): Select a block of code and describe the change. Void sends the selection and your instruction to Ollama and applies the diff.
- Chat panel (Ctrl+L): Ask questions with full file context. You can attach specific files or folders for larger-scope queries.
- Codebase indexing: Void indexes your project for smarter suggestions that account for your own functions and variable names.
- MCP tools: Connect external tools via MCP (added v1.4.1). This lets you give Void access to shell commands, APIs, or databases as part of a multi-step agent task.
- AI commit messages: Void can draft commit messages from your staged diff.
Advanced configuration
Model management
```bash
# Update a model to the latest version
ollama pull devstral

# Remove a model you no longer use
ollama rm llama3.1:8b

# Show model metadata and parameter count
ollama show devstral

# List running model instances
ollama ps
```
Customizing the Ollama systemd service
If you want to set environment variables (e.g., VRAM limits, custom host binding), edit the service override:
```bash
sudo systemctl edit ollama
```
Add environment variables under [Service]:
```ini
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_NUM_PARALLEL=2"
```
Reload and restart:
```bash
sudo systemctl daemon-reload
sudo systemctl restart ollama
```
GPU acceleration
Ollama detects NVIDIA and AMD GPUs automatically at install time. Confirm GPU is being used:
```bash
ollama run devstral "hello"
# In another terminal:
nvidia-smi   # NVIDIA
rocm-smi     # AMD
```
If GPU is not detected, verify drivers are installed:
```bash
# NVIDIA
nvidia-smi
# AMD
rocminfo
```
Performance and benchmarks
The table below uses verified scores from published benchmarks (SWE-bench Verified, April 2026 leaderboard). Cloud models are included for reference.
| Model | SWE-bench Verified | Runs locally | VRAM needed |
|---|---|---|---|
| Claude 4.6 Sonnet (cloud) | ~72% | No | — |
| GPT-5.5 (cloud) | ~74% | No | — |
| Devstral Small 24B | 46.8% | Yes | ~14 GB |
| Qwen3 14B | ~38% | Yes | ~9 GB |
| DeepSeek R1 14B | ~35% | Yes | ~9 GB |
| Qwen3 8B | ~30% | Yes | ~5 GB |
Devstral Small 24B was built specifically for agentic software engineering tasks — it outperforms much larger models like DeepSeek-V3 and Qwen3 232B on SWE-bench when all are evaluated with the same OpenHands scaffold. For everyday tasks (autocomplete, refactors, docstrings, test generation), local models handle the work with no meaningful quality gap compared to a mid-tier cloud plan.
GPU throughput on an NVIDIA RTX 4090 (24 GB VRAM) running Devstral 24B at Q4 quantization: approximately 40–55 tokens/second — fast enough for interactive coding assistance.
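To measure throughput on your own hardware rather than relying on someone else's numbers, Ollama can print timing statistics after a response. A minimal sketch:

```bash
# The --verbose flag prints load time, prompt eval rate, and eval rate
# (tokens per second) after the model's answer
ollama run --verbose devstral "Write a one-line Python lambda that squares a number."
```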
Alternatives: if Void's paused development is a concern
Void remains functional, but its paused development is a legitimate reason to evaluate other options, especially for teams or long-running projects.
| Tool | Type | Ollama support | Active development | Notes |
|---|---|---|---|---|
| Continue.dev | VS Code / JetBrains extension | Yes | Yes | Best drop-in for VS Code users; supports chat, autocomplete, and agent mode |
| Cline | VS Code extension | Yes | Yes | Agent-first; strong for multi-file tasks; transparent configuration |
| Zed Editor | Standalone editor | Yes | Yes | Rust-based, very fast; built-in AI with local model support |
| Void (current) | Standalone editor (VS Code fork) | Yes | Paused | Still functional; best UX parity with Cursor for existing VS Code users |
For Continue.dev + Ollama setup on an active VS Code install (preserving all your existing extensions), the configuration is minimal: install the Continue extension, point it at http://localhost:11434, select your model. No editor migration needed.
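As a rough sketch of what that configuration looks like: older Continue releases read ~/.continue/config.json as below, while newer releases have moved to a YAML config with equivalent fields, so treat the exact file name and keys as version-dependent and check the Continue docs for your install. The model names assume the models pulled earlier in this guide; apiBase can be omitted when Ollama runs on the default port.

```json
{
  "models": [
    {
      "title": "Devstral (local)",
      "provider": "ollama",
      "model": "devstral",
      "apiBase": "http://localhost:11434"
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen3 8B (autocomplete)",
    "provider": "ollama",
    "model": "qwen3:8b"
  }
}
```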
How to choose
- I want the closest thing to Cursor, fully local, and I accept that development is paused: Use Void + Ollama as described in this guide.
- I need an actively maintained local-AI coding tool inside VS Code: Use Continue.dev + Ollama instead.
- I want a full editor replacement (not VS Code) that is still actively developed: Evaluate Zed.
- I need agent-mode coding (multi-file, multi-step) with transparent tool use: Use Cline + Ollama inside VS Code.
- I have 8 GB RAM only: Pull qwen3:8b — it is the most capable model that fits. Expect slower inference without a GPU.
- I want the best local coding model on 16+ GB RAM: Pull devstral — 46.8% on SWE-bench Verified, the top open-source score as of April 2026.
Common pitfalls and troubleshooting
Void cannot detect Ollama
- Confirm Ollama is running: systemctl status ollama
- Test the REST API directly: curl http://localhost:11434 — should return "Ollama is running"
- Check that the model you want to use is pulled: ollama list
- If port 11434 is blocked, open it: sudo ufw allow 11434/tcp
- Restart Void after confirming Ollama is running (the one-liner after this list combines the first three checks)
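To run those checks in one pass, a small convenience sketch (assumes the default port 11434):

```bash
# Prints the service state, the API status message, and the pulled models
systemctl is-active ollama \
  && curl -s http://localhost:11434 && echo \
  && ollama list
```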
Inference is very slow or times out
- Without a GPU, inference runs on CPU. A 14B model on a 4-core CPU will be 2–5 tokens/second — usable for chat, too slow for live autocomplete. Switch to a smaller model (qwen3:8b) or add a compatible GPU.
- Confirm GPU is in use: nvidia-smi should show memory allocation while a model is running.
- If Ollama was installed before NVIDIA drivers, reinstall Ollama after drivers are set up.
Model fails to load — out of memory
- Check available RAM: free -h
- Switch to a smaller quantization or a smaller model. Example: ollama pull qwen3:8b instead of devstral.
- Close other memory-intensive applications before running inference (see the memory check sketch below).
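A quick way to see whether a model actually fits: load it, then compare its footprint against what the system has left.

```bash
# ollama ps lists loaded models with their memory use and CPU/GPU placement
ollama ps
# free -h shows how much RAM remains for the rest of the system
free -h
```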
Void .deb installation fails
- Run sudo apt --fix-broken install to resolve missing dependencies.
- If you are on Ubuntu 24.04 and the .deb has dependency issues, try the AppImage build from the releases page.
I installed Void from a tutorial and have an old version
- Check your installed version: open Void → Help → About.
- The latest available release is v1.4.1 from June 5, 2025. Since development is paused, this is the current stable build — there is no newer version to update to.
FAQ
Is Void dead?
Not dead, but paused. The Void team has paused active development to explore new ideas and has not committed to resuming the IDE. Existing builds work normally. If the pause becomes permanent, the MIT license means the community can fork and maintain it. Monitor the GitHub repository and Discord for announcements.
Can I use Void with cloud models like Claude or GPT instead of Ollama?
Yes. Void supports Anthropic, OpenAI, Google Gemini, and other cloud providers in addition to Ollama. You can configure a cloud API key in Void's settings and switch between local and cloud models per session. Many users use a local model for routine tasks and a cloud model for harder problems.
What Ollama models work best for code completion in Void?
For autocomplete (low-latency, short completions), smaller models respond faster: Qwen3 8B or Qwen3 14B. For chat and multi-file editing where quality matters more than speed, Devstral Small 24B is the strongest local option as of April 2026 (46.8% SWE-bench Verified).
Does Void work on Ubuntu 24.04 LTS?
Yes. Install via the .deb package from the releases page. If dependencies are missing, run sudo apt --fix-broken install. The AppImage build is also available as a fallback.
Will Void's VS Code extensions still work?
Yes. Void is a VS Code fork; almost all extensions from the VS Code marketplace install and run without modification. The only exceptions are extensions that directly modify core VS Code behavior in ways that conflict with Void's AI layer — these are rare.
Is Ollama free for commercial use?
Ollama itself is MIT-licensed and free to use commercially. The models you run have their own licenses — Devstral uses the Mistral AI Research License (check mistral.ai/news/devstral for commercial use terms), and Qwen3 uses the Apache 2.0 license (commercial use allowed).
Can I run Void + Ollama on a machine without a GPU?
Yes, but with caveats. CPU-only inference works and is private, but it is too slow for real-time autocomplete with models above 8B parameters. If you are limited to CPU, use Qwen3 8B for a bearable experience.
What happened to Llama 3.1 — should I still use it?
Llama 3.1 still runs, but it has been superseded by Llama 4 and by purpose-built coding models. For coding tasks specifically, Devstral and Qwen3-Coder consistently outperform Llama variants on SWE-bench and LiveCodeBench. Pull devstral or qwen3:14b instead.
If your team is evaluating local-AI tooling for a larger engineering setup — particularly air-gapped or compliance-sensitive environments — Codersera's vetted remote developers are experienced with self-hosted AI stacks and can help evaluate and implement the right architecture. The OpenClaw + Ollama setup guide for running local AI agents covers extending this stack with agent orchestration.
References and further reading
- Void editor changelog — official release history (voideditor.com)
- Void GitHub repository — source code and README with pause announcement (github.com/voideditor/void)
- Ollama releases — v0.22.0 and full version history (github.com/ollama/ollama)
- Devstral announcement and SWE-bench benchmark — Mistral AI (mistral.ai)
- Devstral model card on Ollama library (ollama.com)
- Ollama official Linux install documentation (docs.ollama.com)
- GitHub Issue #926 — community discussion on Void's paused development (github.com)
- Cursor AI vs Void AI: In-Depth Comparison (codersera.com)