Running DeepSeek Janus Pro 7B on Windows with ComfyUI: 2026 Setup Guide

Published 30 Jan 2025 • Updated 31 May 2026 • 11 min read

Last updated April 2026 — refreshed for current model/tool versions.

DeepSeek Janus Pro 7B is a unified multimodal model that handles both image understanding and text-to-image generation in a single framework — an architectural approach that places it in direct competition with DALL-E 3 and Stable Diffusion 3 on standard benchmarks. This guide covers everything you need to install and run it locally on Windows using ComfyUI, updated for the current ComfyUI portable release (Python 3.13, CUDA 13.0) and the CY-CHENYUE Janus-Pro plugin.

What changed since the original January 2025 guide

Python requirement raised: The original guide cited Python 3.8–3.11. The current ComfyUI portable ships with Python 3.13 (best-supported as of 2025–2026); Python 3.12 is the recommended fallback. Python 3.8 and 3.9 are unsupported in the current ComfyUI portable environment.
CUDA updated: ComfyUI portable now bundles PyTorch with CUDA 13.0. The correct PyTorch install command is pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu130.
Plugin install path: ComfyUI Manager is now the recommended way to install the Janus-Pro plugin. It is faster and handles dependency resolution automatically.
Model folder path changed: Place model files under ComfyUI/models/Janus-Pro/Janus-Pro-7B/ (not a models/Janus-Pro-7B/ root-level folder as some older guides show).
VRAM reality check: 16 GB VRAM is the practical minimum; 24 GB (RTX 3090/4090) is recommended for the 7B variant. On L4-class cloud hardware the model uses ~22 GB VRAM and generates images in ~15 seconds each.

Janus Pro 7B remains the latest public release of the Janus series as of April 2026. No successor model has been publicly announced. The GitHub repo shows no formal releases beyond the 2025-01-27 Janus-Pro drop.

TL;DR — Key Facts

Item	Value
Model release date	January 27, 2025
Model license	DeepSeek Model License (weights); MIT (code)
Parameter count	7B (also: Janus-Pro-1B for low-VRAM setups)
Minimum VRAM	16 GB (e.g., RTX 4080)
Recommended VRAM	24 GB (RTX 3090, RTX 4090)
Python version	3.13 (best); 3.12 (fallback)
PyTorch / CUDA	PyTorch built for CUDA 13.0 (cu130 wheels)
GenEval score	80.0% (vs DALL-E 3: 67%, SD3: 74%)
DPG-Bench score	84.2% (vs DALL-E 3: 83.5%)
Disk space needed	~30 GB for 7B model + ComfyUI

What Is DeepSeek Janus Pro 7B?

Janus Pro is DeepSeek's unified multimodal large language model, built on a 7B-parameter base (DeepSeek-LLM-7b-base) with a decoupled visual encoding architecture. Unlike traditional models that share one visual encoder for both understanding and generation, Janus Pro routes these tasks through separate pathways:

Understanding path: SigLIP-L visual encoder (384×384 image input) — extracts high-dimensional semantic features for image captioning and VQA tasks.
Generation path: LlamaGen VQ tokenizer (16× downsample rate) — converts images to discrete token IDs for autoregressive text-to-image generation.

Both paths feed into a shared autoregressive transformer. The result is a single model that can describe an image in natural language, answer questions about it, and generate new images from text prompts — all without switching tools.
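The generation path's token budget follows from the tokenizer's downsample rate. A minimal arithmetic sketch, assuming a square 384×384 image (the size quoted above for the understanding path); the function name is illustrative and not part of any Janus API:

```python
def vq_token_count(image_size: int, downsample: int = 16) -> int:
    """Number of discrete tokens a VQ tokenizer emits for a square image."""
    if image_size % downsample != 0:
        raise ValueError("image_size must be divisible by the downsample rate")
    side = image_size // downsample  # tokens along one edge of the grid
    return side * side


print(vq_token_count(384))  # 24 x 24 grid -> 576
```

Because the transformer predicts these tokens one at a time, the token count per image is a rough proxy for generation cost.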

If you are thinking about broader local AI deployment (not just image generation), the OpenClaw + Ollama setup guide for running local AI agents covers how to orchestrate multiple local models including multimodal ones through a unified agent layer.

Benchmark Performance (2025 Published Data)

Janus Pro 7B was evaluated on two primary text-to-image generation benchmarks at release. These numbers are from the official arXiv paper (2501.17811) and have not been superseded by independent 2026 re-evaluations at the time of writing.

Benchmark	Janus-Pro-7B	DALL-E 3	SD3-Medium	Emu3-Gen
GenEval (overall)	80.0%	67.0%	74.0%	—
DPG-Bench	84.2%	83.5%	—	80.6%

GenEval tests how precisely a model follows compositional text prompts (counting objects, positioning, attributes). DPG-Bench evaluates dense, detailed prompt adherence. Janus Pro 7B leads both, though real-world image quality at 512×512 may feel softer than dedicated diffusion models like SDXL or FLUX for purely aesthetic generation. It shines on prompt accuracy and multimodal round-trip tasks (describe → modify → regenerate).

1B vs 7B: Which Variant to Use

Variant	VRAM	Speed (L4 GPU)	Image Quality	Best For
Janus-Pro-1B	~8 GB	~6–8 s/image	Good	Low-VRAM GPUs, rapid iteration
Janus-Pro-7B	~16–22 GB	~15 s/image	Better	High-VRAM GPUs, production quality

Speed figures are from cloud L4 hardware testing. Consumer RTX 4090 results are similar; RTX 3080 (10 GB) cannot run the 7B variant without quantization.
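The variant choice above can be expressed as a small helper. This is a sketch that only encodes this guide's thresholds (16 GB for 7B, 8 GB for 1B); the function name is hypothetical:

```python
from typing import Optional


def choose_variant(vram_gb: float) -> Optional[str]:
    """Map available VRAM to a Janus Pro variant using this guide's thresholds."""
    if vram_gb >= 16:
        return "Janus-Pro-7B"
    if vram_gb >= 8:
        return "Janus-Pro-1B"
    return None  # below 8 GB: neither variant fits per the table above


for card, vram in [("RTX 4090", 24), ("RTX 3080", 10), ("6 GB card", 6)]:
    print(card, "->", choose_variant(vram))
```

If PyTorch is installed, torch.cuda.get_device_properties(0).total_memory gives the card's VRAM in bytes. Round the result before comparing, since cards usually report slightly less than their nominal capacity.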

System Requirements (April 2026)

OS: Windows 10 64-bit or Windows 11 (both fully supported)
CPU: 8+ cores; 16 cores recommended
RAM: 32 GB minimum; 64 GB recommended for large batch generation
GPU: NVIDIA RTX series with 16 GB+ VRAM for Janus-Pro-7B; 8 GB+ for Janus-Pro-1B. RTX 3090, RTX 4080, RTX 4090 are all well-tested. RTX 3080 (10 GB) requires quantization.
Disk: 100 GB free (ComfyUI + model weights + generated outputs)
Python: 3.13 (recommended); 3.12 (fallback if custom node issues arise). Do not use Python 3.8 or 3.9 — these are no longer compatible with current ComfyUI.
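CUDA Toolkit: No separate install is needed for the portable build, which ships PyTorch built against CUDA 13.0. Your NVIDIA driver must be recent enough to support that CUDA version.
NVIDIA driver: Update via the NVIDIA App (formerly GeForce Experience) or NVIDIA's driver download page before installation.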

Pre-Installation Checklist

Update NVIDIA GPU drivers to the latest stable version.
Confirm at least 100 GB free disk space on your install drive (SSD strongly recommended for model loading speed).
Temporarily disable antivirus real-time protection during installation to prevent false-positive blocks on downloaded model weights.
Ensure a stable internet connection — the Janus-Pro-7B weights total roughly 14 GB from Hugging Face.
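Two of the checklist items (free disk space and, for the manual install path, the Python version) can be checked with the standard library. A minimal sketch using the figures from this guide; the constant and function names are illustrative:

```python
import shutil
import sys

MIN_FREE_GB = 100               # free-space figure from the checklist above
SUPPORTED = {(3, 12), (3, 13)}  # Python versions this guide recommends


def python_supported(version=sys.version_info) -> bool:
    """True if the interpreter's major.minor is one the guide recommends."""
    return (version[0], version[1]) in SUPPORTED


def free_disk_gb(path: str = ".") -> float:
    """Free space, in GB (10^9 bytes), on the drive containing `path`."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    print("Python OK:", python_supported())
    print("Disk OK:  ", free_disk_gb() >= MIN_FREE_GB)
```

Run it from the drive you plan to install on. The Python check matters only for Installation Path B, because the portable build brings its own interpreter.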

Installation Path A: ComfyUI Portable (Recommended for Most Users)

The portable build is the fastest path. It ships with Python 3.13 and PyTorch CUDA 13.0 pre-installed — no separate Python or CUDA Toolkit installation required.

Step 1: Download the ComfyUI Portable Package

Go to the ComfyUI releases page on GitHub.
Download the latest Windows portable zip (look for ComfyUI_windows_portable_nvidia.7z or equivalent).
Extract with 7-Zip to a folder such as C:\ComfyUI.

Step 2: Launch ComfyUI

Double-click run_nvidia_gpu.bat inside the extracted folder. ComfyUI will start and open in your browser at http://localhost:8188. Confirm it loads correctly before installing the plugin.
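To confirm from a script that the server is up, you can poll it over HTTP. A small sketch, assuming the default address above and ComfyUI's /system_stats route (check your version if the call fails while the UI loads fine):

```python
import urllib.error
import urllib.request


def comfyui_is_up(base_url: str = "http://localhost:8188", timeout: float = 3.0) -> bool:
    """Return True if a ComfyUI server answers at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/system_stats", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


if __name__ == "__main__":
    print("ComfyUI reachable:", comfyui_is_up())
```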

Step 3: Install ComfyUI Manager

ComfyUI Manager is not bundled by default but is required for one-click plugin installation.

Stop ComfyUI (Ctrl+C in the terminal).
Open a terminal in the ComfyUI/custom_nodes folder.
Clone ComfyUI Manager:

git clone https://github.com/ltdrdata/ComfyUI-Manager.git

Restart ComfyUI (run_nvidia_gpu.bat). A "Manager" button will appear in the top-right menu.

Step 4: Install the ComfyUI-Janus-Pro Plugin via Manager

In the ComfyUI web interface, click Manager → Custom Nodes Manager.
Search for Janus-Pro.
Find the entry by author CY-CHENYUE and click Install.
When installation completes, click Restart, then manually refresh your browser (F5).

Step 5: Download Janus Pro Model Weights

Visit deepseek-ai/Janus-Pro-7B on Hugging Face (or the community mirror).
Download all files (config JSONs, tokenizer files, and .bin model weight files).
Create this folder structure inside your ComfyUI installation:

ComfyUI/
└── models/
    └── Janus-Pro/
        └── Janus-Pro-7B/
            ├── config.json
            ├── pytorch_model.bin   (or sharded .bin files)
            ├── tokenizer.json
            ├── tokenizer_config.json
            └── special_tokens_map.json

For Janus-Pro-1B, create ComfyUI/models/Janus-Pro/Janus-Pro-1B/ and place files there instead.
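A quick way to confirm the folder is complete before launching ComfyUI. This is a sketch that only checks the file names listed in the tree above; the actual repository may contain additional files, and the helper name is hypothetical:

```python
from pathlib import Path

# File names taken from the folder tree above; weights may be one or several .bin files.
REQUIRED = ["config.json", "tokenizer.json", "tokenizer_config.json", "special_tokens_map.json"]


def missing_model_files(model_dir: str) -> list[str]:
    """List required files absent from model_dir, plus a marker if no .bin weights exist."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED if not (root / name).is_file()]
    if not any(root.glob("*.bin")):
        missing.append("*.bin (model weights)")
    return missing


if __name__ == "__main__":
    print(missing_model_files("ComfyUI/models/Janus-Pro/Janus-Pro-7B") or "all files present")
```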

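Tip: Use huggingface-cli for faster parallel downloads. Run it from inside ComfyUI/models/Janus-Pro/ so the files land in the expected Janus-Pro-7B folder: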

pip install huggingface_hub
huggingface-cli download deepseek-ai/Janus-Pro-7B --local-dir ./Janus-Pro-7B
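The same download can be scripted so the files go straight into the folder the plugin expects. A sketch using snapshot_download from the huggingface_hub package; the two helper names are hypothetical, and the ComfyUI root path is whatever you chose at install time:

```python
from pathlib import Path


def janus_model_dir(comfy_root: str, variant: str = "Janus-Pro-7B") -> Path:
    """Folder this guide expects for a given variant."""
    return Path(comfy_root) / "models" / "Janus-Pro" / variant


def download_janus(comfy_root: str, variant: str = "Janus-Pro-7B") -> Path:
    """Download the variant's repository into the expected folder."""
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import snapshot_download

    target = janus_model_dir(comfy_root, variant)
    snapshot_download(repo_id=f"deepseek-ai/{variant}", local_dir=str(target))
    return target


# Usage (downloads roughly 14 GB for the 7B variant):
#   download_janus("ComfyUI")
```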

Installation Path B: Manual Python Environment

Use this path if you need custom Python/CUDA versions or already have an existing Python environment.

Step B1: Install Python and Git

Download Python 3.13 (or 3.12) from python.org. During installation, check "Add Python to PATH."
Download and install Git for Windows using default settings.

Step B2: Create a Virtual Environment

python -m venv comfyui-env
comfyui-env\Scripts\activate

Step B3: Install PyTorch with CUDA 13.0

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu130

Verify the install: python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)" should print True 13.0.
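For a fuller picture than the one-liner, the following sketch also reports the GPU name and total VRAM. It uses only standard torch.cuda calls and returns early if PyTorch is not installed; the function name is illustrative:

```python
def cuda_report() -> dict:
    """Summarize what PyTorch can see; safe to run even if torch is missing."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        report["gpu_name"] = props.name
        report["vram_gb"] = round(props.total_memory / 1024**3, 1)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A cuda_build of None means a CPU-only wheel was installed, which is the usual cause of torch.cuda.is_available() returning False on a machine with a working driver.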

Step B4: Clone and Install ComfyUI

git clone https://github.com/comfy-org/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

Step B5: Install ComfyUI-Janus-Pro Plugin Manually

cd custom_nodes
git clone https://github.com/CY-CHENYUE/ComfyUI-Janus-Pro.git
cd ComfyUI-Janus-Pro
pip install -r requirements.txt

Step B6: Download Model Weights and Launch

Follow the same model download and folder structure from Step 5 above. Then launch ComfyUI from the repo root:

python main.py

ComfyUI will be accessible at http://localhost:8188.

First Run: Building a Janus Pro Workflow in ComfyUI

After installation, ComfyUI exposes Janus Pro as a set of custom nodes. A minimal image-generation workflow uses five connected nodes:

Janus Model Loader — select Janus-Pro-7B (or Janus-Pro-1B) from the dropdown.
Text Input — your generation prompt (e.g., "a photorealistic mountain lake at golden hour").
Janus Image Generator — set num_images to 1 for the first test, resolution to 512 for speed.
Image Preview — displays the output in the browser.
Save Image — writes to ComfyUI/output/.

For image understanding (captioning / VQA), replace the generator node with the Janus Image Analyzer node and connect an image loader.

Quick Test Prompt

prompt = "a futuristic cityscape at night, neon reflections on wet streets, photorealistic"
num_images = 1
resolution = 512  # increase to 768 once confirmed working

Click Queue Prompt. First run will be slower as the model loads into VRAM; subsequent generations are faster.
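Once the workflow runs from the browser, it can also be queued from a script through ComfyUI's HTTP API. A sketch, assuming you have exported the workflow in API format from the ComfyUI interface; the file name is hypothetical, and the node class names inside the export depend on your plugin version:

```python
import json
import urllib.request
import uuid


def build_prompt_request(workflow: dict, server: str = "http://localhost:8188") -> urllib.request.Request:
    """Wrap an API-format workflow in the POST request the /prompt endpoint expects."""
    payload = {"prompt": workflow, "client_id": str(uuid.uuid4())}
    return urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, server: str = "http://localhost:8188") -> dict:
    """Send the workflow to a running ComfyUI instance and return its JSON reply."""
    with urllib.request.urlopen(build_prompt_request(workflow, server), timeout=10) as resp:
        return json.load(resp)


# Usage (with ComfyUI running):
#   with open("janus_workflow_api.json", encoding="utf-8") as f:
#       print(queue_workflow(json.load(f)))
```

Splitting request construction from sending keeps the payload easy to inspect without a server running.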

Decision Tree: Should You Use Janus Pro 7B?

You want multimodal in one model (understand + generate): Yes, Janus Pro is designed for this.
You want the highest aesthetic quality for pure image generation: Consider FLUX.1 or SDXL first — dedicated diffusion models still lead on pure visual quality for most styles.
You have <16 GB VRAM: Use Janus-Pro-1B instead. The 7B will OOM on cards with 10 GB or less.
You want to run on CPU only: Technically possible but prohibitively slow (roughly 10 to 60 minutes per image, depending on hardware). Not practical.
You need commercial use: Review the DeepSeek Model License carefully before production deployment — it has usage restrictions that differ from pure MIT/Apache-2.0 models.

Pro Tips for Optimal Performance

FP16 precision: The model defaults to FP16 via ComfyUI; do not override to FP32 unless debugging. FP16 cuts VRAM usage roughly in half compared to FP32.
VRAM management: Close VRAM-hungry background applications (browsers with many tabs, games, Discord with hardware acceleration) before starting ComfyUI.
Multi-GPU: To pin ComfyUI to a specific GPU, add set CUDA_VISIBLE_DEVICES=0 (0 is the index of the first GPU) to the batch file on a line before the launch command.
Prompt tuning: Janus Pro responds well to structured, specific prompts. Include lighting conditions, camera angle, and style references rather than abstract adjectives.
Batch processing: Set num_images to 4–8 for batch generation; images are generated sequentially, but the model stays loaded in VRAM between them.
Keep up to date: Run git pull in both ComfyUI/ and ComfyUI/custom_nodes/ComfyUI-Janus-Pro/ regularly to pick up bug fixes.
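The FP16 tip above comes down to bytes per parameter. A back-of-the-envelope sketch that counts weights only; the function name is illustrative:

```python
def weights_gb(params_billion: float, bytes_per_param: int) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return params_billion * bytes_per_param  # (params * 1e9 * bytes) / 1e9


for name, width in [("FP32", 4), ("FP16", 2)]:
    print(f"7B weights at {name}: ~{weights_gb(7, width):.0f} GB")
```

Activations, the generation cache, and framework overhead come on top of the weights, which is consistent with the 16–22 GB range this guide reports for the 7B variant at FP16.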

Troubleshooting Common Issues

Issue: CUDA Out of Memory (OOM)

Symptoms: RuntimeError: CUDA out of memory during model load or generation.

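Fix: ComfyUI already loads the model in FP16, so start by closing other GPU applications to free VRAM. If you are scripting the model directly in Python, cast it to FP16 and release cached memory:

import torch

model = model.half()      # `model` is your already-loaded Janus Pro model; casts weights to FP16
torch.cuda.empty_cache()  # returns cached, unused blocks to the driver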

Alternatively, switch to Janus-Pro-1B, or reduce resolution to 384 or 512.

Issue: Missing Dependencies / Import Errors

Symptoms: Node fails to load; ComfyUI startup shows a red error in the terminal for the Janus-Pro node.

Fix: Force-reinstall the plugin requirements:

pip install --force-reinstall -r custom_nodes/ComfyUI-Janus-Pro/requirements.txt

For the portable installation, run this from the portable root (the folder containing run_nvidia_gpu.bat):

python_embeded\python.exe -m pip install --force-reinstall -r ComfyUI\custom_nodes\ComfyUI-Janus-Pro\requirements.txt

Issue: Slow Generation

Symptoms: Images take significantly longer than the expected ~15 seconds on NVIDIA hardware.

Fix: Confirm GPU acceleration is active: open Windows Settings → System → Display → Graphics and ensure your Python executable or ComfyUI is set to "High performance." Also check that PyTorch sees your GPU: python -c "import torch; print(torch.cuda.get_device_name(0))".

Issue: Hugging Face Download Fails

Symptoms: requests.exceptions.ConnectionError or incomplete downloads.

Fix: Re-run the download with the Hugging Face CLI, which resumes partially downloaded files. Log in first only if the download is rejected for lack of credentials (required for gated models):

huggingface-cli login
huggingface-cli download deepseek-ai/Janus-Pro-7B --local-dir ./Janus-Pro-7B

Issue: Model Not Listed in the Loader

Symptoms: Janus Model Loader node shows no models.

Fix: Verify the exact folder path. It must be ComfyUI/models/Janus-Pro/Janus-Pro-7B/ with all required files present. Restart ComfyUI after confirming the structure.

Issue: Python 3.8 or 3.9 No Longer Works

Symptoms: Dependency install fails with syntax errors or unsupported version warnings.

Fix: Upgrade to Python 3.13 or 3.12. The ComfyUI portable ships with Python 3.13 and is the simplest path. Python 3.8/3.9 are not supported in current ComfyUI releases.

2026 Context: Alternatives and Ecosystem

Janus Pro 7B remains the latest public Janus release as of April 2026 — no successor model has been announced on the official GitHub. However, the multimodal generation landscape has continued evolving:

FLUX.1 (Black Forest Labs): Better raw image quality than Janus Pro for purely aesthetic text-to-image generation, though it is image-generation only (no multimodal understanding).
Llama 4 Maverick (Meta, 2025): 17B active parameters with multimodal input support; outperforms GPT-4o and Gemini 2.0 Flash on several benchmarks according to Meta, but is a much larger mixture-of-experts model and requires far more VRAM.
Qwen-Image 2.0 (Alibaba): Strong alternative with competitive multimodal understanding, particularly for Chinese-language prompts.
HunyuanImage 3.0 (Tencent): Leads on certain evaluation metrics among Chinese multimodal models.
JanusFlow-1.3B: A lightweight Janus variant using rectified flow for generation; useful for hardware with under 8 GB VRAM.

For teams evaluating which local AI stack to standardize on, the companion Windows guide for running Janus Pro without ComfyUI covers the bare Python API approach, which is useful for batch scripting and headless server deployments.


FAQ

Can I run Janus Pro 7B on an RTX 3080 (10 GB VRAM)?

Not reliably at FP16 (half precision). The 7B model requires approximately 14–22 GB of VRAM depending on batch size and resolution. An RTX 3080 with 10 GB will OOM during model load. Use Janus-Pro-1B instead, which runs comfortably on 8 GB VRAM.

Does Janus Pro 7B work without a GPU (CPU-only)?

It can run on CPU, but generation times are prohibitively slow — expect 10 to 60 minutes per image depending on hardware. GPU is effectively required for practical use.

What is the difference between Janus, JanusFlow, and Janus Pro?

Janus (1.3B) was the original unified multimodal model. JanusFlow (1.3B) is a variant using rectified flow for generation — lighter but less flexible. Janus Pro (1B/7B) is the production version with improved training data, optimized training strategy, and significantly better benchmark scores. Janus Pro 7B is the current recommended variant for quality work.

Is Janus Pro 7B free to use commercially?

The code is MIT-licensed, but the model weights are under the DeepSeek Model License, which imposes usage restrictions. Review the license at the Hugging Face model page before commercial deployment.

Can Janus Pro 7B both understand and generate images?

Yes — this is its defining feature. You can feed an image and ask it to describe or analyze the content (multimodal understanding), and you can provide a text prompt and generate a new image (text-to-image generation). Both capabilities are available as separate ComfyUI nodes from the CY-CHENYUE plugin.

Which ComfyUI plugin should I use for Janus Pro?

The primary actively maintained plugin is CY-CHENYUE/ComfyUI-Janus-Pro. There are also alternatives: ComfyUI_Janus_Wrapper by chflame163 and ComfyUI-Janus_pro_vision by ShmuelRonen (vision-language focused). Install via ComfyUI Manager to avoid dependency conflicts.

My ComfyUI shows "CUDA not available" — what do I do?

First verify your NVIDIA driver is up to date. Then confirm PyTorch sees CUDA: python -c "import torch; print(torch.cuda.is_available())" (for the portable build, call python_embeded\python.exe instead of python). If this returns False, reinstall PyTorch with the correct CUDA build for your driver version. For the portable build, run update\update_comfyui_and_python_dependencies.bat, which reinstalls the bundled Python dependencies, including PyTorch.

How do I update Janus Pro or ComfyUI?

Run git pull in both the ComfyUI root directory and custom_nodes/ComfyUI-Janus-Pro/. For the portable version, use the update\update_comfyui.bat script in the portable folder instead. Re-run pip install -r requirements.txt after updates in case new dependencies were added.