Running DeepSeek Janus Pro 1B on Windows with ComfyUI (2026 Guide)

Published 30 Jan 2025 • Updated 30 Apr 2026

Last updated April 2026 — refreshed for current model/tool versions.

DeepSeek Janus Pro 1B is a lightweight, open-source multimodal model that does both image understanding and image generation from a single transformer. This guide walks through every step to run it locally on Windows via ComfyUI — covering two install paths, up-to-date CUDA/PyTorch versions, verified model placement, workflow setup, and practical troubleshooting drawn from the active user community.

What changed since this post was first published (January 2025):

CUDA version bump: The old guide recommended cu118 (CUDA 11.8). As of 2025, PyTorch 2.7 targets CUDA 12.6 (cu126) and 12.8 (cu128). Use cu126 or cu128 for all new installs on RTX 30/40/50 series.
Python version: ComfyUI officially recommends Python 3.13 (3.12 is a solid fallback for custom-node compatibility). Python 3.10 still works but receives no upstream testing priority.
ComfyUI Desktop installer: Comfy-Org now ships a one-click Windows installer at comfy.org/download that bundles Python and CUDA automatically, so most users need no manual venv.
ComfyUI version: The project has moved to the Comfy-Org/ComfyUI GitHub org. Latest stable release as of April 2026: v0.20.1 (April 27, 2025; note that GitHub dates reflect the tag, and the project continues active development).
Model note: The Hugging Face model card for Janus-Pro-1B correctly identifies the model as using a 1.5B-parameter base (DeepSeek-LLM-1.5b-base). The "1B" name is marketing shorthand. File sizes and VRAM figures below reflect the actual 1.5B size.
comfy-cli available: A new command-line tool, comfy-cli, lets you install and launch ComfyUI from a single terminal session, which is useful for headless or scripted setups.

TL;DR: At a glance

Item	Value (April 2026)
Model	DeepSeek Janus-Pro-1B (1.5B params, MIT license)
Architecture	Unified autoregressive transformer with decoupled visual encoding
GenEval score	0.73 (vs. DALL-E 3: 0.67, Janus-Pro-7B: 0.80)
DPG-Bench score	82.63 (vs. DALL-E 3: 83.50, Janus-Pro-7B: 84.19)
Minimum GPU VRAM	~4 GB (FP16 weights ~3 GB; allow headroom for activations)
Recommended GPU	NVIDIA RTX 3060 8 GB or RTX 4060 8 GB or better
Python	3.13 recommended (3.12 fallback)
PyTorch CUDA build	`cu126` or `cu128`
ComfyUI version	v0.20.1 (April 2025); Desktop installer available
Custom node	ComfyUI-Janus-Pro by CY-CHENYUE
Model source	huggingface.co/deepseek-ai/Janus-Pro-1B

What is DeepSeek Janus Pro 1B?

Released January 27, 2025, Janus Pro is DeepSeek's second-generation unified multimodal model series. Unlike most open-source image generators that are generation-only (Stable Diffusion, FLUX) or understanding-only (LLaVA), Janus Pro handles both tasks through one set of weights using decoupled visual encoding: a SigLIP-L encoder for understanding tasks and a LlamaGen VQ tokenizer for generation tasks, routed through a shared transformer backbone.

The 1B variant (actual parameter count: 1.5B; base: DeepSeek-LLM-1.5b-base) is the entry-level model in the Janus Pro family. It is practical for anyone with a mid-range GPU. The 7B model produces noticeably better images and scores but needs substantially more VRAM (12–16 GB recommended). If you are primarily exploring the architecture, start with 1B; for production-quality image generation, consider upgrading to 7B or running it on a cloud instance. For a broader local AI stack, see the OpenClaw + Ollama setup guide for running local AI agents — it pairs well with this workflow.

Performance and Benchmarks

The following numbers are from the official Janus-Pro paper (arXiv 2501.17811) published January 2025. No newer evaluation supersedes them as of April 2026.

Benchmark	Janus-Pro-1B	Janus-Pro-7B	DALL-E 3	SD XL
GenEval (↑)	0.73	0.80	0.67	0.55
DPG-Bench (↑)	82.63	84.19	83.50	74.65
MMBench (understanding, ↑)	75.5	79.2	—	—
MMMU (understanding, ↑)	36.3	41.0	—	—

Key takeaway for the 1B model: it exceeds DALL-E 3 on GenEval (0.73 vs 0.67) at a tiny fraction of the cost — no API bills, no internet connection required. On understanding tasks it lags behind the 7B and behind frontier VLMs like LLaVA-1.6 34B, but it is serviceable for captioning and description at this weight class.

System Requirements

Component	Minimum	Recommended
OS	Windows 10 64-bit	Windows 11 64-bit
CPU	Intel Core i5 8th Gen / Ryzen 5 2600	Intel Core i7 12th Gen / Ryzen 7 5800X
RAM	16 GB	32 GB
GPU	NVIDIA GTX 1080 (8 GB VRAM)	NVIDIA RTX 3060 8 GB or RTX 4060 8 GB
VRAM	4 GB (tight; use `--lowvram`)	8 GB (comfortable)
Storage	15 GB SSD	30 GB NVMe SSD
CUDA Toolkit	12.4	12.6 or 12.8
Python	3.10	3.13 (3.12 fallback)

CPU-only: Technically possible using ComfyUI's --cpu flag, but generation takes 5–20 minutes per image. Not recommended for regular use.

AMD GPU: ROCm on Linux works for some users; Windows ROCm support for PyTorch is experimental as of April 2026. This guide focuses on NVIDIA.

Two Installation Paths

Path A: ComfyUI Desktop Installer (Recommended for most users)

The official ComfyUI Desktop app, maintained by Comfy-Org, bundles Python, CUDA dependencies, and ComfyUI-Manager automatically. It is the fastest way to reach a working install.

Go to comfy.org/download and download the Windows installer (.exe).
Double-click the installer. Select NVIDIA GPU when prompted.
ComfyUI will install to a folder of your choice, download the required PyTorch CUDA build, and create a desktop shortcut.
Launch ComfyUI Desktop. The browser UI opens at http://localhost:8188 automatically.
Proceed to the Install the Janus Pro Custom Node section below.

When to prefer Path A: new to ComfyUI, not comfortable with virtual environments, or want automatic updates.

Path B: Manual Install (Full control)

1. Install Python

Download Python 3.13 (recommended) or Python 3.12 from python.org. During installation, check Add Python to PATH. Verify:

python --version
# Expected: Python 3.13.x or 3.12.x

2. Install CUDA Toolkit

Check your GPU driver version first with nvidia-smi in Command Prompt. The reported CUDA version in the output is the maximum your driver supports — you can install any toolkit version up to that number.

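Install the toolkit from NVIDIA's CUDA Toolkit downloads page: CUDA 12.6 for RTX 30/40 series (recommended) or CUDA 12.8 for RTX 50 series. Then confirm the compiler reports the expected release: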

nvcc --version
# Expected: release 12.6 or 12.8
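
The driver-versus-toolkit rule above can be checked mechanically. The following is a minimal sketch, assuming the nvidia-smi header contains a field of the form "CUDA Version: X.Y"; the function names and the sample header line are illustrative, not part of any NVIDIA tooling.

```python
import re


def driver_cuda_version(smi_output: str):
    """Return (major, minor) from the 'CUDA Version: X.Y' field, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_is_supported(smi_output: str, toolkit: str) -> bool:
    """True if the driver's maximum CUDA version is at least the toolkit version."""
    driver = driver_cuda_version(smi_output)
    if driver is None:
        return False
    major, minor = (int(part) for part in toolkit.split("."))
    return driver >= (major, minor)


# Hypothetical header line; paste the real output of `nvidia-smi` instead.
sample = "| NVIDIA-SMI 560.94    Driver Version: 560.94    CUDA Version: 12.6 |"
print(toolkit_is_supported(sample, "12.6"))  # True
print(toolkit_is_supported(sample, "12.8"))  # False
```

If the second check fails on your machine, update the GPU driver before installing the newer toolkit.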

3. Clone ComfyUI and Create a Virtual Environment

git clone https://github.com/Comfy-Org/ComfyUI.git
cd ComfyUI

python -m venv venv
venv\Scripts\activate

# Install PyTorch with CUDA 12.6 (use cu128 if you have CUDA 12.8)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126

# Install ComfyUI dependencies
pip install -r requirements.txt

Note: The old cu118 build still works for RTX 30 series cards but misses performance improvements in cuDNN 9.x and Flash Attention 2 optimisations available in cu126. Use cu118 only if you have Ampere-generation cards and encounter driver incompatibilities.

4. Optional: install via comfy-cli

If you prefer a single-command install for headless or scripted setups:

pip install comfy-cli
comfy --workspace=C:\ComfyUI install
comfy launch

This downloads ComfyUI and ComfyUI-Manager into the workspace directory and starts the server.

Install the Janus Pro Custom Node

The ComfyUI-Janus-Pro extension by CY-CHENYUE adds three dedicated nodes: JanusModelLoader, JanusImageUnderstanding, and JanusImageGeneration.

Method 1: ComfyUI Manager (easiest)

In the ComfyUI web UI, open Manager → Custom Node Manager.
Search for Janus-Pro.
Select the result by CY-CHENYUE and click Install.
Restart ComfyUI when prompted.

Method 2: Manual Git Clone

# From the ComfyUI root directory:
cd custom_nodes
git clone https://github.com/CY-CHENYUE/ComfyUI-Janus-Pro.git

# Install dependencies (Windows portable build, which ships python_embeded):
..\..\python_embeded\python.exe -m pip install -r ComfyUI-Janus-Pro\requirements.txt

# Or for manual venv:
pip install -r ComfyUI-Janus-Pro\requirements.txt

Restart ComfyUI after installation.

Download the Janus-Pro-1B Model

Download all files for the model from the official Hugging Face repository: huggingface.co/deepseek-ai/Janus-Pro-1B

You need every file from the repository, including all JSON config files and the model weights (pytorch_model.bin or sharded .safetensors files if present).

Recommended: use huggingface-hub for reliable downloads:

pip install huggingface-hub

# Run from the folder that contains ComfyUI (kept on one line so it also works in Command Prompt)
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/Janus-Pro-1B', local_dir='ComfyUI/models/Janus-Pro/Janus-Pro-1B')"

Or clone with git-lfs if you have it installed:

git lfs install
git clone https://huggingface.co/deepseek-ai/Janus-Pro-1B ComfyUI/models/Janus-Pro/Janus-Pro-1B

Place the files in the following structure:

ComfyUI/
└── models/
    └── Janus-Pro/
        └── Janus-Pro-1B/
            ├── config.json
            ├── generation_config.json
            ├── model.safetensors   (or pytorch_model.bin)
            ├── special_tokens_map.json
            ├── tokenizer.json
            ├── tokenizer_config.json
            └── preprocessor_config.json

Total model size: approximately 3–4 GB on disk.
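
Before launching ComfyUI, it can help to confirm the folder really matches the layout above. This is a small sketch rather than anything the custom node provides: the function name is made up, the required-file list is a subset of the listing above, and the weight-file patterns are an assumption that covers both single and sharded files.

```python
from pathlib import Path

# Subset of the file names from the folder listing above.
REQUIRED_FILES = [
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
    "preprocessor_config.json",
]
# Assumed patterns: single or sharded weights in either format.
WEIGHT_PATTERNS = ["*.safetensors", "pytorch_model*.bin"]


def check_model_dir(model_dir):
    """Return a list of problems; an empty list means the layout looks complete."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"folder not found: {root}"]
    problems = [f"missing {name}" for name in REQUIRED_FILES if not (root / name).is_file()]
    weights = [path for pattern in WEIGHT_PATTERNS for path in root.glob(pattern)]
    if not weights:
        problems.append("no weight file (*.safetensors or pytorch_model*.bin)")
    return problems


if __name__ == "__main__":
    issues = check_model_dir("ComfyUI/models/Janus-Pro/Janus-Pro-1B")
    for line in issues or ["layout looks complete"]:
        print(line)
```

Run it from the folder that contains ComfyUI, or pass an absolute path to check_model_dir.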

Launching ComfyUI and First Image Generation

Start the server

cd ComfyUI
venv\Scripts\activate     # skip if using Desktop installer
python main.py --listen

Open http://localhost:8188 in Chrome 143 or later (recommended by Comfy-Org for best compatibility).
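
To confirm the server is up without opening a browser, you can query it from Python. The sketch below assumes ComfyUI exposes a /system_stats endpoint that returns JSON with a "devices" list; the field names it reads are assumptions, and the function name is illustrative.

```python
import json
import urllib.request


def fetch_system_stats(base_url="http://localhost:8188", timeout=5):
    """Return the parsed JSON from /system_stats, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/system_stats", timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts; ValueError covers invalid JSON.
        return None


stats = fetch_system_stats()
if stats is None:
    print("ComfyUI is not reachable on port 8188")
else:
    # Assumed response shape: a "devices" list with "name" and "vram_total" fields.
    for device in stats.get("devices", []):
        print(device.get("name"), device.get("vram_total"))
```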

Build the Generation Workflow

Right-click the canvas → Add Node → search for Janus Model Loader. Place it.
In the node, select Janus-Pro-1B from the model dropdown.
Add a Janus Image Generation node. Connect the model output from the loader to it.
Add a Text (Multiline) primitive node. Connect its output to the prompt input on the generation node.
Add a Save Image node. Connect Janus Image Generation output to it.
In the generation node, set:
- Prompt: A photorealistic mountain landscape at dawn, golden hour lighting
- Temperature: 1.0
- CFG: 5.0
- Steps: 30
- Width / Height: 384 × 384 (native resolution; upscale afterwards)
Click Queue Prompt.
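
Once the workflow runs in the browser, the same graph can be queued from a script. This is a hedged sketch: it assumes you exported the workflow in API format from the ComfyUI menu and that the server accepts a POST to /prompt with the workflow under a "prompt" key. The file name janus_generation_api.json and both function names are hypothetical.

```python
import json
import urllib.request
import uuid


def build_prompt_request(workflow, server="http://localhost:8188"):
    """Wrap an API-format workflow dict in a POST request for the /prompt endpoint."""
    payload = {"prompt": workflow, "client_id": uuid.uuid4().hex}
    return urllib.request.Request(
        f"{server}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(path, server="http://localhost:8188"):
    """Load an exported workflow file and submit it; returns the server's JSON reply."""
    with open(path, encoding="utf-8") as handle:
        workflow = json.load(handle)
    with urllib.request.urlopen(build_prompt_request(workflow, server)) as response:
        return json.load(response)


# Usage (hypothetical file name; the server must already be running):
# print(queue_workflow("janus_generation_api.json"))
```

Keeping the request builder separate from the network call makes it easy to inspect the payload before sending anything.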

Build the Understanding Workflow

Add Janus Model Loader → Janus Image Understanding → Show Text (from ComfyUI-Custom-Scripts).
Add a Load Image node. Connect its output to the image input of Janus Image Understanding.
Set the prompt field to something like Describe this image in detail.
Run the workflow. The model outputs a text caption in the Show Text node.

Tip: you can run both understanding and generation in a single workflow by branching the model loader output into both nodes — Janus Pro shares weights across both tasks.

Advanced Configuration and Performance Tuning

Launch flags

Scenario	Flag	Effect
Limited VRAM (<6 GB)	`--lowvram`	Offloads model layers to CPU RAM between steps
Extremely limited VRAM	`--novram`	Full CPU offload; slow but functional
CPU-only system	`--cpu`	All computation on CPU; expect 5–20 min/image
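Speed boost (Ampere+)	(none needed)	xformers attention is picked up automatically once the package is installed; `--disable-xformers` turns it off
FP8 quantisation	`--fp8_e4m3fn-unet`	Reduces VRAM ~40% for models loaded through ComfyUI's own loaders; minor quality trade-off

# Example: low VRAM (xformers needs no flag once installed)
python main.py --listen --lowvram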

Install xformers for faster inference

pip install xformers --index-url https://download.pytorch.org/whl/cu126

Native resolution and upscaling

Janus Pro 1B natively generates at 384×384 pixels, and images look soft at that size. The recommended pipeline is to generate at 384×384, then upscale inside ComfyUI with an ESRGAN model (RealESRGAN_x4plus works well) using the Upscale Image (using Model) node. A 4× model turns 384×384 into 1536×1536 without rerunning generation.

How to Choose: 1B vs. 7B

Scenario	Recommendation
GPU has 8 GB VRAM or less	Stick with Janus-Pro-1B
GPU has 12–16 GB VRAM	Janus-Pro-7B; noticeably better outputs
Primarily image understanding (captioning, VQA)	7B; the 1B MMMU score (36.3) is modest
Primarily image generation at low cost	1B; still beats DALL-E 3 on GenEval (0.73 vs. 0.67)
Experimenting with ComfyUI workflows locally	1B; faster iteration, lower VRAM pressure
Production pipeline, quality critical	Consider Janus-Pro-7B or a dedicated FLUX model

Common Pitfalls and Troubleshooting

CUDA out of memory

Symptom: torch.cuda.OutOfMemoryError during model load or during inference.

Fixes (in order):

Add --lowvram to your launch command.
Reduce generation resolution to 256×256 or 384×384.
Close other GPU-using apps (Chrome hardware acceleration, games, OBS).
Add --fp8_e4m3fn-unet for additional VRAM reduction (it only affects models loaded through ComfyUI's own loaders).

Model not found in the loader

Symptom: The JanusModelLoader shows no models or throws a path error.

Fix: Verify your folder structure. The node expects the model at exactly:

ComfyUI/models/Janus-Pro/Janus-Pro-1B/config.json

If you placed files elsewhere, move them or create a symlink. The folder name is case-sensitive on some systems.

torch.cuda.is_available() returns False

Symptom: ComfyUI falls back to CPU even though you have an NVIDIA GPU.

Fixes:

Confirm your driver is up to date: nvidia-smi must show a CUDA version of 12.4 or higher.
Confirm you installed the correct PyTorch build: pip show torch — the version string should contain +cu126 or +cu128. If it says +cpu, you installed the wrong wheel. Reinstall with the --index-url flag.
Verify the virtual environment is activated before launching.
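
The checks above can be run in one go from the same Python that launches ComfyUI. This is a minimal diagnostic sketch; describe_torch is an illustrative helper name, and it reports rather than fixes anything.

```python
import importlib.util


def describe_torch():
    """Summarise the installed PyTorch build without failing if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    version = torch.__version__
    return {
        "installed": True,
        "version": version,
        # The local tag after '+' names the wheel flavour, e.g. cu126 or cpu.
        "wheel_tag": version.split("+", 1)[1] if "+" in version else "unknown",
        "built_for_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


for key, value in describe_torch().items():
    print(f"{key}: {value}")
```

A wheel_tag of cpu, or installed: False, usually means the script ran under a different interpreter than the one where you installed the CUDA wheel.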

Slow dependency install or import errors after custom node install

Symptom: ComfyUI fails to start after installing ComfyUI-Janus-Pro, with an ImportError or ModuleNotFoundError.

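Fix: The custom node has its own requirements.txt, which must be installed into the same Python that runs ComfyUI. If you use the Windows portable build, that is the bundled python_embeded interpreter (if you use the Desktop app, install the node through ComfyUI Manager instead so the packages land in its managed environment):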

cd ComfyUI\custom_nodes\ComfyUI-Janus-Pro
..\..\..\python_embeded\python.exe -m pip install -r requirements.txt

Do not mix the system Python with the embedded Python.

Black images or blank output

Symptom: Generation completes but the saved image is solid black or uniform grey.

Fixes:

Lower the CFG scale to 3–5. Very high CFG (10+) can saturate outputs.
Check that the model files are complete — a partial download produces invalid weights. Re-download with snapshot_download.
Try a simple, short prompt first to isolate whether it is a prompt issue or a model issue.

git-lfs not installed: large files are 134 bytes

Symptom: After git clone, pytorch_model.bin is tiny (a git-lfs pointer file).

Fix: Install git-lfs (winget install GitHub.GitLFS) and run git lfs pull inside the repo directory. Or use the snapshot_download Python method above, which handles this automatically.
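
To find out which files are still pointer stubs, you can scan the model folder. The sketch below relies on the fact that git-lfs pointer files are small text files beginning with the line "version https://git-lfs.github.com/spec/v1"; the function name and the 1 KB size cutoff are illustrative choices.

```python
from pathlib import Path

LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(folder):
    """Return files that are git-lfs pointer stubs instead of real content."""
    pointers = []
    for path in sorted(Path(folder).rglob("*")):
        # Pointer files are tiny text stubs, so skip anything large before reading.
        if path.is_file() and path.stat().st_size < 1024:
            with path.open("rb") as handle:
                if handle.read(len(LFS_HEADER)) == LFS_HEADER:
                    pointers.append(path)
    return pointers


if __name__ == "__main__":
    for stub in find_lfs_pointers("ComfyUI/models/Janus-Pro/Janus-Pro-1B"):
        print(f"still a pointer, run git lfs pull: {stub}")
```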

Extending Your Workflow: What to Add Next

ESRGAN Upscaling: Chain an Upscale Image (using Model) node after the Janus generator. RealESRGAN_x4plus (available from the ComfyUI model manager) quadruples resolution with minimal quality loss.
ControlNet: Janus Pro does not natively support ControlNet, but you can generate a base image with Janus and then use it as an init image in a separate FLUX or SD XL ControlNet workflow for pose/edge refinement.
Batch generation: Set num_images in the generation node to 4–8 to generate several images in one run. If VRAM allows, batching is more efficient than sequential single-image runs.
AnimateDiff: Not directly compatible with Janus Pro. AnimateDiff requires SD 1.5 or SDXL checkpoints. Use Janus for stills, AnimateDiff for animation.
Integrate with an agent loop: If you want Janus Pro's image understanding capability to feed into a broader automation pipeline, pair it with a local LLM runner. See the OpenClaw + Ollama setup guide for running local AI agents for a production-ready local stack that complements ComfyUI workflows.

FAQ

Does Janus Pro 1B work without a GPU?

Yes, using the --cpu flag. Expect 5–20 minutes per image on a modern CPU. For regular use, a GPU is strongly recommended.

What is the difference between Janus-Pro-1B and Janus-Pro-7B?

The 7B model scores 0.80 on GenEval vs. 0.73 for 1B, and 84.19 vs. 82.63 on DPG-Bench. Image quality is visibly better, especially for complex multi-object compositions. The 7B also performs substantially better on understanding tasks (MMMU 41.0 vs. 36.3). The cost is roughly 3× the VRAM and 3–4× the generation time.

Is the "1B" model actually 1 billion parameters?

No. The Hugging Face model card confirms the base model is DeepSeek-LLM-1.5b-base, so the actual parameter count is approximately 1.5 billion; the "1B" label is marketing shorthand. This affects your VRAM planning: budget for ~3 GB of weights in FP16, not ~2 GB.
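
The arithmetic behind that budget is parameters times bytes per parameter. A quick sketch, counting weights only (activations and caches come on top) and using decimal gigabytes; the helper name is illustrative.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}


def weight_gigabytes(params_billion, dtype="fp16"):
    """Weights-only footprint in GB (10^9 bytes) for a model of the given size."""
    # (params_billion * 1e9 params) * (bytes per param) / 1e9 bytes per GB
    return params_billion * BYTES_PER_PARAM[dtype]


print(weight_gigabytes(1.5))  # 3.0, the actual 1.5B model in FP16
print(weight_gigabytes(1.0))  # 2.0, what the "1B" name would suggest
```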

Can I run this on an AMD GPU?

ROCm support for PyTorch on Linux works for many users. On Windows, AMD GPU PyTorch support is experimental as of April 2026. For a Windows AMD setup, check the ComfyUI GitHub issues for current ROCm/Windows status before investing time.

Does DeepSeek plan to release a Janus Pro 2 or Janus Pro successor?

DeepSeek has not announced a Janus Pro successor as of April 2026. The DeepSeek model roadmap has prioritised language models (V3, V4, Flash variants). The Janus GitHub repository (github.com/deepseek-ai/Janus) receives maintenance but no major new releases since January 2025. Monitor the repository and the DeepSeek Hugging Face organisation for updates.

How long does image generation take on an RTX 3060?

Based on community reports, the Janus-Pro-1B model at 384×384 with 30 steps takes approximately 15–40 seconds on an RTX 3060 8 GB. The 7B model at the same resolution takes 60–120 seconds. Exact timing depends on xformers, VRAM usage, and system RAM bandwidth.

What is the native output resolution?

Janus Pro's image generation tokenizer uses a 16× downsample ratio from the LlamaGen VQ tokenizer, which corresponds to native generation at 384×384 pixels. You can request other resolutions, but quality may degrade outside this native size. Use an ESRGAN upscaler post-generation to get 1024×1024 or larger outputs.
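
The link between the downsample ratio and the native size is simple arithmetic: each image token covers a 16×16 pixel patch. A minimal sketch, assuming square images; the function name is illustrative.

```python
def image_token_grid(image_size=384, downsample=16):
    """Return (tokens per side, total tokens) for a square image."""
    if image_size % downsample:
        raise ValueError("image size must be a multiple of the downsample ratio")
    side = image_size // downsample
    return side, side * side


print(image_token_grid())  # (24, 576)
```

So a 384×384 image corresponds to a 24×24 grid, or 576 image tokens produced autoregressively per image.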

Does this setup support image-to-image or inpainting?

Not natively through the ComfyUI-Janus-Pro nodes. The Janus Pro architecture supports image understanding (describe an image) and text-to-image generation. True image-to-image conditioning would require custom code. If you need image-to-image workflows, FLUX or SDXL remain better-supported in ComfyUI for that use case.