Last updated April 2026 — refreshed for current model/tool versions.
DeepSeek Janus Pro 1B is a lightweight, open-source multimodal model that does both image understanding and image generation from a single transformer. This guide walks through every step to run it locally on Windows via ComfyUI — covering two install paths, up-to-date CUDA/PyTorch versions, verified model placement, workflow setup, and practical troubleshooting drawn from the active user community.
What changed since this post was first published (January 2025):
- CUDA version bump: The old guide recommended cu118 (CUDA 11.8). As of 2025, PyTorch 2.7 targets CUDA 12.6 (cu126) and 12.8 (cu128). Use cu126 or cu128 for all new installs on RTX 30/40/50 series.
- Python version: ComfyUI officially recommends Python 3.13 (3.12 is a solid fallback for custom-node compatibility). Python 3.10 still works but receives no upstream testing priority.
- ComfyUI Desktop installer: Comfy-Org now ships a one-click Windows installer at comfy.org/download that bundles Python and CUDA automatically, so most users need no manual venv.
- ComfyUI version: The project has moved to the Comfy-Org/ComfyUI GitHub org. Latest stable release as of April 2026: v0.20.1 (April 27, 2025; note that GitHub dates reflect the tag, and the project continues active development).
- Model note: The Hugging Face model card for Janus-Pro-1B correctly identifies the model as using a 1.5B-parameter base (DeepSeek-LLM-1.5b-base). The "1B" name is marketing shorthand. File sizes and VRAM figures below reflect the actual 1.5B size.
- comfy-cli available: A new command-line tool, comfy-cli, lets you install and launch ComfyUI from a single terminal session, which is useful for headless or scripted setups.
TL;DR: At a glance
| Item | Value (April 2026) |
|---|---|
| Model | DeepSeek Janus-Pro-1B (1.5B params, MIT license) |
| Architecture | Unified autoregressive transformer with decoupled visual encoding |
| GenEval score | 0.73 (vs. DALL-E 3: 0.67, Janus-Pro-7B: 0.80) |
| DPG-Bench score | 82.63 (vs. DALL-E 3: 83.50, Janus-Pro-7B: 84.19) |
| Minimum GPU VRAM | ~4 GB (FP16 weights ~3 GB; allow headroom for activations) |
| Recommended GPU | NVIDIA RTX 3060 8 GB or RTX 4060 8 GB or better |
| Python | 3.13 recommended (3.12 fallback) |
| PyTorch CUDA build | cu126 or cu128 |
| ComfyUI version | v0.20.1 (April 2025); Desktop installer available |
| Custom node | ComfyUI-Janus-Pro by CY-CHENYUE |
| Model source | huggingface.co/deepseek-ai/Janus-Pro-1B |
What is DeepSeek Janus Pro 1B?
Released January 27, 2025, Janus Pro is DeepSeek's second-generation unified multimodal model series. Unlike most open-source image generators that are generation-only (Stable Diffusion, FLUX) or understanding-only (LLaVA), Janus Pro handles both tasks through one set of weights using decoupled visual encoding: a SigLIP-L encoder for understanding tasks and a LlamaGen VQ tokenizer for generation tasks, routed through a shared transformer backbone.
The 1B variant (actual parameter count: 1.5B; base: DeepSeek-LLM-1.5b-base) is the entry-level model in the Janus Pro family. It is practical for anyone with a mid-range GPU. The 7B model produces noticeably better images and scores but needs substantially more VRAM (12–16 GB recommended). If you are primarily exploring the architecture, start with 1B; for production-quality image generation, consider upgrading to 7B or running it on a cloud instance. For a broader local AI stack, see the OpenClaw + Ollama setup guide for running local AI agents — it pairs well with this workflow.
Performance and Benchmarks
The following numbers are from the official Janus-Pro paper (arXiv 2501.17811) published January 2025. No newer evaluation supersedes them as of April 2026.
| Benchmark | Janus-Pro-1B | Janus-Pro-7B | DALL-E 3 | SD XL |
|---|---|---|---|---|
| GenEval (↑) | 0.73 | 0.80 | 0.67 | 0.55 |
| DPG-Bench (↑) | 82.63 | 84.19 | 83.50 | 74.65 |
| MMBench (understanding, ↑) | 75.5 | 79.2 | — | — |
| MMMU (understanding, ↑) | 36.3 | 41.0 | — | — |
Key takeaway for the 1B model: it exceeds DALL-E 3 on GenEval (0.73 vs 0.67) at a tiny fraction of the cost — no API bills, no internet connection required. On understanding tasks it lags behind the 7B and behind frontier VLMs like LLaVA-1.6 34B, but it is serviceable for captioning and description at this weight class.
System Requirements
| Component | Minimum | Recommended |
|---|---|---|
| OS | Windows 10 64-bit | Windows 11 64-bit |
| CPU | Intel Core i5 8th Gen / Ryzen 5 2600 | Intel Core i7 12th Gen / Ryzen 7 5800X |
| RAM | 16 GB | 32 GB |
| GPU | NVIDIA GTX 1080 (8 GB VRAM) | NVIDIA RTX 3060 8 GB or RTX 4060 8 GB |
| VRAM | 4 GB (tight; use --lowvram) | 8 GB (comfortable) |
| Storage | 15 GB SSD | 30 GB NVMe SSD |
| CUDA Toolkit | 12.4 | 12.6 or 12.8 |
| Python | 3.10 | 3.13 (3.12 fallback) |
CPU-only: Technically possible using ComfyUI's --cpu flag, but generation takes 5–20 minutes per image. Not recommended for regular use.
AMD GPU: ROCm on Linux works for some users; Windows ROCm support for PyTorch is experimental as of April 2026. This guide focuses on NVIDIA.
Two Installation Paths
Path A: ComfyUI Desktop Installer (Recommended for most users)
The official ComfyUI Desktop app, maintained by Comfy-Org, bundles Python, CUDA dependencies, and ComfyUI-Manager automatically. It is the fastest way to reach a working install.
- Go to comfy.org/download and download the Windows installer (.exe).
- Double-click the installer. Select NVIDIA GPU when prompted.
- ComfyUI will install to a folder of your choice, download the required PyTorch CUDA build, and create a desktop shortcut.
- Launch ComfyUI Desktop. The browser UI opens at http://localhost:8188 automatically.
- Proceed to the Install the Janus Pro Custom Node section below.
When to prefer Path A: new to ComfyUI, not comfortable with virtual environments, or want automatic updates.
Path B: Manual Install (Full control)
1. Install Python
Download Python 3.13 (recommended) or Python 3.12 from python.org. During installation, check Add Python to PATH. Verify:
python --version
# Expected: Python 3.13.x or 3.12.x
2. Install CUDA Toolkit
Check your GPU driver version first with nvidia-smi in Command Prompt. The reported CUDA version in the output is the maximum your driver supports — you can install any toolkit version up to that number.
For RTX 30/40 series (recommended): CUDA 12.6. For RTX 50 series: CUDA 12.8.
nvcc --version
# Expected: release 12.6 or 12.8
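The rule above (any build up to the CUDA version that nvidia-smi reports) can be sketched as a small helper. This is a minimal illustration, not part of ComfyUI or PyTorch: the names WHEEL_TAGS, driver_cuda_version, and pick_wheel_tag are hypothetical, and the sample header line is made up for the demo. It simply picks the newest of the two builds this guide covers; on RTX 30/40 cards you may still prefer cu126 as recommended above.

```python
import re

# Wheel tags named in this guide, newest first, with the CUDA version each needs.
WHEEL_TAGS = [((12, 8), "cu128"), ((12, 6), "cu126")]


def driver_cuda_version(nvidia_smi_output: str):
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi output, if present."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def pick_wheel_tag(nvidia_smi_output: str):
    """Return the newest tag whose CUDA version does not exceed the driver's maximum."""
    version = driver_cuda_version(nvidia_smi_output)
    if version is None:
        return None  # no CUDA field found, e.g. no NVIDIA driver installed
    for required, tag in WHEEL_TAGS:
        if version >= required:
            return tag
    return None  # driver is older than both builds covered here


# Hypothetical header line, for illustration only.
sample = "| NVIDIA-SMI 560.94    Driver Version: 560.94    CUDA Version: 12.6 |"
print(pick_wheel_tag(sample))  # cu126
```

To use it on a real machine, pass in the captured text of nvidia-smi (for example via subprocess.run with capture_output=True) instead of the sample string.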
3. Clone ComfyUI and Create a Virtual Environment
git clone https://github.com/Comfy-Org/ComfyUI.git
cd ComfyUI
python -m venv venv
venv\Scripts\activate
# Install PyTorch with CUDA 12.6 (use cu128 if you have CUDA 12.8)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
# Install ComfyUI dependencies
pip install -r requirements.txt
Note: The old cu118 build still works for RTX 30 series cards but misses performance improvements in cuDNN 9.x and Flash Attention 2 optimisations available in cu126. Use cu118 only if you have Ampere-generation cards and encounter driver incompatibilities.
4. Optional: install via comfy-cli
If you prefer a single-command install for headless or scripted setups:
pip install comfy-cli
comfy --workspace=C:\ComfyUI install
comfy launch
This downloads ComfyUI and ComfyUI-Manager into the workspace directory and starts the server.
Install the Janus Pro Custom Node
The ComfyUI-Janus-Pro extension by CY-CHENYUE adds three dedicated nodes: JanusModelLoader, JanusImageUnderstanding, and JanusImageGeneration.
Method 1: ComfyUI Manager (easiest)
- In the ComfyUI web UI, open Manager → Custom Node Manager.
- Search for Janus-Pro.
- Select the result by CY-CHENYUE and click Install.
- Restart ComfyUI when prompted.
Method 2: Manual Git Clone
# From the ComfyUI root directory:
cd custom_nodes
git clone https://github.com/CY-CHENYUE/ComfyUI-Janus-Pro.git
# Install dependencies (Desktop portable Python; python_embeded sits two levels above custom_nodes):
..\..\python_embeded\python.exe -m pip install -r ComfyUI-Janus-Pro\requirements.txt
# Or for manual venv:
pip install -r ComfyUI-Janus-Pro\requirements.txt
Restart ComfyUI after installation.
Download the Janus-Pro-1B Model
Download all files for the model from the official Hugging Face repository: huggingface.co/deepseek-ai/Janus-Pro-1B
You need every file from the repository, including all JSON config files and the model weights (pytorch_model.bin or sharded .safetensors files if present).
Recommended: use huggingface-hub for reliable downloads:
pip install huggingface-hub
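# Kept on one line so it also runs in Command Prompt, which cannot continue a quoted string across lines
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='deepseek-ai/Janus-Pro-1B', local_dir='ComfyUI/models/Janus-Pro/Janus-Pro-1B')"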
Or clone with git-lfs if you have it installed:
git lfs install
git clone https://huggingface.co/deepseek-ai/Janus-Pro-1B ComfyUI/models/Janus-Pro/Janus-Pro-1B
Place the files in the following structure:
ComfyUI/
└── models/
└── Janus-Pro/
└── Janus-Pro-1B/
├── config.json
├── generation_config.json
├── model.safetensors (or pytorch_model.bin)
├── special_tokens_map.json
├── tokenizer.json
├── tokenizer_config.json
└── preprocessor_config.json
Total model size: approximately 3–4 GB on disk.
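Before launching ComfyUI, it can help to confirm that the folder matches the layout above. The following is a minimal sketch using only the standard library; the function name missing_model_files is hypothetical, and the check assumes a single weights file (it does not recognise sharded .safetensors names).

```python
from pathlib import Path

# File names taken from the folder layout listed above.
REQUIRED_FILES = [
    "config.json",
    "generation_config.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "preprocessor_config.json",
]
WEIGHT_FILES = ["model.safetensors", "pytorch_model.bin"]  # either one is enough


def missing_model_files(model_dir):
    """Return the names of expected files that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing


if __name__ == "__main__":
    problems = missing_model_files("ComfyUI/models/Janus-Pro/Janus-Pro-1B")
    print("Layout looks complete" if not problems else f"Missing: {problems}")
```

An empty result only means the files exist; it says nothing about whether the download finished, so a truncated weights file would still pass.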
Launching ComfyUI and First Image Generation
Start the server
cd ComfyUI
venv\Scripts\activate # skip if using Desktop installer
python main.py --listen
Open http://localhost:8188 in Chrome 143 or later (recommended by Comfy-Org for best compatibility).
Build the Generation Workflow
- Right-click the canvas → Add Node → search for Janus Model Loader. Place it.
- In the node, select Janus-Pro-1B from the model dropdown.
- Add a Janus Image Generation node. Connect the model output from the loader to it.
- Add a Text (Multiline) primitive node. Connect its output to the prompt input on the generation node.
- Add a Save Image node. Connect Janus Image Generation output to it.
- In the generation node, set:
  - Prompt: A photorealistic mountain landscape at dawn, golden hour lighting
  - Temperature: 1.0
  - CFG: 5.0
  - Steps: 30
  - Width / Height: 384 × 384 (native resolution; upscale afterwards)
- Click Queue Prompt.
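The same graph can also be queued from a script through ComfyUI's HTTP endpoint at /prompt, which accepts the workflow in API format (node IDs mapped to class_type and inputs, with links written as [node_id, output_index]). The sketch below is illustrative only: the class names come from the node list above, but the input names model_name, model, prompt, temperature, and cfg are assumptions, not the custom node's confirmed schema. Export your own working graph in API format from the ComfyUI menu to see the real field names.

```python
import json
import urllib.request


def build_generation_workflow(prompt_text: str) -> dict:
    """Build an API-format graph: model loader -> image generation -> save image."""
    return {
        "1": {
            "class_type": "JanusModelLoader",
            # ASSUMPTION: "model_name" is an illustrative input name.
            "inputs": {"model_name": "Janus-Pro-1B"},
        },
        "2": {
            "class_type": "JanusImageGeneration",
            # ASSUMPTION: these input names mirror the settings listed above
            # and may differ from the real node definition.
            "inputs": {
                "model": ["1", 0],  # link to output 0 of node "1"
                "prompt": prompt_text,
                "temperature": 1.0,
                "cfg": 5.0,
            },
        },
        "3": {
            "class_type": "SaveImage",
            "inputs": {"images": ["2", 0], "filename_prefix": "janus"},
        },
    }


def queue_prompt(workflow: dict, server: str = "http://localhost:8188") -> dict:
    """POST the graph to ComfyUI's /prompt endpoint and return the JSON reply."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    request = urllib.request.Request(
        f"{server}/prompt", data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With the server running, queue_prompt(build_generation_workflow("A photorealistic mountain landscape at dawn")) would submit the job; if the input names do not match the installed node, ComfyUI should reject the request with a validation error rather than run it.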
Build the Understanding Workflow
- Add Janus Model Loader → Janus Image Understanding → Show Text (from ComfyUI-Custom-Scripts).
- Add a Load Image node. Connect its output to the image input of Janus Image Understanding.
- Set the prompt field to something like: Describe this image in detail.
- Run the workflow. The model outputs a text caption in the Show Text node.
Tip: you can run both understanding and generation in a single workflow by branching the model loader output into both nodes — Janus Pro shares weights across both tasks.
Advanced Configuration and Performance Tuning
Launch flags
| Scenario | Flag | Effect |
|---|---|---|
| Limited VRAM (<6 GB) | --lowvram | Offloads model layers to CPU RAM between steps |
| Extremely limited VRAM | --novram | Full CPU offload; slow but functional |
| CPU-only system | --cpu | All computation on CPU; expect 5–20 min/image |
| Speed boost (Ampere+) | none needed | ComfyUI uses xformers automatically once it is installed (--disable-xformers turns it off) |
| FP8 quantisation | --fp8_e4m3fn-unet | Reduces VRAM ~40% for weights ComfyUI loads itself; minor quality trade-off; may not affect weights loaded by a custom node |
# Example: low VRAM (xformers is picked up automatically when installed)
python main.py --listen --lowvram
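The VRAM rows of the table can be sketched as a small decision helper. This is an illustration under stated assumptions: launch_command is a hypothetical name, and novram_below_gb is an invented cutoff, since the table says only "extremely limited VRAM" without giving a number.

```python
def launch_command(vram_gb, has_gpu=True, novram_below_gb=2.0):
    """Return a main.py command line following the VRAM rows of the flag table.

    novram_below_gb is a hypothetical cutoff chosen for this sketch.
    """
    command = ["python", "main.py", "--listen"]
    if not has_gpu:
        command.append("--cpu")      # CPU-only system
    elif vram_gb < novram_below_gb:
        command.append("--novram")   # extremely limited VRAM
    elif vram_gb < 6:
        command.append("--lowvram")  # limited VRAM (<6 GB)
    return command


print(" ".join(launch_command(4)))  # python main.py --listen --lowvram
print(" ".join(launch_command(8)))  # python main.py --listen
```

Returning a list rather than a string keeps the result ready for subprocess.run without shell quoting concerns.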
Install xformers for faster inference
pip install xformers --index-url https://download.pytorch.org/whl/cu126
Native resolution and upscaling
Janus Pro 1B natively generates at 384×384 pixels. Images look soft at that size. The recommended pipeline: generate at 384×384, then upscale inside ComfyUI with an ESRGAN model (RealESRGAN_x4plus works well) using the Upscale Image (using Model) node. A 4× model produces 1536×1536 output without rerunning generation.
How to Choose: 1B vs. 7B
| Scenario | Recommendation |
|---|---|
| GPU has 8 GB VRAM or less | Stick with Janus-Pro-1B |
| GPU has 12–16 GB VRAM | Janus-Pro-7B; noticeably better outputs |
| Primarily image understanding (captioning, VQA) | 7B; the 1B MMMU score (36.3) is modest |
| Primarily image generation at low cost | 1B; still beats DALL-E 3 on GenEval (0.73 vs. 0.67) |
| Experimenting with ComfyUI workflows locally | 1B; faster iteration, lower VRAM pressure |
| Production pipeline, quality critical | Consider Janus-Pro-7B or a dedicated FLUX model |
Common Pitfalls and Troubleshooting
CUDA out of memory
Symptom: torch.cuda.OutOfMemoryError during model load or during inference.
Fixes (in order):
- Add --lowvram to your launch command.
- Reduce generation resolution to 256×256 or 384×384.
- Close other GPU-using apps (Chrome hardware acceleration, games, OBS).
- Add --fp8_e4m3fn-unet for additional VRAM reduction (this may not affect weights loaded by a custom node).
Model not found / dropdown empty
Symptom: The JanusModelLoader shows no models or throws a path error.
Fix: Verify your folder structure. The node expects the model at exactly:
ComfyUI/models/Janus-Pro/Janus-Pro-1B/config.json
If you placed files elsewhere, move them or create a symlink. The folder name is case-sensitive on some systems.
torch.cuda.is_available() returns False
Symptom: ComfyUI falls back to CPU even though you have an NVIDIA GPU.
Fixes:
- Confirm your driver is up to date: nvidia-smi must show a CUDA version of 12.4 or higher.
- Confirm you installed the correct PyTorch build with pip show torch: the version string should contain +cu126 or +cu128. If it says +cpu, you installed the wrong wheel. Reinstall with the --index-url flag.
- Verify the virtual environment is activated before launching.
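The version-string check can be automated with plain string handling. A minimal sketch, assuming the version has the form shown above (for example 2.7.0+cu126); torch_build_tag and diagnose are hypothetical helper names, and nothing here imports PyTorch.

```python
def torch_build_tag(version: str) -> str:
    """Return the part after '+' in a PyTorch version string ('' if there is none)."""
    _, _, tag = version.partition("+")
    return tag


def diagnose(version: str) -> str:
    """Classify a version string against the builds this guide recommends."""
    tag = torch_build_tag(version)
    if tag in ("cu126", "cu128"):
        return "ok"
    if tag == "cpu":
        return "cpu-only wheel: reinstall with --index-url"
    if tag == "":
        return "no build tag: cannot tell from the version string"
    return f"unexpected build: {tag}"


print(diagnose("2.7.0+cu126"))  # ok
print(diagnose("2.7.0+cpu"))    # cpu-only wheel: reinstall with --index-url
```

To get the real string, run python -c "import torch; print(torch.__version__)" inside the activated environment and pass the result to diagnose.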
Slow dependency install or import errors after custom node install
Symptom: ComfyUI fails to start after installing ComfyUI-Janus-Pro, with an ImportError or ModuleNotFoundError.
Fix: The custom node has its own requirements.txt. If you used the Desktop installer, you must use the bundled Python executable to install them:
cd ComfyUI\custom_nodes\ComfyUI-Janus-Pro
..\..\..\python_embeded\python.exe -m pip install -r requirements.txt
Do not mix the system Python with the embedded Python.
Black images or blank output
Symptom: Generation completes but the saved image is solid black or uniform grey.
Fixes:
- Lower the CFG scale to 3–5. Very high CFG (10+) can saturate outputs.
- Check that the model files are complete; a partial download produces invalid weights. Re-download with snapshot_download.
- Try a simple, short prompt first to isolate whether it is a prompt issue or a model issue.
git-lfs not installed: large files are 134 bytes
Symptom: After git clone, pytorch_model.bin is tiny (a git-lfs pointer file).
Fix: Install git-lfs (winget install GitHub.GitLFS) and run git lfs pull inside the repo directory. Or use the snapshot_download Python method above, which handles this automatically.
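A pointer file can be detected programmatically, because git-lfs pointers are small text files that begin with a fixed version line. The sketch below relies on that header; is_lfs_pointer is a hypothetical helper name, and the 1024-byte size guard is an arbitrary threshold chosen for this example.

```python
from pathlib import Path

# First line of every git-lfs pointer file.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path) -> bool:
    """True if the file is a small stub that starts with the git-lfs header."""
    file = Path(path)
    if file.stat().st_size > 1024:  # real weight files are far larger than a pointer
        return False
    with file.open("rb") as handle:
        return handle.read(len(LFS_HEADER)) == LFS_HEADER
```

Running it over the weights file in the model folder (for example pytorch_model.bin) tells you whether git lfs pull is still needed.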
Extending Your Workflow: What to Add Next
- ESRGAN Upscaling: Chain an Upscale Image (using Model) node after the Janus generator. RealESRGAN_x4plus (available from the ComfyUI model manager) quadruples resolution with minimal quality loss.
- ControlNet: Janus Pro does not natively support ControlNet, but you can generate a base image with Janus and then use it as an init image in a separate FLUX or SD XL ControlNet workflow for pose/edge refinement.
- Batch generation: Set num_images in the generation node to 4–8 to produce multiple seeds simultaneously. GPU VRAM permitting, batches are more efficient than sequential single-image runs.
- AnimateDiff: Not directly compatible with Janus Pro. AnimateDiff requires SD 1.5 or SDXL checkpoints. Use Janus for stills, AnimateDiff for animation.
- Integrate with an agent loop: If you want Janus Pro's image understanding capability to feed into a broader automation pipeline, pair it with a local LLM runner. See the OpenClaw + Ollama setup guide for running local AI agents for a production-ready local stack that complements ComfyUI workflows.
FAQ
Does Janus Pro 1B work without a GPU?
Yes, using the --cpu flag. Expect 5–20 minutes per image on a modern CPU. For regular use, a GPU is strongly recommended.
What is the difference between Janus-Pro-1B and Janus-Pro-7B?
The 7B model scores 0.80 on GenEval vs. 0.73 for 1B, and 84.19 vs. 82.63 on DPG-Bench. Image quality is visibly better, especially for complex multi-object compositions. The 7B also performs substantially better on understanding tasks (MMMU 41.0 vs. 36.3). The cost is roughly 3× the VRAM and 3–4× the generation time.
Is the "1B" model actually 1 billion parameters?
No. The Hugging Face model card confirms the base model is DeepSeek-LLM-1.5b-base. The "1B" label refers to the marketing name. Actual parameter count is approximately 1.5 billion. This affects your VRAM planning: budget for ~3 GB of weights in FP16, not ~2 GB.
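The arithmetic behind that budget is parameter count times bytes per parameter. A minimal sketch, assuming decimal gigabytes (1 GB = 1e9 bytes) and counting weights only; activations and other runtime buffers need extra headroom on top. The names BYTES_PER_PARAM and weight_size_gb are hypothetical.

```python
# Storage cost of one parameter for common precisions, in bytes.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}


def weight_size_gb(num_params: float, dtype: str = "fp16") -> float:
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


print(weight_size_gb(1.5e9))  # 3.0  (actual ~1.5B parameters in FP16)
print(weight_size_gb(1.0e9))  # 2.0  (what the "1B" name would suggest)
```

The one-gigabyte gap between the two results is why the parameter count matters on 4 GB cards.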
Can I run this on an AMD GPU?
ROCm support for PyTorch on Linux works for many users. On Windows, AMD GPU PyTorch support is experimental as of April 2026. For a Windows AMD setup, check the ComfyUI GitHub issues for current ROCm/Windows status before investing time.
Does DeepSeek plan to release a Janus Pro 2 or Janus Pro successor?
DeepSeek has not announced a Janus Pro successor as of April 2026. The DeepSeek model roadmap has prioritised language models (V3, V4, Flash variants). The Janus GitHub repository (github.com/deepseek-ai/Janus) receives maintenance but no major new releases since January 2025. Monitor the repository and the DeepSeek Hugging Face organisation for updates.
How long does image generation take on an RTX 3060?
Based on community reports, the Janus-Pro-1B model at 384×384 with 30 steps takes approximately 15–40 seconds on an RTX 3060 8 GB. The 7B model at the same resolution takes 60–120 seconds. Exact timing depends on xformers, VRAM usage, and system RAM bandwidth.
What is the native output resolution?
Janus Pro's image generation tokenizer uses a 16× downsample ratio from the LlamaGen VQ tokenizer, which corresponds to native generation at 384×384 pixels. You can request other resolutions, but quality may degrade outside this native size. Use an ESRGAN upscaler post-generation to get 1024×1024 or larger outputs.
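The relationship between resolution and downsample ratio is simple arithmetic: at 16× downsampling, a 384×384 image maps to a 24×24 grid, so the model has to emit 576 image tokens per picture. The sketch below works this out and also shows the output side length after upscaling; the function names are hypothetical.

```python
def image_token_grid(image_size: int = 384, downsample: int = 16):
    """Return (grid side, total tokens) for a square image at a given downsample ratio."""
    side = image_size // downsample
    return side, side * side


def upscaled_side(image_size: int = 384, scale: int = 4) -> int:
    """Side length after applying an integer-factor upscaler such as a 4x ESRGAN model."""
    return image_size * scale


print(image_token_grid())  # (24, 576)
print(upscaled_side())     # 1536
```

Because the token count grows with the square of the side length, requesting a larger image directly means generating many more tokens, which is one reason upscaling afterwards is the cheaper route.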
Does this setup support image-to-image or inpainting?
Not natively through the ComfyUI-Janus-Pro nodes. The Janus Pro architecture supports image understanding (describe an image) and text-to-image generation. True image-to-image conditioning would require custom code. If you need image-to-image workflows, FLUX or SDXL remain better-supported in ComfyUI for that use case.
References and Further Reading
- Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling (arXiv 2501.17811)
- deepseek-ai/Janus-Pro-1B — Hugging Face model card
- deepseek-ai/Janus — Official GitHub repository
- CY-CHENYUE/ComfyUI-Janus-Pro — Custom node source
- Comfy-Org/ComfyUI — Release history
- ComfyUI Official Documentation — System Requirements
- ComfyUI Desktop — Windows Installer Download
- PyTorch — Get Started (official CUDA build selector)