Last updated April 2026 — refreshed for current model/tool versions.
DeepSeek Janus Pro 7B is a unified multimodal model that handles both image understanding and text-to-image generation in a single framework — an architectural approach that places it in direct competition with DALL-E 3 and Stable Diffusion 3 on standard benchmarks. This guide covers everything you need to install and run it locally on Windows using ComfyUI, updated for the current ComfyUI portable release (Python 3.13, CUDA 13.0) and the CY-CHENYUE Janus-Pro plugin.
What changed since the original January 2025 guide
- Python requirement raised: The original guide cited Python 3.8–3.11. The current ComfyUI portable ships with Python 3.13 (best-supported as of 2025–2026); Python 3.12 is the recommended fallback. Python 3.8 and 3.9 are unsupported in the current ComfyUI portable environment.
- CUDA updated: ComfyUI portable now bundles PyTorch with CUDA 13.0. The correct PyTorch install command is pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu130.
- ComfyUI Manager is now the recommended install path for the Janus-Pro plugin: it is faster and handles dependency resolution automatically.
- Model folder path changed: Place model files under ComfyUI/models/Janus-Pro/Janus-Pro-7B/ (not a root-level models/Janus-Pro-7B/ folder, as some older guides show).
- VRAM reality check: 16 GB VRAM is the practical minimum; 24 GB (RTX 3090/4090) is recommended for the 7B variant. On L4-class cloud hardware the model uses ~22 GB VRAM and generates images in ~15 seconds each.
Janus Pro 7B remains the latest public release of the Janus series as of April 2026. No successor model has been publicly announced. The GitHub repo shows no formal releases beyond the 2025-01-27 Janus-Pro drop.
TL;DR — Key Facts
| Item | Value |
|---|---|
| Model release date | January 27, 2025 |
| Model license | DeepSeek Model License (weights); MIT (code) |
| Parameter count | 7B (also: Janus-Pro-1B for low-VRAM setups) |
| Minimum VRAM | 16 GB (e.g., RTX 4080) |
| Recommended VRAM | 24 GB (RTX 3090, RTX 4090) |
| Python version | 3.13 (best); 3.12 (fallback) |
| PyTorch / CUDA | PyTorch (cu130 build) with CUDA 13.0 |
| GenEval score | 80.0% (vs DALL-E 3: 67%, SD3: 74%) |
| DPG-Bench score | 84.1% (vs DALL-E 3: 79.2%) |
| Disk space needed | ~30 GB for 7B model + ComfyUI |
What Is DeepSeek Janus Pro 7B?
Janus Pro is DeepSeek's unified multimodal large language model, built on a 7B-parameter base (DeepSeek-LLM-7b-base) with a decoupled visual encoding architecture. Unlike traditional models that share one visual encoder for both understanding and generation, Janus Pro routes these tasks through separate pathways:
- Understanding path: SigLIP-L visual encoder (384×384 image input) — extracts high-dimensional semantic features for image captioning and VQA tasks.
- Generation path: LlamaGen VQ tokenizer (16× downsample rate) — converts images to discrete token IDs for autoregressive text-to-image generation.
Both paths feed into a shared autoregressive transformer. The result is a single model that can describe an image in natural language, answer questions about it, and generate new images from text prompts — all without switching tools.
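To make the generation path more concrete, here is a small sketch of the arithmetic implied by a 16× downsample rate. It assumes a square image whose side is a multiple of 16; the function name is illustrative and is not part of the Janus codebase.

```python
def vq_token_count(image_size: int, downsample: int = 16) -> int:
    """Number of discrete tokens for a square image on the generation path."""
    if image_size % downsample != 0:
        raise ValueError("image_size must be a multiple of the downsample rate")
    side = image_size // downsample  # tokens along one edge of the grid
    return side * side


print(vq_token_count(384))  # 24 x 24 grid -> 576
print(vq_token_count(512))  # 32 x 32 grid -> 1024
```

Because the token grid grows with the square of the image side, larger resolutions mean noticeably more autoregressive steps per image.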
If you are thinking about broader local AI deployment (not just image generation), the OpenClaw + Ollama setup guide for running local AI agents covers how to orchestrate multiple local models including multimodal ones through a unified agent layer.
Benchmark Performance (2025 Published Data)
Janus Pro 7B was evaluated on two primary text-to-image generation benchmarks at release. These numbers are from the official arXiv paper (2501.17811) and have not been superseded by independent 2026 re-evaluations at the time of writing.
| Benchmark | Janus-Pro-7B | DALL-E 3 | SD3-Medium | Emu3-Gen |
|---|---|---|---|---|
| GenEval (overall) | 80.0% | 67.0% | 74.0% | — |
| DPG-Bench | 84.1% | 79.2% | — | 71.1% |
GenEval tests how precisely a model follows compositional text prompts (counting objects, positioning, attributes). DPG-Bench evaluates dense, detailed prompt adherence. Janus Pro 7B leads both, though real-world image quality at 512×512 may feel softer than dedicated diffusion models like SDXL or FLUX for purely aesthetic generation. It shines on prompt accuracy and multimodal round-trip tasks (describe → modify → regenerate).
1B vs 7B: Which Variant to Use
| Variant | VRAM | Speed (L4 GPU) | Image Quality | Best For |
|---|---|---|---|---|
| Janus-Pro-1B | ~8 GB | ~6–8 s/image | Good | Low-VRAM GPUs, rapid iteration |
| Janus-Pro-7B | ~16–22 GB | ~15 s/image | Better | High-VRAM GPUs, production quality |
Speed figures are from cloud L4 hardware testing. Consumer RTX 4090 results are similar; RTX 3080 (10 GB) cannot run the 7B variant without quantization.
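The table above can be turned into a quick check. The sketch below maps a VRAM figure to a variant using this guide's thresholds (16 GB for 7B, 8 GB for 1B). The helper names and the 0.5 GB slack are assumptions added for illustration; the slack is there because cards usually report slightly less than their nominal capacity.

```python
from typing import Optional

SLACK_GB = 0.5  # assumption: a "16 GB" card may report a little under 16


def pick_variant(vram_gb: float) -> Optional[str]:
    """Map available VRAM to a Janus-Pro variant using this guide's thresholds."""
    if vram_gb + SLACK_GB >= 16:
        return "Janus-Pro-7B"
    if vram_gb + SLACK_GB >= 8:
        return "Janus-Pro-1B"
    return None  # below both thresholds quoted in this guide


def detect_vram_gb(device: int = 0) -> Optional[float]:
    """Return total VRAM in GiB, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(device).total_memory / 1024**3


vram = detect_vram_gb()
print("Detected VRAM (GiB):", vram)
print("Suggested variant:", pick_variant(vram) if vram is not None else "unknown")
```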
System Requirements (April 2026)
- OS: Windows 10 64-bit or Windows 11 (both fully supported)
- CPU: 8+ cores; 16 cores recommended
- RAM: 32 GB minimum; 64 GB recommended for large batch generation
- GPU: NVIDIA RTX series with 16 GB+ VRAM for Janus-Pro-7B; 8 GB+ for Janus-Pro-1B. RTX 3090, RTX 4080, RTX 4090 are all well-tested. RTX 3080 (10 GB) requires quantization.
- Disk: 100 GB free (ComfyUI + model weights + generated outputs)
- Python: 3.13 (recommended); 3.12 (fallback if custom node issues arise). Do not use Python 3.8 or 3.9 — these are no longer compatible with current ComfyUI.
- CUDA Toolkit: Match your NVIDIA driver. ComfyUI portable ships with CUDA 13.0.
- NVIDIA driver: Update via GeForce Experience or the NVIDIA Driver Portal before installation.
Pre-Installation Checklist
- Update NVIDIA GPU drivers to the latest stable version.
- Confirm at least 100 GB free disk space on your install drive (SSD strongly recommended for model loading speed).
- Temporarily disable antivirus real-time protection during installation to prevent false-positive blocks on downloaded model weights.
- Ensure a stable internet connection — the Janus-Pro-7B weights total roughly 14 GB from Hugging Face.
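The disk-space item in the checklist can be verified with the standard library. This is a minimal sketch; the function names are illustrative, and the 100 GB default simply mirrors the figure used in this guide.

```python
import shutil


def free_gb(path: str = ".") -> float:
    """Free space on the drive that holds `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_disk(path: str = ".", required_gb: float = 100.0) -> bool:
    """True if the drive has at least `required_gb` free."""
    return free_gb(path) >= required_gb


print(f"Free space: {free_gb('.'):.1f} GB")
print("Meets the 100 GB guideline:", has_enough_disk("."))
```

Run it from the drive where you plan to extract ComfyUI, or pass that drive's path explicitly.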
Installation Path A: ComfyUI Portable (Recommended for Most Users)
The portable build is the fastest path. It ships with Python 3.13 and PyTorch CUDA 13.0 pre-installed — no separate Python or CUDA Toolkit installation required.
Step 1: Download the ComfyUI Portable Package
- Go to the ComfyUI releases page on GitHub.
- Download the latest Windows portable archive (look for ComfyUI_windows_portable_nvidia.7z or equivalent).
- Extract with 7-Zip to a folder such as C:\ComfyUI.
Step 2: Launch ComfyUI
Double-click run_nvidia_gpu.bat inside the extracted folder. ComfyUI will start and open in your browser at http://localhost:8188. Confirm it loads correctly before installing the plugin.
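If you prefer to confirm the server from a script rather than the browser, a plain HTTP request is enough. The sketch below assumes the default address quoted above; the function name is illustrative.

```python
import http.client
import urllib.error
import urllib.request


def comfyui_is_up(url: str = "http://localhost:8188", timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False  # refused, timed out, or not an HTTP server


print("ComfyUI reachable:", comfyui_is_up())
```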
Step 3: Install ComfyUI Manager
ComfyUI Manager is not bundled by default but is required for one-click plugin installation.
- Stop ComfyUI (Ctrl+C in the terminal).
- Open a terminal in the ComfyUI/custom_nodes folder.
- Clone ComfyUI Manager:
git clone https://github.com/ltdrdata/ComfyUI-Manager.git
- Restart ComfyUI (run_nvidia_gpu.bat). A "Manager" button will appear in the top-right menu.
Step 4: Install the ComfyUI-Janus-Pro Plugin via Manager
- In the ComfyUI web interface, click Manager → Custom Nodes Manager.
- Search for Janus-Pro.
- Find the entry by author CY-CHENYUE and click Install.
- When installation completes, click Restart, then manually refresh your browser (F5).
Step 5: Download Janus Pro Model Weights
- Visit deepseek-ai/Janus-Pro-7B on Hugging Face (or the community mirror).
- Download all files (config JSONs, tokenizer files, and .bin model weight files).
- Create this folder structure inside your ComfyUI installation:
ComfyUI/
└── models/
    └── Janus-Pro/
        └── Janus-Pro-7B/
            ├── config.json
            ├── pytorch_model.bin (or sharded .bin files)
            ├── tokenizer.json
            ├── tokenizer_config.json
            └── special_tokens_map.json
For Janus-Pro-1B, create ComfyUI/models/Janus-Pro/Janus-Pro-1B/ and place files there instead.
Tip: Use huggingface-cli for faster parallel downloads:
pip install huggingface_hub
huggingface-cli download deepseek-ai/Janus-Pro-7B --local-dir ./Janus-Pro-7B
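The same download can be scripted so the files land directly in the folder ComfyUI expects. This is a sketch, not the plugin's own method: it uses snapshot_download from the huggingface_hub package, and the helper names and the C:/ComfyUI placeholder path are assumptions you should adapt to your install.

```python
from pathlib import Path


def janus_model_dir(comfy_root: str, variant: str = "Janus-Pro-7B") -> Path:
    """Folder this guide expects: <ComfyUI>/models/Janus-Pro/<variant>."""
    return Path(comfy_root) / "models" / "Janus-Pro" / variant


def download_janus(comfy_root: str, variant: str = "Janus-Pro-7B") -> Path:
    """Download the Hugging Face repo straight into the ComfyUI models folder."""
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    target = janus_model_dir(comfy_root, variant)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=f"deepseek-ai/{variant}", local_dir=str(target))
    return target


# "C:/ComfyUI" is a placeholder; point it at your own ComfyUI folder.
print(janus_model_dir("C:/ComfyUI"))
# To actually download (roughly 14 GB for 7B): download_janus("C:/ComfyUI")
```

The download call is left commented out so that running the snippet only prints the target path.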
Installation Path B: Manual Python Environment
Use this path if you need custom Python/CUDA versions or already have an existing Python environment.
Step B1: Install Python and Git
- Download Python 3.13 (or 3.12) from python.org. During installation, check "Add Python to PATH."
- Download and install Git for Windows using default settings.
Step B2: Create a Virtual Environment
python -m venv comfyui-env
comfyui-env\Scripts\activate
Step B3: Install PyTorch with CUDA 13.0
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu130
Verify the install: python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)" should print True 13.0.
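For a slightly fuller check than the one-liner, the sketch below gathers the PyTorch version, CUDA availability, CUDA build version, and GPU name in one place. It degrades gracefully if PyTorch is missing; the function name and dictionary keys are illustrative.

```python
def cuda_report() -> dict:
    """Collect the PyTorch/CUDA facts that matter before launching ComfyUI."""
    report = {"torch": None, "cuda_available": False, "cuda_version": None, "gpu": None}
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed in this environment
    report["torch"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    report["cuda_version"] = torch.version.cuda  # None on CPU-only builds
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

Run it with the same interpreter ComfyUI uses (the activated virtual environment here, or python_embeded\python.exe in the portable build).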
Step B4: Clone and Install ComfyUI
git clone https://github.com/comfy-org/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
Step B5: Install ComfyUI-Janus-Pro Plugin Manually
cd custom_nodes
git clone https://github.com/CY-CHENYUE/ComfyUI-Janus-Pro.git
cd ComfyUI-Janus-Pro
pip install -r requirements.txt
Step B6: Download Model Weights and Launch
Follow the same model download and folder structure from Step 5 above. Then launch ComfyUI from the repo root:
python main.py
ComfyUI will be accessible at http://localhost:8188.
First Run: Building a Janus Pro Workflow in ComfyUI
After installation, ComfyUI exposes Janus Pro as a set of custom nodes. A minimal image-generation workflow uses five connected nodes:
- Janus Model Loader: select Janus-Pro-7B (or Janus-Pro-1B) from the dropdown.
- Text Input: your generation prompt (e.g., "a photorealistic mountain lake at golden hour").
- Janus Image Generator: set num_images to 1 for the first test and resolution to 512 for speed.
- Image Preview: displays the output in the browser.
- Save Image: writes to ComfyUI/output/.
For image understanding (captioning / VQA), replace the generator node with the Janus Image Analyzer node and connect an image loader.
Quick Test Prompt
prompt = "a futuristic cityscape at night, neon reflections on wet streets, photorealistic"
num_images = 1
resolution = 512 # increase to 768 once confirmed working
Click Queue Prompt. First run will be slower as the model loads into VRAM; subsequent generations are faster.
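Queueing can also be done without the browser, since ComfyUI accepts workflows over HTTP at POST /prompt. The sketch below assumes you have exported your workflow in API format from the ComfyUI interface; the file name janus_workflow_api.json and the helper names are hypothetical.

```python
import json
import urllib.request


def build_prompt_request(workflow: dict, server: str = "http://localhost:8188") -> urllib.request.Request:
    """Wrap an API-format workflow in the JSON body that POST /prompt expects."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow_path: str, server: str = "http://localhost:8188") -> dict:
    """Load a workflow exported in API format and queue it on a running ComfyUI."""
    with open(workflow_path, encoding="utf-8") as handle:
        workflow = json.load(handle)
    with urllib.request.urlopen(build_prompt_request(workflow, server), timeout=30) as response:
        return json.loads(response.read())


# Usage (hypothetical file name): queue_workflow("janus_workflow_api.json")
```

This is mainly useful for batch scripting; for a first test, the Queue Prompt button is simpler.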
Decision Tree: Should You Use Janus Pro 7B?
- You want multimodal in one model (understand + generate): Yes, Janus Pro is designed for this.
- You want the highest aesthetic quality for pure image generation: Consider FLUX.1 or SDXL first — dedicated diffusion models still lead on pure visual quality for most styles.
- You have <16 GB VRAM: Use Janus-Pro-1B instead. The 7B will OOM on cards with 10 GB or less.
- You want to run on CPU only: Technically possible but prohibitively slow (roughly 10 to 60 minutes per image, depending on hardware). Not practical.
- You need commercial use: Review the DeepSeek Model License carefully before production deployment — it has usage restrictions that differ from pure MIT/Apache-2.0 models.
Pro Tips for Optimal Performance
- FP16 precision: The model defaults to FP16 via ComfyUI; do not override to FP32 unless debugging. FP16 cuts VRAM usage roughly in half compared to FP32.
- VRAM management: Close VRAM-hungry background applications (browsers with many tabs, games, Discord with hardware acceleration) before starting ComfyUI.
- Multi-GPU: On multi-GPU systems, pin ComfyUI to one GPU by adding set CUDA_VISIBLE_DEVICES=0 to the batch file before the launch command.
- Prompt tuning: Janus Pro responds well to structured, specific prompts. Include lighting conditions, camera angle, and style references rather than abstract adjectives.
- Batch processing: Set num_images to 4–8 for batch generation; images are generated sequentially, but the model stays loaded in VRAM between them.
- Keep up to date: Run git pull in both ComfyUI/ and ComfyUI/custom_nodes/ComfyUI-Janus-Pro/ regularly to pick up bug fixes.
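The FP16 tip comes down to simple arithmetic: bytes per parameter times parameter count. The sketch below uses a nominal 7 billion parameters, which is an approximation, and counts weights only.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


PARAMS_7B = 7e9  # nominal parameter count; the exact figure differs slightly

print("FP32:", weight_memory_gb(PARAMS_7B, 4), "GB")  # 28.0 GB
print("FP16:", weight_memory_gb(PARAMS_7B, 2), "GB")  # 14.0 GB
```

Activations, caches, and the CUDA context come on top of the weights, which is consistent with the 16–22 GB observed in practice for the 7B variant.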
Troubleshooting Common Issues
Issue: CUDA Out of Memory (OOM)
Symptoms: RuntimeError: CUDA out of memory during model load or generation.
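Fix: If you load the model from your own Python script, cast it to FP16 and clear cached VRAM (model stands for your already-loaded Janus model object):
import torch
model.half()  # cast weights to FP16
torch.cuda.empty_cache()  # return cached, unused VRAM to the driver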
Alternatively, switch to Janus-Pro-1B, or reduce resolution to 384 or 512.
Issue: Missing Dependencies / Import Errors
Symptoms: Node fails to load; ComfyUI startup shows a red error in the terminal for the Janus-Pro node.
Fix: Force-reinstall the plugin requirements:
pip install --force-reinstall -r custom_nodes/ComfyUI-Janus-Pro/requirements.txt
For the portable installation, use:
python_embeded\python.exe -m pip install --force-reinstall -r custom_nodes\ComfyUI-Janus-Pro\requirements.txt
Issue: Slow Generation
Symptoms: Images take significantly longer than the expected ~15 seconds on NVIDIA hardware.
Fix: Confirm GPU acceleration is active: open Windows Settings → Display → Graphics Settings and ensure your Python executable or ComfyUI is set to "High performance." Also check that PyTorch sees your GPU: python -c "import torch; print(torch.cuda.get_device_name(0))".
Issue: Hugging Face Download Fails
Symptoms: requests.exceptions.ConnectionError or incomplete downloads.
Fix: Authenticate with the Hugging Face CLI (required for some gated models) and use the CLI downloader:
huggingface-cli login
huggingface-cli download deepseek-ai/Janus-Pro-7B --local-dir ./Janus-Pro-7B --resume-download
Issue: Model Not Appearing in ComfyUI Dropdown
Symptoms: Janus Model Loader node shows no models.
Fix: Verify the exact folder path. It must be ComfyUI/models/Janus-Pro/Janus-Pro-7B/ with all required files present. Restart ComfyUI after confirming the structure.
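A short script can confirm the folder contents before you restart. The file list below is taken from the folder structure shown in Step 5; treating any .bin file as the weights is an assumption, and the function name is illustrative.

```python
from pathlib import Path

REQUIRED_FILES = ("config.json", "tokenizer.json", "tokenizer_config.json", "special_tokens_map.json")


def missing_model_files(model_dir: str) -> list:
    """List what is absent from a Janus-Pro model folder (empty list = looks complete)."""
    folder = Path(model_dir)
    if not folder.is_dir():
        return [f"folder not found: {folder}"]
    missing = [name for name in REQUIRED_FILES if not (folder / name).is_file()]
    if not any(folder.glob("*.bin")):  # single or sharded weight files
        missing.append("*.bin weight file(s)")
    return missing


print(missing_model_files("ComfyUI/models/Janus-Pro/Janus-Pro-7B"))
```

An empty list means the expected files are present; it does not verify that the downloads are complete or uncorrupted.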
Issue: Python 3.8 or 3.9 No Longer Works
Symptoms: Dependency install fails with syntax errors or unsupported version warnings.
Fix: Upgrade to Python 3.13 or 3.12. The ComfyUI portable ships with Python 3.13 and is the simplest path. Python 3.8/3.9 are not supported in current ComfyUI releases.
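To see which interpreter you are actually running, a minimal check against the versions named in this guide might look like the following. The labels and function name are illustrative.

```python
import sys

SUPPORTED = {(3, 13): "recommended", (3, 12): "fallback"}


def python_status(version=None) -> str:
    """Classify a Python (major, minor) pair against the versions this guide names."""
    major, minor = (version or sys.version_info)[:2]
    return SUPPORTED.get((major, minor), "not covered by this guide")


print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {python_status()}")
```

In the portable build, run it with python_embeded\python.exe so you are checking the bundled interpreter rather than a system-wide one.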
2026 Context: Alternatives and Ecosystem
Janus Pro 7B remains the latest public Janus release as of April 2026 — no successor model has been announced on the official GitHub. However, the multimodal generation landscape has continued evolving:
- FLUX.1 (Black Forest Labs): Better raw image quality than Janus Pro for purely aesthetic text-to-image generation, though it is image-generation only (no multimodal understanding).
- Llama 4 Maverick (Meta, 2025): 17B active parameters with multimodal input support; outperforms GPT-4o and Gemini 2.0 Flash on several benchmarks but is larger and requires more VRAM.
- Qwen-Image 2.0 (Alibaba): Strong alternative with competitive multimodal understanding, particularly for Chinese-language prompts.
- HunyuanImage 3.0 (Tencent): Leads on certain evaluation metrics among Chinese multimodal models.
- JanusFlow-1.3B: A lightweight Janus variant using rectified flow for generation; useful for hardware with under 8 GB VRAM.
For teams evaluating which local AI stack to standardize on, the companion Windows guide for running Janus Pro without ComfyUI covers the bare Python API approach, which is useful for batch scripting and headless server deployments.
FAQ
Can I run Janus Pro 7B on an RTX 3080 (10 GB VRAM)?
Not reliably at FP16 without quantization. The 7B model requires approximately 14–22 GB of VRAM depending on batch size and resolution. An RTX 3080 with 10 GB will OOM during model load. Use Janus-Pro-1B instead, which runs comfortably on 8 GB VRAM.
Does Janus Pro 7B work without a GPU (CPU-only)?
It can run on CPU, but generation times are prohibitively slow — expect 10 to 60 minutes per image depending on hardware. GPU is effectively required for practical use.
What is the difference between Janus, JanusFlow, and Janus Pro?
Janus (1.3B) was the original unified multimodal model. JanusFlow (1.3B) is a variant using rectified flow for generation — lighter but less flexible. Janus Pro (1B/7B) is the production version with improved training data, optimized training strategy, and significantly better benchmark scores. Janus Pro 7B is the current recommended variant for quality work.
Is Janus Pro 7B free to use commercially?
The code is MIT-licensed, but the model weights are under the DeepSeek Model License, which imposes usage restrictions. Review the license at the Hugging Face model page before commercial deployment.
Can Janus Pro 7B both understand and generate images?
Yes — this is its defining feature. You can feed an image and ask it to describe or analyze the content (multimodal understanding), and you can provide a text prompt and generate a new image (text-to-image generation). Both capabilities are available as separate ComfyUI nodes from the CY-CHENYUE plugin.
Which ComfyUI plugin should I use for Janus Pro?
The primary actively maintained plugin is CY-CHENYUE/ComfyUI-Janus-Pro. There are also alternatives: ComfyUI_Janus_Wrapper by chflame163 and ComfyUI-Janus_pro_vision by ShmuelRonen (vision-language focused). Install via ComfyUI Manager to avoid dependency conflicts.
My ComfyUI shows "CUDA not available" — what do I do?
First verify your NVIDIA driver is up to date. Then confirm PyTorch sees CUDA: python -c "import torch; print(torch.cuda.is_available())". If this returns False, reinstall PyTorch with the correct CUDA flag for your driver version. For the portable build, run update_comfyui.bat which will pull the correct PyTorch variant automatically.
How do I update Janus Pro or ComfyUI?
Run git pull in both the ComfyUI root directory and custom_nodes/ComfyUI-Janus-Pro/. For the portable version, use the included update_comfyui.bat script. Re-run pip install -r requirements.txt after updates in case new dependencies were added.
References & Further Reading
- deepseek-ai/Janus-Pro-7B — Hugging Face Model Card
- deepseek-ai/Janus — Official GitHub Repository
- Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling (arXiv 2501.17811)
- CY-CHENYUE/ComfyUI-Janus-Pro — Plugin GitHub Repository
- ComfyUI Official System Requirements Documentation
- DeepSeek Janus Pro ComfyUI Workflow — ComfyUI Wiki
- deepseek-community/Janus-Pro-7B — Community Mirror on Hugging Face
- ComfyUI Releases — GitHub (portable Windows builds)