Run DeepSeek Janus-Pro 1B on Linux/Ubuntu with ComfyUI (2026 Guide)

Last updated April 2026 — refreshed for current model/tool versions.

DeepSeek Janus-Pro 1B is a unified multimodal model that handles both image-to-text understanding and text-to-image generation in a single 1.5B-parameter transformer. This guide covers the complete, tested installation on Ubuntu/Debian Linux using ComfyUI, with updated CUDA and PyTorch versions for 2026 hardware.

The 1B variant is the practical choice for developers working with consumer GPUs: it runs in 6–8 GB of VRAM, whereas the 7B model needs 16 GB or more. If you are setting up a broader local AI agent stack, see our OpenClaw + Ollama setup guide, which covers orchestrating multiple models, including Janus alongside text-only LLMs.

What changed in 2026 (read this if you followed an older guide)

  • Python: Use Python 3.12 or 3.13. Python 3.10 still works, but 3.13 is now the ComfyUI default; Python 3.8 support is effectively deprecated across the ecosystem.
  • PyTorch + CUDA: The current stable ComfyUI stack is PyTorch 2.9.1 with CUDA 13.0 (cu130). The cu118 and cu121 wheels cited in 2025 guides still install but are not the recommended path for new setups. Use cu124 if you are locked to CUDA 12.x drivers, or cu130 for CUDA 13.x.
  • ComfyUI frontend: The project has migrated to the Comfy-Org/ComfyUI repository and moved to semantic versioning (frontend v1.42+ at the time of writing).
  • Janus model status: Janus-Pro-1B and Janus-Pro-7B remain the latest released Janus variants; DeepSeek has not released a Janus 2 as of April 2026. The model cards at deepseek-ai/Janus-Pro-1B on Hugging Face are current.
  • Image resolution ceiling: Janus-Pro generates images at 384×384 px natively. This is an architectural constraint tied to the LlamaGen tokenizer and cannot be changed through sampler settings. Plan your workflow accordingly.
  • VRAM reality check: 4–6 GB VRAM is marginal for the 1B model in bfloat16; 8 GB is comfortable. The original post's "8 GB recommended" figure holds, though 6 GB GPUs (such as the RTX 3060 6 GB) can work with reduced batch sizes.

TL;DR — Quick Reference

| Item | Recommended value (April 2026) |
|---|---|
| Python | 3.12 or 3.13 |
| PyTorch | 2.9.1+cu130 (NVIDIA) or 2.9.1+cpu (CPU-only) |
| CUDA Toolkit | 13.0 (cu130); CUDA 12.4 (cu124) also works |
| ComfyUI | Latest from Comfy-Org/ComfyUI main branch |
| ComfyUI plugin | CY-CHENYUE/ComfyUI-Janus-Pro (via ComfyUI Manager) |
| Model | deepseek-ai/Janus-Pro-1B (Hugging Face) |
| Minimum VRAM | 6 GB (marginal); 8 GB recommended |
| Output resolution | 384×384 px (fixed by architecture) |
| License | MIT (code) + DeepSeek Model License (weights) |

What Is Janus-Pro 1B?

Janus-Pro is DeepSeek's unified multimodal framework, released January 27, 2025. It decouples visual encoding into two separate pathways — one for understanding (SigLIP-L encoder, 384×384 input) and one for generation (LlamaGen tokenizer with 16× downsample rate) — while sharing a single transformer backbone. This lets the same model accept an image and describe it, or accept a text prompt and synthesize an image, without switching between separate specialized models.
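To make the generation pathway's numbers concrete, the following is a small arithmetic sketch of what a 16× downsample rate means for a 384×384 image. The function and constant names are illustrative and are not taken from the Janus codebase.

```python
# Arithmetic sketch: how many discrete tokens a 384x384 image maps to
# when the tokenizer downsamples each side by 16x.
IMAGE_SIZE = 384        # pixels per side, from the model description
DOWNSAMPLE_RATE = 16    # tokenizer downsample factor


def image_token_grid(image_size: int, downsample: int) -> tuple[int, int]:
    """Return (tokens_per_side, total_tokens) for a square image."""
    if image_size % downsample != 0:
        raise ValueError("image size must be divisible by the downsample rate")
    per_side = image_size // downsample
    return per_side, per_side * per_side


per_side, total = image_token_grid(IMAGE_SIZE, DOWNSAMPLE_RATE)
print(f"{per_side} x {per_side} grid = {total} tokens per image")
```

Under this arithmetic each image corresponds to a 24×24 grid, or 576 tokens, which is one way to see why the output resolution is tied to the tokenizer rather than to sampler settings.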

The 1B variant is built on DeepSeek-LLM-1.5b-base (1.5B parameters, rounded down to "1B" in the model name). The 7B variant is built on DeepSeek-LLM-7b-base. Unless you need the benchmark-level image quality of the 7B model, the 1B is the practical local deployment target.
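As a rough sanity check on memory needs, the sketch below estimates weight storage for the language backbone alone. It assumes 2 bytes per parameter (bfloat16) and deliberately ignores activations, the vision encoder, and the image tokenizer, so real usage will be higher.

```python
# Back-of-the-envelope estimate of weight memory for the language backbone.
# Assumption: 2 bytes per parameter (bfloat16). Activations, the vision
# encoder, and the image tokenizer are NOT included.
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Return the weight footprint in GiB."""
    return num_params * bytes_per_param / 1024**3


for name, params in [("1.5B backbone", 1.5e9), ("7B backbone", 7e9)]:
    print(f"{name}: ~{weight_memory_gib(params):.1f} GiB in bfloat16")
```

The 1.5B backbone comes out just under 3 GiB, which lines up with the roughly 3 GB download size quoted in this guide and leaves headroom on an 8 GB card.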

Benchmark Performance

Published benchmark scores from the Janus-Pro technical report (arXiv 2501.17811, January 2025):

| Model | GenEval (%) | DPG-Bench (%) | MMMU (%) |
|---|---|---|---|
| Janus-Pro-7B | 80.0 | 84.2 | ~41 |
| DALL-E 3 | 67.0 | ~83 | n/a |
| SD3-Medium | 74.0 | ~82 | n/a |
| Janus-Pro-1B | ~72 (est.) | ~79 (est.) | ~36 (est.) |

Note: Janus-Pro-1B scores are extrapolated from DeepSeek's scaling curves; the official paper reports 7B numbers. For verified 1B scores, check the Hugging Face model card. The 384×384 output resolution is a hard limit that affects real-world usability regardless of GenEval scores.

Practical Limitations to Know Before You Install

  • 384×384 output only. The image tokenizer is fixed-resolution. Upscaling nodes can post-process, but native output is 384×384.
  • No Ollama support. Janus-Pro is not a GGUF-format model and cannot be loaded via Ollama. ComfyUI or direct Python inference are the supported paths.
  • Package conflicts are common. The Janus library installs its own version of transformers from GitHub which can conflict with other ComfyUI nodes pinning an older transformers release. Isolate in a virtual environment.
  • CPU inference is very slow. Without a CUDA-capable GPU expect 5–20 minutes per image on a modern CPU.

System Requirements

  • OS: Ubuntu 22.04 LTS or 24.04 LTS (also works on Debian 12). Other Linux distributions work if CUDA drivers are installed.
  • Python: 3.12 or 3.13 recommended. Python 3.10–3.11 remain functional but are no longer the official ComfyUI target.
  • GPU: NVIDIA GPU with CUDA support. 8 GB VRAM for comfortable 1B operation; 6 GB VRAM is possible with reduced batch sizes. RTX 3060 12 GB, RTX 3080, RTX 4070, and above all work well.
  • CUDA Toolkit: 12.4 (cu124) or 13.0 (cu130). Check your installed version with nvcc --version.
  • RAM: 16 GB system RAM minimum; 32 GB recommended when loading the model and running ComfyUI simultaneously.
  • Disk: Janus-Pro-1B model files total approximately 3 GB.
  • Dependencies: git, python3-pip, python3-venv.
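Several of these requirements can be checked before installing anything. The following is a minimal pre-flight sketch using only the standard library. The 10 GB free-disk threshold is an assumption chosen to cover the model plus PyTorch and ComfyUI, not a figure from the project.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)     # oldest version this guide describes as functional
MIN_FREE_DISK_GB = 10    # assumption: model (~3 GB) plus PyTorch and ComfyUI


def python_ok(version=sys.version_info) -> bool:
    """True if the interpreter is at least MIN_PYTHON."""
    return tuple(version[:2]) >= MIN_PYTHON


def free_disk_gb(path: str = ".") -> float:
    """Free space on the filesystem containing `path`, in GB."""
    return shutil.disk_usage(path).free / 1e9


def missing_tools(tools=("git",)) -> list[str]:
    """Return the tools from `tools` that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print(f"Free disk: {free_disk_gb():.1f} GB (want >= {MIN_FREE_DISK_GB})")
    print("Missing tools:", missing_tools() or "none")
```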

Installation

1. Update and Install System Dependencies

sudo apt update && sudo apt upgrade -y
sudo apt install git python3 python3-pip python3-venv -y

2. Create a Virtual Environment

Using a virtual environment prevents dependency conflicts between ComfyUI, the Janus plugin, and any system Python packages. This step is not optional if you want a stable setup.

python3 -m venv ~/comfyui-env
source ~/comfyui-env/bin/activate

Add this activation line to your ~/.bashrc or run it each session before using ComfyUI.
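If you are unsure whether the environment is active in the current shell, a short check like the one below can confirm it. It relies on the standard venv behavior where sys.prefix differs from sys.base_prefix inside a virtual environment.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter is running inside a venv."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

When the environment from this step is active, the interpreter path should point inside ~/comfyui-env.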

3. Install PyTorch with CUDA

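Choose the command matching the CUDA version your NVIDIA driver supports. The PyTorch wheels bundle their own CUDA runtime, so the driver is what matters most: nvidia-smi shows the highest CUDA version the driver supports, and nvcc --version reports the toolkit version if one is installed.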

CUDA 13.0 (recommended for 2026 setups):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130

CUDA 12.4 (if you are on CUDA 12.x drivers):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

CPU-only (no GPU, slow):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

Verify the installation:

python3 -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"

Expected output: version string (e.g., 2.9.1+cu130) and True if your GPU is detected.
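For a slightly fuller picture than the one-liner gives, the sketch below also reports the GPU name, total VRAM, and bfloat16 support. The tier thresholds are assumptions derived from this guide's VRAM figures, set slightly below the nominal 6 GB and 8 GB because cards usually report a little less than their advertised size.

```python
def classify_vram(total_gib: float) -> str:
    """Map reported VRAM to this guide's rough tiers for the 1B model."""
    # Thresholds are assumptions: just under the nominal 8 GB and 6 GB sizes.
    if total_gib >= 7.5:
        return "comfortable"
    if total_gib >= 5.5:
        return "marginal"
    return "below the guide's minimum"


def report() -> None:
    """Print a summary of the PyTorch install and the first CUDA device."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return
    print("torch", torch.__version__, "| CUDA build:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("No CUDA device visible; inference would fall back to CPU")
        return
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024**3
    print(f"{props.name}: {total_gib:.1f} GiB -> {classify_vram(total_gib)}")
    print("bfloat16 supported:", torch.cuda.is_bf16_supported())


if __name__ == "__main__":
    report()
```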

4. Clone ComfyUI

git clone https://github.com/Comfy-Org/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

Note: The canonical repository is now at Comfy-Org/ComfyUI (not the legacy comfyanonymous/ComfyUI). Both URLs currently resolve but prefer the org-owned repo for new clones.

5. Install the ComfyUI-Janus-Pro Plugin

Option A: ComfyUI Manager (Recommended)

If ComfyUI Manager is not already present, install it first:

cd ComfyUI/custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager.git

Then install the plugin from the ComfyUI interface:

  1. Launch ComfyUI (python3 main.py) and open http://localhost:8188.
  2. Open the Manager menu, go to Custom Node Manager, search for Janus-Pro (author: CY-CHENYUE), and click Install.
  3. Restart ComfyUI after installation completes.

Option B: Manual Installation

cd ComfyUI/custom_nodes
git clone https://github.com/CY-CHENYUE/ComfyUI-Janus-Pro.git
cd ComfyUI-Janus-Pro
pip install -r requirements.txt

The requirements.txt pulls the Janus library directly from GitHub (git+https://github.com/deepseek-ai/Janus.git), which installs a pinned version of the transformers library. If you see import errors about transformers after this, run pip install --upgrade transformers to reconcile versions.
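Before and after reconciling versions, it helps to see what is actually installed in the active environment. The sketch below uses the standard library's importlib.metadata. The distribution name "janus" is an assumption about how the Janus library registers itself, so adjust it if pip list shows a different name.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "janus" is an assumed distribution name; check `pip list` if it differs.
    for pkg in ("torch", "transformers", "janus"):
        print(f"{pkg:>14}: {installed_version(pkg) or 'not installed'}")
```

Recording these versions before running an upgrade makes it easier to roll back if another custom node breaks.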

Model Setup

1. Download Janus-Pro-1B from Hugging Face

The model is available at deepseek-ai/Janus-Pro-1B on Hugging Face. Total download size is approximately 3 GB.

Using the Hugging Face CLI (recommended for large downloads with resume support):

pip install huggingface_hub
huggingface-cli download deepseek-ai/Janus-Pro-1B --local-dir ~/Janus-Pro-1B

Or clone with git-lfs (requires git-lfs installed: sudo apt install git-lfs && git lfs install):

git clone https://huggingface.co/deepseek-ai/Janus-Pro-1B ~/Janus-Pro-1B

2. Place Model Files in ComfyUI's Model Directory

Create the expected directory structure inside your ComfyUI installation:

mkdir -p ComfyUI/models/Janus-Pro/Janus-Pro-1B
cp -r ~/Janus-Pro-1B/* ComfyUI/models/Janus-Pro/Janus-Pro-1B/

The final structure should look like:

ComfyUI/models/Janus-Pro/
└── Janus-Pro-1B/
    ├── config.json
    ├── generation_config.json
    ├── model.safetensors
    ├── special_tokens_map.json
    ├── tokenizer.json
    ├── tokenizer_config.json
    └── ... (other config files)
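A missing or misplaced file is a common reason the model does not show up in the loader, so a quick layout check can save a restart cycle. The sketch below uses the file names from the listing above. It accepts either a .safetensors or a .bin weights file, on the assumption that the exact weight file name may differ between downloads.

```python
from pathlib import Path

# File names taken from the directory listing in this guide. The weights check
# accepts .safetensors or .bin because the exact file name may differ.
REQUIRED_FILES = ("config.json", "tokenizer.json", "tokenizer_config.json")
WEIGHT_PATTERNS = ("*.safetensors", "*.bin")


def check_model_dir(model_dir) -> list[str]:
    """Return a list of problems; an empty list means the layout looks right."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"directory not found: {root}"]
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]
    if not any(any(root.glob(pattern)) for pattern in WEIGHT_PATTERNS):
        problems.append("no weight file (*.safetensors or *.bin) found")
    return problems


if __name__ == "__main__":
    issues = check_model_dir("ComfyUI/models/Janus-Pro/Janus-Pro-1B")
    print("\n".join(issues) if issues else "model directory looks complete")
```

Run it from the directory that contains ComfyUI, or pass an absolute path to check_model_dir.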

Running ComfyUI

Activate the virtual environment, change into the ComfyUI directory, and start the server:

source ~/comfyui-env/bin/activate
cd ComfyUI
python3 main.py

Open http://localhost:8188 in your browser. To make ComfyUI accessible on your local network (e.g., from another machine), add --listen 0.0.0.0:

python3 main.py --listen 0.0.0.0
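To confirm the server is up without opening a browser, you can probe it from a script. The sketch below assumes ComfyUI's /system_stats endpoint and a "devices" list in its JSON response. Treat both the endpoint and the field names as assumptions to verify against your ComfyUI version.

```python
import json
import urllib.error
import urllib.request


def comfy_system_stats(base_url: str = "http://127.0.0.1:8188", timeout: float = 3.0):
    """Return the parsed /system_stats response, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/system_stats", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body.
        return None


if __name__ == "__main__":
    stats = comfy_system_stats()
    if stats is None:
        print("ComfyUI is not reachable on port 8188")
    else:
        # Field names below are assumed; .get() keeps this safe if they differ.
        for dev in stats.get("devices", []):
            print(dev.get("name"), "| VRAM total (bytes):", dev.get("vram_total"))
```

When testing from another machine, replace 127.0.0.1 with the host's LAN address after launching with --listen 0.0.0.0.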

Building the Janus-Pro Workflow in ComfyUI

The ComfyUI-Janus-Pro plugin adds three core nodes:

  • JanusModelLoader — loads the model from ComfyUI/models/Janus-Pro/
  • JanusImageUnderstanding — image-to-text: accepts an image and returns a description
  • JanusImageGeneration — text-to-image: accepts a prompt and returns a 384×384 image

Text-to-Image Workflow

  1. Add a JanusModelLoader node. In the model selector, choose Janus-Pro-1B.
  2. Add a JanusImageGeneration node. Connect the model output from JanusModelLoader.
  3. Set your prompt in the text field (English, Chinese, and Japanese are supported).
  4. Adjust temperature (default 1.0; lower = more deterministic), top_p, and cfg_weight (classifier-free guidance, default 5.0) for quality/diversity trade-offs.
  5. Connect to a Save Image or Preview Image node and queue the prompt.

Image Understanding (Image-to-Text) Workflow

  1. Add a JanusModelLoader node and select Janus-Pro-1B.
  2. Add a Load Image node and connect it to a JanusImageUnderstanding node.
  3. Connect the model from JanusModelLoader to JanusImageUnderstanding.
  4. Optionally install ComfyUI-Custom-Scripts (via Manager) to get a text display node for reading the model's description output.
  5. Queue the prompt. Output is a text string with the model's image description.

Example Generation Settings for Janus-Pro-1B

{
  "model": "Janus-Pro-1B",
  "prompt": "a futuristic cityscape at dusk, neon lights reflecting on wet streets, cyberpunk style, highly detailed",
  "temperature": 1.0,
  "top_p": 0.95,
  "cfg_weight": 5.0,
  "num_images": 4
}

Generating 4 images at these settings takes approximately 30–90 seconds on an RTX 3080 (10 GB VRAM). On CPU, expect 15–30 minutes.
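If you keep settings like these in a file or script, a small validator can catch typos before a long generation run. The sketch below is illustrative only: the key names mirror the example above, while the accepted ranges are assumptions based on how sampling parameters conventionally behave, not limits documented by the plugin.

```python
def validate_settings(s: dict) -> list[str]:
    """Return human-readable problems; an empty list means the values pass."""
    problems = []
    if not str(s.get("prompt", "")).strip():
        problems.append("prompt is empty")
    if not s.get("temperature", 1.0) > 0:
        problems.append("temperature must be > 0")
    if not 0 < s.get("top_p", 0.95) <= 1:
        problems.append("top_p must be in (0, 1]")
    if not s.get("cfg_weight", 5.0) > 0:
        problems.append("cfg_weight must be > 0")
    n = s.get("num_images", 1)
    if not isinstance(n, int) or n < 1:
        problems.append("num_images must be a positive integer")
    return problems


settings = {
    "model": "Janus-Pro-1B",
    "prompt": "a futuristic cityscape at dusk, neon lights reflecting on wet streets",
    "temperature": 1.0,
    "top_p": 0.95,
    "cfg_weight": 5.0,
    "num_images": 4,
}
print(validate_settings(settings) or "settings look sane")
```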

How to Choose: 1B vs 7B

| Scenario | Recommended |
|---|---|
| Consumer GPU, 6–8 GB VRAM | 1B |
| GPU with 10–12 GB VRAM | 1B (comfortable) or 7B (quantized) |
| GPU with 16+ GB VRAM | 7B (native bfloat16) |
| Rapid prototyping or experimentation | 1B (faster iteration) |
| Production image quality needed | 7B, or consider Stable Diffusion 3.5 / FLUX |
| Image understanding only (no generation) | 1B is sufficient; also consider LLaVA |
| CPU-only machine | 1B (still slow, but possible) |

If your primary use case is high-quality text-to-image generation, compare Janus-Pro against purpose-built image generation models. Janus-Pro's strength is the unified multimodal interface, not raw image quality; FLUX.1 or Stable Diffusion 3.5 Medium will outperform it on visual fidelity at the same VRAM budget.
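The VRAM rows of the comparison can be expressed as a simple lookup. The sketch below is one reading of the guidance in this section. Values that fall between the listed ranges (for example 9 GB or 14 GB) are assigned to the lower tier, which is an assumption rather than a stated rule.

```python
def recommend_variant(vram_gb: float) -> str:
    """Suggest a Janus-Pro variant from nominal GPU memory in GB."""
    # In-between sizes fall back to the lower tier (an assumption).
    if vram_gb >= 16:
        return "7B (native bfloat16)"
    if vram_gb >= 10:
        return "1B (comfortable) or 7B (quantized)"
    if vram_gb >= 6:
        return "1B"
    return "1B with --lowvram, or CPU-only (slow)"


for size in (4, 8, 12, 24):
    print(f"{size:>2} GB -> {recommend_variant(size)}")
```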

Troubleshooting

| Issue | Likely Cause | Solution |
|---|---|---|
| Model not appearing in JanusModelLoader dropdown | Files not in correct path | Confirm path is ComfyUI/models/Janus-Pro/Janus-Pro-1B/ |
| torch.cuda.is_available() returns False | Wrong PyTorch build or driver mismatch | Reinstall PyTorch with correct cu130/cu124 wheel; run nvidia-smi to confirm driver |
| ImportError: cannot import from transformers.models.auto | Transformers version conflict | Run pip install --upgrade transformers inside the venv |
| CUDA out of memory | VRAM insufficient for batch size | Set num_images to 1; add --lowvram flag to ComfyUI launch command |
| INTEGER_DIVIDE_BY_ZERO error during generation | Known bug in older plugin versions | Update ComfyUI-Janus-Pro via Manager; check GitHub issues #31 |
| JanusModelLoader fails to import Janus | Janus library not installed | Run pip install -r ComfyUI/custom_nodes/ComfyUI-Janus-Pro/requirements.txt |
| Slow generation on NVIDIA GPU | Model running on CPU fallback | Verify CUDA is available; check nvidia-smi during generation for GPU utilization |

Monitoring VRAM During Generation

# In a separate terminal while ComfyUI is running:
watch -n 1 nvidia-smi

For the 1B model in bfloat16, expect 4–6 GB VRAM utilization during inference. If utilization is near zero and generation is slow, the model is running on CPU — check your PyTorch install.
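The same reading can be taken from a script, which is handy for logging memory use across a batch. The sketch below calls nvidia-smi with its CSV query flags and parses the result. It returns an empty list when nvidia-smi is missing, so it is safe to run on CPU-only machines.

```python
import subprocess

# Ask nvidia-smi for used and total memory (MiB) as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line: str) -> tuple[int, int]:
    """Parse one 'used, total' CSV line into integers (MiB)."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def gpu_memory() -> list[tuple[int, int]]:
    """Return [(used_mib, total_mib), ...] per GPU, or [] if unavailable."""
    try:
        out = subprocess.run(
            QUERY, capture_output=True, text=True, check=True, timeout=10
        ).stdout
    except (FileNotFoundError, subprocess.SubprocessError):
        return []
    readings = []
    for line in out.splitlines():
        if line.strip():
            try:
                readings.append(parse_memory_line(line))
            except ValueError:
                continue  # skip rows the driver reports as not available
    return readings


if __name__ == "__main__":
    readings = gpu_memory()
    if not readings:
        print("nvidia-smi not available or no GPUs reported")
    for index, (used, total) in enumerate(readings):
        print(f"GPU {index}: {used} / {total} MiB ({used / total:.0%})")
```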

Running on Lower VRAM (4–6 GB GPUs)

If you have an RTX 3060 6 GB or a similar 4–6 GB card:

  • Launch ComfyUI with --lowvram: python3 main.py --lowvram
  • Set num_images to 1 in the generation node
  • Close other GPU-intensive applications before running
  • Generation will succeed but take longer as the model is partially offloaded to system RAM

Context: Where Janus-Pro Fits in the Local AI Ecosystem

Janus-Pro-1B occupies a specific niche: it is the most accessible open-source model that does both image understanding and text-to-image generation in a single 3 GB download. For comparison:

  • FLUX.1 Schnell / Dev — superior image quality but generation-only; larger footprint (12+ GB)
  • LLaVA / Qwen2-VL — excellent image understanding, text output only, no generation
  • Stable Diffusion 3.5 Medium — better image quality than Janus-Pro, generation-only
  • GPT-4o / Gemini 2.0 Flash — cloud-only, not local

If you need the combination of vision understanding and image generation in a single local model without managing multiple separate model checkpoints, Janus-Pro-1B is currently the most practical option under 10 GB total disk usage.

FAQ

Does Janus-Pro 1B work without a GPU?

Yes, but CPU-only inference is very slow: expect roughly 5–20 minutes per image on a modern desktop CPU. A CUDA-capable NVIDIA GPU is strongly recommended. AMD GPUs are not officially supported but may work via ROCm on Linux.

Can I run Janus-Pro 1B on an AMD GPU on Linux?

AMD GPU support requires ROCm-enabled PyTorch. Install with pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.2 (replace rocm6.2 with your ROCm version). Community reports suggest it works on RX 6000/7000 series GPUs, though it is not officially tested by DeepSeek or the ComfyUI-Janus-Pro maintainer. Check the GitHub issues tracker for current AMD status.

Can I change Janus-Pro's output image resolution above 384×384?

No. The 384×384 resolution is baked into the LlamaGen tokenizer architecture. You cannot override it through ComfyUI settings. You can upscale the output using ComfyUI's upscaler nodes (e.g., ESRGAN or 4x-UltraSharp) to 768×768 or higher as a post-processing step.

What is the difference between Janus (original) and Janus-Pro?

Janus (October 2024) was the original unified multimodal model. Janus-Pro (January 2025) improved it with an optimized training strategy, expanded training data, and better instruction-following for image generation. Janus-Pro also added the 7B variant. For new installations, always use Janus-Pro.

Is there a newer version than Janus-Pro as of 2026?

As of April 2026, Janus-Pro (1B and 7B) remains the latest released Janus model. DeepSeek has not announced a Janus 2. The official GitHub repository is the authoritative source for any new releases.

Can I use Janus-Pro with Ollama?

No. Janus-Pro uses the Hugging Face Transformers format (not GGUF) and requires PyTorch for inference. Ollama only supports GGUF-format models. Use ComfyUI or direct Python inference via the transformers library.

Is Janus-Pro free to use commercially?

The code is MIT-licensed. The model weights are subject to the DeepSeek Model License, which permits research and commercial use with restrictions on redistribution and misuse. Review the full license at deepseek-ai/Janus-Pro-1B on Hugging Face before commercial deployment.

How do I update ComfyUI and the Janus-Pro plugin?

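# Update ComfyUI and its dependencies (run inside the virtual environment)
cd ComfyUI
git pull
pip install -r requirements.txt

# Update the Janus-Pro plugin (path is relative to the ComfyUI directory)
cd custom_nodes/ComfyUI-Janus-Pro
git pull
pip install -r requirements.txt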

Or use the ComfyUI Manager UI: open Manager → Update All.


References & Further Reading