The DeepSeek Janus Pro 7B is a powerful language model designed for advanced text generation tasks. This guide provides a clear, structured approach to running the model on Hugging Face, ensuring even beginners can follow along.
Before starting, ensure you have transformers, torch, and the other dependencies installed:
pip install transformers torch torchvision torchaudio
Install Required Packages:
pip install --upgrade transformers torch
Use Hugging Face's AutoModelForCausalLM and AutoTokenizer to load Janus Pro 7B:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/Janus-Pro-7B")
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/Janus-Pro-7B")
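If RAM is limited, loading the weights in half precision roughly halves the memory footprint. A minimal sketch, assuming a recent transformers release and the accelerate package installed for device_map support:

import torch
from transformers import AutoModelForCausalLM

# float16 weights use ~2 bytes per parameter instead of 4 for float32;
# device_map="auto" spreads layers across available GPUs and CPU.
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/Janus-Pro-7B",
    torch_dtype=torch.float16,
    device_map="auto"
)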
Tokenize your input prompt:
input_text = "Describe a futuristic city."
inputs = tokenizer(input_text, return_tensors="pt")
Run the model to generate text:
outputs = model.generate(
    **inputs,
    max_new_tokens=200,  # Limit output length
    do_sample=True,      # Enable sampling so temperature takes effect
    temperature=0.7,     # Control randomness (lower = more deterministic)
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
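If the model sits on a GPU, the input tensors must live on the same device before calling generate. A minimal sketch (skip the model.to call if you loaded with device_map="auto"):

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
# Move every input tensor (input_ids, attention_mask) to the model's device.
inputs = {k: v.to(device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)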
Save this as run_janus_pro.py:
from transformers import AutoModelForCausalLM, AutoTokenizer

def main():
    # Load model and tokenizer
    model = AutoModelForCausalLM.from_pretrained("deepseek-ai/Janus-Pro-7B")
    tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/Janus-Pro-7B")

    # Input prompt
    input_text = "Explain the impact of AI on climate change."
    inputs = tokenizer(input_text, return_tensors="pt")

    # Generate response
    outputs = model.generate(
        **inputs,
        max_new_tokens=250,
        temperature=0.7,
        do_sample=True
    )

    # Decode and print
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

if __name__ == "__main__":
    main()
Run the Script:
python run_janus_pro.py
Troubleshooting:
Keep transformers updated:
pip install --upgrade transformers
If you run out of memory, reduce max_new_tokens or use a smaller batch size.
Use device_map="auto" to leverage GPU/CPU resources efficiently:
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/Janus-Pro-7B", device_map="auto")
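Before relying on device_map="auto", it helps to confirm what hardware PyTorch actually sees; a quick check:

import torch

# Report whether a CUDA GPU is visible and how much VRAM it offers.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA GPU detected; inference will fall back to CPU.")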
Here’s an expanded guide with additional insights, optimizations, and practical use cases for running DeepSeek Janus Pro 7B on Hugging Face:
Leverage Hugging Face Pipelines:
Simplify inference with the pipeline API:
from transformers import pipeline
generator = pipeline("text-generation", model="deepseek-ai/Janus-Pro-7B")
output = generator("Write a poem about the ocean:", max_length=150)
print(output[0]['generated_text'])
Use 4-Bit Quantization (for GPU-limited systems):
Reduce memory usage by loading the model in 4-bit mode with bitsandbytes:
import torch
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/Janus-Pro-7B",
    quantization_config=quantization_config,
    device_map="auto"
)
Install bitsandbytes first:
pip install bitsandbytes
For CPU Inference:
Add device_map="cpu" when loading the model, but expect slower performance:
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/Janus-Pro-7B", device_map="cpu")
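CPU generation for a 7B model can take minutes per response, so it is worth measuring before committing to it. A simple timing sketch, reusing the inputs from earlier:

import time

start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=50)
elapsed = time.perf_counter() - start
# New tokens = total length minus prompt length.
new_tokens = outputs.shape[-1] - inputs["input_ids"].shape[-1]
print(f"Generated {new_tokens} tokens in {elapsed:.1f}s ({new_tokens / elapsed:.2f} tok/s)")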
Minimum Requirements:
Component | Requirement
---|---
RAM | 32GB
GPU | NVIDIA RTX 3090/4090 (24GB VRAM)
Disk Space | 30GB
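These figures follow from simple arithmetic on the parameter count. The sketch below estimates weight memory alone at common precisions; activations and the KV cache add overhead on top:

# Approximate weight memory for a 7B-parameter model.
params = 7e9
for precision, bytes_per_param in [("float32", 4), ("float16", 2), ("4-bit", 0.5)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{precision:>8}: ~{gib:.1f} GiB")
# float32: ~26.1 GiB, float16: ~13.0 GiB, 4-bit: ~3.3 GiB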
Deploy via Hugging Face Inference API (Serverless):
Avoid local setup by using Hugging Face’s hosted API (requires API token):
import requests

API_URL = "https://api-inference.huggingface.co/models/deepseek-ai/Janus-Pro-7B"
headers = {"Authorization": "Bearer YOUR_API_TOKEN"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({"inputs": "Explain quantum computing to a 5-year-old."})
print(output)
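The serverless API can return an error while the model is cold-starting, so production code usually retries. A minimal sketch (the retry count and delay are arbitrary choices):

import time

def query_with_retry(payload, retries=5, delay=10):
    # The API may respond with HTTP 503 while the model loads; wait and retry.
    for _ in range(retries):
        response = requests.post(API_URL, headers=headers, json=payload)
        if response.status_code == 200:
            return response.json()
        time.sleep(delay)
    response.raise_for_status()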
With LangChain:
Use Janus Pro 7B in LangChain workflows for chatbots or document analysis:
from langchain.llms import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="deepseek-ai/Janus-Pro-7B",
    task="text-generation",
    model_kwargs={"temperature": 0.5, "max_length": 200}
)
response = llm("Summarize the French Revolution in 3 sentences.")
print(response)
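To reuse a prompt across many inputs, wrap the LLM in a chain with a template. A sketch using LangChain's classic PromptTemplate/LLMChain API (the template text is illustrative):

from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# A reusable prompt with a single {topic} placeholder.
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Summarize {topic} in 3 sentences."
)
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(topic="the French Revolution"))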
For domain-specific tasks (e.g., medical or legal text), fine-tune Janus Pro 7B:
Training Script:
Use the Trainer class:
from transformers import TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=5e-5
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset
)
trainer.train()
Prepare a Dataset:
Use a dataset in Hugging Face's datasets format. Example:
from datasets import load_dataset
dataset = load_dataset("your_dataset_name")
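The Trainer expects tokenized examples, so raw text must be mapped through the tokenizer first. A sketch assuming the dataset has a "text" column:

from transformers import DataCollatorForLanguageModeling

# Tokenize every example; truncation keeps sequences within the model's limit.
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized_dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

# For causal LM fine-tuning, the collator derives labels from the input ids.
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

Pass the tokenized split (e.g., tokenized_dataset["train"]) as train_dataset and add data_collator=data_collator to the Trainer shown above.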
Code Generation:
Generate Python snippets (if the model is code-trained):
input_text = "Write a Python function to calculate Fibonacci numbers."
Technical Explanations:
Simplify complex topics:
input_text = "Explain how blockchain works in simple terms."
Creative Writing:
Generate stories, poems, or dialogue.
input_text = "Write a sci-fi story about a robot discovering emotions."
Customize outputs with these parameters in model.generate():
Parameter | Effect | Example Value
---|---|---
temperature | Controls randomness (0–1). Lower = more deterministic. | 0.3
top_k | Limits sampling to the k most likely tokens. | 50
top_p (nucleus) | Samples from the smallest set of tokens whose cumulative probability reaches top_p. | 0.9
repetition_penalty | Penalizes repeated tokens to reduce repetitive output. | 1.2
Example:
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,  # Sampling must be on for temperature/top_k/top_p to apply
    temperature=0.5,
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.1
)
To stop generation at a paragraph boundary, use custom stopping criteria:
from transformers import StoppingCriteria, StoppingCriteriaList

class StopAfterParagraph(StoppingCriteria):
    def __call__(self, input_ids, scores, **kwargs):
        decoded_text = tokenizer.decode(input_ids[0])
        return "\n\n" in decoded_text  # Stop once a blank line appears

stopping_criteria = StoppingCriteriaList([StopAfterParagraph()])
outputs = model.generate(
    **inputs,
    stopping_criteria=stopping_criteria
)
Use a library such as detoxify to filter harmful content.
Content Moderation:
Add a moderation layer to outputs:
from transformers import pipeline

moderator = pipeline("text-classification", model="unitary/toxic-bert")
if moderator(output_text)[0]['label'] == 'toxic':
    print("Content flagged as inappropriate.")
The DeepSeek Janus Pro 7B is a versatile model for both creative and technical tasks. Experiment with parameters, integrate it into workflows, and always validate outputs for accuracy and safety.
Happy coding! 🚀