Running DeepSeek's Janus-Pro-7B model on Amazon Web Services (AWS) involves setting up the right environment, selecting the necessary AWS services, and deploying the model effectively.
This guide walks you through setup, optimization, and advanced use cases, and is aimed at developers and businesses seeking scalable AI solutions.
Janus-Pro-7B is a multimodal AI model from DeepSeek that processes text and images for tasks such as content generation and visual question answering.
AWS provides scalable infrastructure tailored for AI/ML workloads:
Service | Best For | Estimated Cost (Hourly) |
---|---|---|
EC2 (g5.xlarge) | Customizable GPU workloads | $1.006 |
SageMaker | Managed ML pipelines | $1.20 (ml.g5.xlarge) |
Recommended: Start with EC2 for full control, then migrate to SageMaker for production.
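To put the hourly rates in the table into monthly terms, the arithmetic can be sketched as follows. The rates are the ones quoted above; the 30-day month and the 8-hours-per-day usage pattern are illustrative assumptions, and real pricing varies by region and purchase option.

```python
# Hourly on-demand rates quoted in the table above (USD).
HOURLY_RATES = {
    "EC2 g5.xlarge": 1.006,
    "SageMaker ml.g5.xlarge": 1.20,
}


def monthly_cost(hourly_rate: float, hours_per_day: float = 24, days: int = 30) -> float:
    """Estimate a monthly bill for an instance that runs `hours_per_day` each day."""
    return round(hourly_rate * hours_per_day * days, 2)


for name, rate in HOURLY_RATES.items():
    always_on = monthly_cost(rate)
    work_hours = monthly_cost(rate, hours_per_day=8)  # assumed usage pattern
    print(f"{name}: ${always_on:.2f} always-on, ${work_hours:.2f} at 8 h/day")
```

Stopping the instance outside working hours is the single largest lever in this estimate, which is one reason to start on plain EC2 while experimenting.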
Choose a GPU-backed instance family such as p3 or g4 for optimal performance.
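When choosing between GPU instance families, a rough memory estimate helps. The sketch below assumes the model has about 7 billion parameters (as its name suggests) and uses the nominal memory of the GPUs in each family, treated as GiB for simplicity. It counts weights only, so activations and the generation cache need additional headroom.

```python
GIB = 2**30

# Nominal memory per GPU for the instance families discussed in this guide.
GPU_MEMORY_GIB = {
    "g4dn (NVIDIA T4)": 16,
    "p3.2xlarge (NVIDIA V100)": 16,
    "g5 (NVIDIA A10G)": 24,
}


def weights_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / GIB


# Assumption: roughly 7 billion parameters.
fp16 = weights_gib(7e9, 2)  # half precision (fp16 or bf16)
fp32 = weights_gib(7e9, 4)  # full precision

print(f"fp16 weights: {fp16:.1f} GiB, fp32 weights: {fp32:.1f} GiB")
for name, mem in GPU_MEMORY_GIB.items():
    headroom = mem - fp16
    print(f"{name}: {mem} GiB total, {headroom:.1f} GiB left after fp16 weights")
```

Under these assumptions half-precision weights fit on a single 16 GiB GPU with little room to spare, while full-precision weights do not fit on any of the single GPUs listed.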
Once your EC2 instance is running:
Connect via SSH (the default user is ubuntu on Ubuntu AMIs and ec2-user on Amazon Linux; the apt commands below assume Ubuntu):
ssh -i your-key.pem ubuntu@your-instance-ip
Install Python and Pip:
sudo apt update
sudo apt install -y python3 python3-pip
Install PyTorch and Transformers:
pip install torch torchvision torchaudio transformers
Clone the Janus-Pro Repository:
git clone https://github.com/deepseek-ai/Janus.git
cd Janus
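Retrieve Janus-Pro-7B from Hugging Face:
pip install huggingface_hub
from huggingface_hub import snapshot_download
# snapshot_download fetches the whole repository; hf_hub_download would need a specific filename
model_path = snapshot_download("deepseek-ai/Janus-Pro-7B")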
Connect via SSH and run:
# Update packages
sudo apt update && sudo apt upgrade -y
# Install Python 3.10 with pip and venv support
sudo apt install python3.10 python3.10-venv python3-pip -y
# Set up virtual environment
python3.10 -m venv janus-env
source janus-env/bin/activate
# Install PyTorch with CUDA 11.7
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
# Clone Janus-Pro repo
git clone https://github.com/deepseek-ai/Janus.git
cd Janus && pip install -r requirements.txt
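After the setup commands finish, a quick sanity check can confirm that the packages are importable and that PyTorch can see the GPU. This is a minimal sketch; `check_environment` and `cuda_available` are hypothetical helper names, not part of the Janus repository.

```python
import importlib.util
import sys


def check_environment(required=("torch", "transformers")) -> dict:
    """Report the Python version and whether each required package is importable."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    for name in required:
        report[name] = importlib.util.find_spec(name) is not None
    return report


def cuda_available() -> bool:
    """Return True only if PyTorch is installed and can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so the check also runs without PyTorch

    return torch.cuda.is_available()


print(check_environment())
print("CUDA visible:", cuda_available())
```

If CUDA is reported as not visible on a GPU instance, the usual suspects are a missing NVIDIA driver on the AMI or a CPU-only PyTorch wheel.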
from huggingface_hub import snapshot_download
snapshot_download(
"deepseek-ai/Janus-Pro-7B",
local_dir="/models/janus-pro-7b",
token="YOUR_HF_TOKEN" # Get from Hugging Face settings
)
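Rather than pasting the token into the script, it can be read from the environment. The sketch below assumes the token is exported as `HF_TOKEN`; `download_kwargs` and `download_model` are hypothetical helper names. `huggingface_hub` can also pick up `HF_TOKEN` on its own, so passing it explicitly mainly makes the dependency visible.

```python
import os


def download_kwargs(repo_id: str, local_dir: str, env=None) -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    env = os.environ if env is None else env
    kwargs = {"repo_id": repo_id, "local_dir": local_dir}
    token = env.get("HF_TOKEN")
    if token:  # only pass a token when one is actually configured
        kwargs["token"] = token
    return kwargs


def download_model(repo_id="deepseek-ai/Janus-Pro-7B", local_dir="/models/janus-pro-7b"):
    """Download the full snapshot; needs huggingface_hub and network access."""
    from huggingface_hub import snapshot_download  # imported lazily

    return snapshot_download(**download_kwargs(repo_id, local_dir))


# Usage on the instance: run `export HF_TOKEN=...` in the shell, then call download_model()
```

Keeping the token out of source files also keeps it out of version control and shell history of shared scripts.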
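Upload the model to S3:
aws s3 sync /models/janus-pro-7b s3://your-bucket/janus-pro-7b
Mount the uploaded prefix on EC2 using s3fs:
sudo apt install s3fs
echo ACCESS_KEY:SECRET_KEY > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs
# Create the mount point and make it writable by the current user
sudo mkdir -p /mnt/janus-model && sudo chown "$USER" /mnt/janus-model
# bucket:/prefix mounts only the model directory, so its files sit directly under /mnt/janus-model
s3fs your-bucket:/janus-pro-7b /mnt/janus-model -o passwd_file=${HOME}/.passwd-s3fs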
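Before loading the model from the s3fs mount, it is worth checking that the mount is actually populated, since an unmounted directory simply looks empty. The sketch below assumes a usable checkpoint directory contains at least a `config.json`; the helper names are hypothetical.

```python
from pathlib import Path

# Assumption: a usable checkpoint directory contains at least a config.json.
REQUIRED_FILES = ("config.json",)


def missing_model_files(model_dir: str, required=REQUIRED_FILES) -> list:
    """Return the required file names that are absent from `model_dir`."""
    root = Path(model_dir)
    if not root.is_dir():
        return list(required)
    return [name for name in required if not (root / name).is_file()]


def total_size_gib(model_dir: str) -> float:
    """Sum the size of every file under `model_dir`, in GiB."""
    files = (p for p in Path(model_dir).rglob("*") if p.is_file())
    return sum(p.stat().st_size for p in files) / 2**30


missing = missing_model_files("/mnt/janus-model")
print("Missing files:", missing or "none")
```

Comparing `total_size_gib` for the local copy and the mounted copy is a cheap way to spot an incomplete sync.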
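# Note: JanusPipeline is used here as an illustrative wrapper; the Janus repository's README
# works with VLChatProcessor and MultiModalityCausalLM instead, so adapt this to your version.
from janus import JanusPipeline
# Load the weights from the s3fs mount
pipe = JanusPipeline.from_pretrained("/mnt/janus-model")
# Text-to-image: generate one image from a prompt and save it
image = pipe("A futuristic city at sunset").images[0]
image.save("output.jpg")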
from PIL import Image
# Visual question answering: the image placeholder belongs in the user turn
prompt = """
USER: <image>
What's in this image?
ASSISTANT:
"""
input_image = Image.open("street.jpg")
result = pipe(prompt, images=[input_image], max_new_tokens=200)
print(result.text) # e.g. "The image shows a busy city street with..."
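For reference, the examples in the Janus repository's README express a visual question as a list of conversation turns rather than a single prompt string. The sketch below builds that structure, assuming the role tags and the `<image_placeholder>` token as they appear in the README at the time of writing (check them against the version you cloned). `build_vqa_conversation` is a hypothetical helper name.

```python
def build_vqa_conversation(question: str, image_path: str) -> list:
    """Build a two-turn conversation in the style of the Janus README examples."""
    return [
        {
            "role": "<|User|>",
            # The placeholder marks where the image is inserted into the prompt.
            "content": f"<image_placeholder>\n{question}",
            "images": [image_path],
        },
        # An empty assistant turn asks the model to generate the answer.
        {"role": "<|Assistant|>", "content": ""},
    ]


conversation = build_vqa_conversation("What's in this image?", "street.jpg")
for turn in conversation:
    print(turn["role"], repr(turn["content"]))
```

In the README workflow a list like this is handed to the repository's chat processor together with the loaded images, which then produces the model inputs.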
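Create a SageMaker real-time endpoint. The code below deploys a single instance; scaling on demand additionally requires an autoscaling policy on the endpoint:
from sagemaker.huggingface import HuggingFaceModel
model = HuggingFaceModel(
    role='sagemaker-role',  # IAM role name or ARN with SageMaker permissions
    transformers_version='4.28',
    pytorch_version='2.0',
    py_version='py310',  # required alongside the framework versions
    model_data='s3://your-bucket/janus-pro-7b.tar.gz',
)
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.g5.12xlarge',
    endpoint_name='janus-pro-endpoint',  # same name the Lambda function invokes
)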
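The `model_data` argument points at a `.tar.gz` archive, which has to be created from the downloaded model directory first. A minimal sketch using only the standard library is shown below; it places the files at the root of the archive, which is the layout the SageMaker Hugging Face containers generally expect. `package_model` is a hypothetical helper name.

```python
import tarfile
from pathlib import Path


def package_model(model_dir: str, archive_path: str) -> list:
    """Write every file under `model_dir` into a gzip tarball, rooted at the archive top level."""
    root = Path(model_dir)
    # Collect files first so the archive never includes itself.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    added = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            arcname = path.relative_to(root).as_posix()
            tar.add(str(path), arcname=arcname)
            added.append(arcname)
    return added


# Usage with the paths from this guide:
# package_model("/models/janus-pro-7b", "janus-pro-7b.tar.gz")
# then: aws s3 cp janus-pro-7b.tar.gz s3://your-bucket/janus-pro-7b.tar.gz
```

For a model of this size the archive takes a while to build and needs roughly as much free disk space as the model itself.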
Trigger model inference from a Lambda function behind API Gateway:
import boto3
# Create the client once per container so warm invocations reuse it
sagemaker = boto3.client('sagemaker-runtime')
def lambda_handler(event, context):
    response = sagemaker.invoke_endpoint(
        EndpointName='janus-pro-endpoint',
        # With proxy integration, event['body'] is already a JSON string
        Body=event['body'],
        ContentType='application/json'
    )
    return {
        'statusCode': 200,
        'body': response['Body'].read().decode()
    }
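The handler above assumes every request is well formed and every endpoint call succeeds. A slightly more defensive variant can be sketched by separating the request handling from the AWS call, which also makes it testable without credentials. `handle` and `fake_invoke` are hypothetical names; in Lambda, `invoke` would wrap the `invoke_endpoint` call.

```python
import json


def handle(event: dict, invoke) -> dict:
    """Validate an API Gateway proxy event and forward its body via `invoke`."""
    body = event.get("body")
    if not body:
        return {"statusCode": 400, "body": json.dumps({"error": "empty request body"})}
    try:
        json.loads(body)  # reject payloads that are not valid JSON
    except (TypeError, ValueError):
        return {"statusCode": 400, "body": json.dumps({"error": "body must be valid JSON"})}
    try:
        result = invoke(body)
    except Exception as exc:  # surface backend failures as a gateway error
        return {"statusCode": 502, "body": json.dumps({"error": str(exc)})}
    return {"statusCode": 200, "body": result}


# Local check with a stand-in for the SageMaker call (hypothetical echo backend).
def fake_invoke(body: str) -> str:
    return json.dumps({"echo": json.loads(body)})


print(handle({"body": '{"inputs": "A futuristic city at sunset"}'}, fake_invoke))
print(handle({"body": "not json"}, fake_invoke))
```

Returning 400 for malformed input and 502 for backend failures lets API Gateway clients tell their own mistakes apart from endpoint problems.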
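To run Janus-Pro-7B efficiently, you need the following AWS services: EC2 or SageMaker for GPU compute, S3 for model storage, and Lambda with API Gateway for serving requests.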
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
# Registers the Janus model classes; run from the cloned repo or after `pip install -e .`
import janus.models
# Load the model and tokenizer (trust_remote_code as in the Janus README)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/Janus-Pro-7B", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/Janus-Pro-7B")
# Prepare input text
input_text = "Describe what you want the image to depict."
inputs = tokenizer(input_text, return_tensors="pt")
# Generate output with the underlying language model
with torch.no_grad():
    outputs = model.language_model.generate(**inputs, max_new_tokens=128)
# Decode and print output
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(result)
For image processing, additional libraries like PIL or OpenCV are required:
from PIL import Image
# Load an image
image = Image.open("path_to_your_image.jpg")
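For processing multiple text inputs in a single batch:
# Pad on the left so generation continues from the real end of each prompt
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
# Tokenize all prompts together into one padded batch
batch_inputs = tokenizer(["Input text 1", "Input text 2"], return_tensors="pt", padding=True)
# Run a single generate call over the whole batch
with torch.no_grad():
    outputs = model.language_model.generate(**batch_inputs, max_new_tokens=64)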
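When there are more prompts than fit in GPU memory at once, they can be split into fixed-size batches and processed one batch at a time. The sketch below shows only the splitting step; `chunked` is a hypothetical helper name and the batch size of 2 is an arbitrary example value.

```python
def chunked(items: list, batch_size: int):
    """Yield consecutive slices of `items` with at most `batch_size` elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


prompts = [f"Input text {i}" for i in range(1, 6)]
for batch in chunked(prompts, batch_size=2):
    # Each `batch` would be tokenized with padding and passed to one generate call.
    print(batch)
```

A workable batch size depends on prompt length and free GPU memory, so it is usually found by starting small and increasing until memory runs short.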
Monitor the endpoint's ModelLatency metric in CloudWatch and keep it under 500 ms (note that the metric is reported in microseconds).
Deploying DeepSeek’s Janus-Pro-7B on AWS enables robust multimodal AI applications. By leveraging AWS services effectively, you can scale efficiently while keeping costs under control. This guide has covered setting up, deploying, and managing Janus-Pro-7B on AWS.