DeepSeek R1 is a state-of-the-art AI model excelling in math, coding, and logical reasoning tasks. Running it locally on a Linux VM ensures privacy, reduces costs, and avoids cloud latency. This guide walks you through selecting the right model, installing it, and integrating it via API—even if you’re new to AI!
DeepSeek R1 offers distilled models optimized for different hardware:
| Model | VRAM Requirement | Use Case |
| --- | --- | --- |
| DeepSeek-R1-Distill-Qwen-1.5B | ~3.5 GB | Lightweight tasks, low-resource VMs |
| DeepSeek-R1-Distill-Qwen-7B | ~16 GB | Balanced performance (recommended for most users) |
| DeepSeek-R1-Distill-Llama-70B | ~161 GB | High-end tasks requiring multi-GPU setups |
For beginners: start with the 7B model (4.7GB download) for a balance of speed and capability.
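Not sure what your VM can handle? One quick way to check available RAM and VRAM before picking a model (the second command assumes an NVIDIA GPU with drivers installed):

```bash
free -h                                                        # total/available system RAM
nvidia-smi --query-gpu=memory.total,memory.free --format=csv   # VRAM (NVIDIA GPUs only)
```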
First, update your package index and install the basic prerequisites:

```bash
sudo apt update && sudo apt install -y curl python3-pip
```
Ollama simplifies local AI model management. Install it via:
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
Verify installation:
```bash
ollama --version   # should display "ollama version 0.5.7" or later
```
Pull the 7B model (adjust `7b` to `1.5b` or `70b` as needed):
```bash
ollama pull deepseek-r1:7b
```
Check installed models:
```bash
ollama list   # should list "deepseek-r1:7b"
```
Launch Ollama in server mode:
```bash
ollama serve
```
The API will run at `http://localhost:11434`.
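Note that on Linux the install script usually registers Ollama as a systemd service, so the API may already be running before you launch `ollama serve` manually. A quick sanity check (`/api/tags` lists installed models):

```bash
systemctl status ollama                # is the background service already up?
curl http://localhost:11434/api/tags   # should return a JSON list of local models
```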
Use `curl` or Python to send requests:
Example 1: Curl Request
```bash
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Explain quantum computing in simple terms"
}'
```
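By default, `/api/generate` streams the answer as a series of JSON objects, one chunk per line. If you want a single JSON response instead (easier for scripting), set the API's `stream` field to false:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Explain quantum computing in simple terms",
  "stream": false
}'
```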
Example 2: Python Integration
```python
import ollama

# Send a chat request to the local Ollama server (default: localhost:11434).
response = ollama.chat(
    model='deepseek-r1:7b',
    messages=[{'role': 'user', 'content': 'Write Python code for a Fibonacci sequence'}]
)
print(response['message']['content'])
```
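One assumption the snippet makes: the `ollama` Python client is a separate package from the server itself, so install it first:

```bash
pip3 install ollama   # official Python client for the local Ollama API
```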
A few tuning tips:

- For GPU acceleration, add `--gpu` when launching `ollama serve`.
- Cap response length with `max_tokens` and adjust creativity with `temperature` (0.7 recommended); see the request sketch after this list.
- If you run a browser front end on top of the API, it is typically served at `http://localhost:3000`.
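As a sketch of how those knobs map onto the API: Ollama accepts sampling parameters through an `options` field in the request body, where the token limit is named `num_predict` rather than `max_tokens`:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Summarize the theory of relativity in two sentences",
  "stream": false,
  "options": { "temperature": 0.7, "num_predict": 256 }
}'
```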
If a download fails or a model is not found, re-run `ollama pull` and check the model name for typos.

Running DeepSeek R1 on a Linux VM is straightforward with Ollama. The 7B model offers the best balance for beginners, while the API integration opens the door to AI-powered apps. Experiment with different prompts and explore its reasoning prowess; your privacy-focused AI journey starts now!
Frequently Asked Questions
**Should I start with the 1.5B or the 7B model?** The 1.5B model is ideal for basic tasks (e.g., text summarization, simple Q&A) on low-resource VMs (≤8GB RAM). The 7B model (recommended) handles complex reasoning, coding, and math problems better. If your VM has ≥16GB RAM and a mid-tier GPU, start with 7B for balanced performance.
**Can I run the 70B model on a personal VM?** The 70B model requires ~161GB of VRAM, which typically means enterprise-grade GPUs (e.g., 4x A100s). For personal VMs, stick to the 1.5B or 7B models. If you need 70B-level performance, consider cloud-based solutions like AWS/GCP.
**What if my VM runs out of memory?** Add swap space:

```bash
sudo fallocate -l 8G /swapfile && sudo chmod 600 /swapfile
sudo mkswap /swapfile && sudo swapon /swapfile
```
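Optionally, to keep the swap file active across reboots, register it in `/etc/fstab` (standard Linux practice, not Ollama-specific):

```bash
# Append a swap entry so the file is re-enabled at boot.
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```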
Alternatively, pull a smaller model (e.g., `ollama pull deepseek-r1:1.5b`).

**Can I run DeepSeek R1 without a GPU?** Yes! Ollama runs models on CPU by default, but responses will be slower. For faster inference on CPU-only VMs, use a quantized variant (e.g., `ollama pull deepseek-r1:7b-q4_0`).
**How do I switch to a different model?** Stop the server (`sudo systemctl stop ollama`), pull the new model, and restart, as sketched below.
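A concrete sketch of that sequence, assuming the installer set up the `ollama` systemd service (the `1.5b` tag is just an example):

```bash
sudo systemctl stop ollama     # stop the running server
ollama pull deepseek-r1:1.5b   # fetch the model you are switching to
sudo systemctl start ollama    # bring the API back up
```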