Running Tülu 3 on Linux unlocks access to one of the most advanced open-source AI models available today, combining state-of-the-art performance with full transparency in training data and methodologies.
This guide provides a step-by-step walkthrough for installing and operating Tülu 3 on Linux, written for both developers and researchers.
```bash
# Update the system and install build dependencies
sudo apt update && sudo apt upgrade -y
sudo apt install -y python3.10 python3-pip python3.10-venv build-essential cmake git curl

# Create and activate an isolated Python environment
python3 -m venv tulu_env
source tulu_env/bin/activate

# Install PyTorch (CUDA 12.1 wheels) and the model-serving stack
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip3 install transformers datasets accelerate vllm
```
```bash
# Option 1: download the quantized GGUF weights from Hugging Face
git lfs install
git clone https://huggingface.co/Triangle104/Llama-3.1-Tulu-3-8B-Q5_K_S-GGUF

# Option 2: install Ollama and pull Tülu 3 from its model library
curl -fsSL https://ollama.com/install.sh | sh
ollama pull tulu3:8b
```
```bash
# Install the CUDA toolkit and confirm the driver sees the GPU
sudo apt install -y nvidia-cuda-toolkit
nvidia-smi
```
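Before downloading any weights, it is worth confirming that PyTorch itself can see the GPU. A quick check from the virtual environment created above:

```python
import torch

# Confirm PyTorch was built with CUDA support and can see the GPU
print(torch.cuda.is_available())      # should print True
print(torch.cuda.get_device_name(0))  # name of the installed NVIDIA card
```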
```yaml
# tulu_config.yaml
model: "tulu-3-8b"
tensor_parallel_size: 4       # number of GPUs to shard the model across
gpu_memory_utilization: 0.95  # fraction of VRAM vLLM may allocate
```
```bash
ollama run tulu3:8b "Explain quantum entanglement in simple terms"
```
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Use the official full-precision checkpoint here; the GGUF repo cloned above
# is for llama.cpp/Ollama and cannot be loaded directly with from_pretrained()
model_id = "allenai/Llama-3.1-Tulu-3-8B"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
```bash
# Launch an OpenAI-compatible API server with vLLM
python3 -m vllm.entrypoints.openai.api_server \
  --model allenai/Llama-3.1-Tulu-3-8B \
  --tensor-parallel-size 4 \
  --gpu-memory-utilization 0.95
```
The server exposes OpenAI-compatible endpoints:

- `http://localhost:8000/v1/completions`
- `http://localhost:8000/v1/chat/completions`
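As a quick smoke test, you can query the chat endpoint from Python (requires `pip3 install requests`). This is a minimal sketch assuming the server launched above is serving `allenai/Llama-3.1-Tulu-3-8B` on the default port:

```python
import requests

# Minimal smoke test against the local vLLM server (assumed running as above)
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "allenai/Llama-3.1-Tulu-3-8B",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 64,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```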
| Task | Tülu 3-8B | DeepSeek 7B | Llama 3-8B |
| --- | --- | --- | --- |
| GSM8K (Math) | 78.2% | 75.9% | 72.1% |
| HumanEval+ (Code) | 65.3% | 62.8% | 58.4% |
| MMLU (Knowledge) | 68.9% | 66.2% | 64.7% |
| Latency (ms/token) | 42 | 45 | 48 |
If vLLM runs out of GPU memory, add `--quantization awq` to the launch command above to serve AWQ-quantized weights and reduce VRAM usage.
```bash
# Broken PyTorch install: reinstall from a clean cache
pip3 uninstall -y torch && pip3 cache purge
pip3 install torch --no-cache-dir

# Corrupted download: verify the checksum of the model file
sha256sum model.bin

# Slow inference: check whether the system is swapping
sudo swapon --show
```
```bash
pip3 install auto-gptq
```

With `auto-gptq` installed, quantization is driven from Python through transformers' `GPTQConfig` rather than a command-line entry point.
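A minimal sketch of that workflow follows; the model id, calibration dataset choice, and output directory here are assumptions, not from the original guide:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "allenai/Llama-3.1-Tulu-3-8B"  # assumed official checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 4-bit GPTQ quantization, calibrated on the built-in "c4" dataset option
quant_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", quantization_config=quant_config
)
model.save_pretrained("tulu-3-8b-gptq")  # assumed output directory
```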
```bash
# Distributed fine-tuning across 2 nodes with 4 GPUs each (rank 0 shown)
torchrun --nproc_per_node=4 --nnodes=2 \
  --node_rank=0 --master_addr="192.168.1.100" \
  train.py --config tulu_config.yaml
```
```python
def generate_documentation(code: str) -> str:
    # tulu_api() is a small helper that queries the local server; see below
    prompt = f"""Generate Markdown documentation for this Python code:
{code}
Include:
- Function parameters
- Return values
- Usage examples"""
    return tulu_api(prompt)
```
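The `tulu_api` helper is not defined in the original snippet; here is a minimal sketch, assuming the OpenAI-compatible vLLM server from earlier is listening on `localhost:8000`:

```python
import requests

def tulu_api(prompt: str) -> str:
    # Assumes the vLLM server started earlier is running on port 8000
    resp = requests.post(
        "http://localhost:8000/v1/completions",
        json={
            "model": "allenai/Llama-3.1-Tulu-3-8B",
            "prompt": prompt,
            "max_tokens": 512,
        },
    )
    return resp.json()["choices"][0]["text"]
```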
ollama run tulu-3-8b "Summarize key contributions of this paper: $(cat research.pdf | pdftotext - -)"
```bash
# Build the image and run it detached with GPU access
podman build -t tulu-container -f Dockerfile.prod
podman run -d --gpus all -p 8000:8000 tulu-container
```
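The guide does not show `Dockerfile.prod` itself; a hypothetical minimal version, assuming the vLLM serving setup described above, might look like this:

```dockerfile
# Hypothetical Dockerfile.prod -- not from the original article
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04

RUN apt-get update && apt-get install -y python3 python3-pip && \
    pip3 install vllm

EXPOSE 8000
CMD ["python3", "-m", "vllm.entrypoints.openai.api_server", \
     "--model", "allenai/Llama-3.1-Tulu-3-8B", "--port", "8000"]
```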
Tülu 3's Linux implementation combines cutting-edge AI capabilities with open-source flexibility, offering performance that its authors report as competitive with proprietary models such as GPT-4o at the largest model scale, with none of the opacity of closed systems.
Need expert guidance? Connect with a top Codersera professional today!