Teapot LLM is an open-source language model with approximately 800 million parameters, fine-tuned on synthetic data and optimized to run locally on resource-constrained devices such as smartphones and CPUs.
Developed by the community, Teapot LLM is designed to perform a variety of tasks, including hallucination-resistant Question Answering (QnA), Retrieval-Augmented Generation (RAG), and JSON extraction.
Teapot LLM is fine-tuned from flan-t5-large on a synthetic dataset of LLM tasks generated using DeepSeek-V3.
Before installing Teapot LLM, ensure your system meets the following requirements:
Docker simplifies the setup process by bundling dependencies into containers.
Create directories to store model files and configurations:
mkdir ollama-files open-webui-files
Pull and run the Teapot LLM Docker image:
docker run -d -p 4000:8080 -v /path/to/ollama-files:/root/.ollama -v /path/to/open-webui-files:/app/backend/data --name teapot-webui --restart always ghcr.io/open-webui/open-webui:teapot
Then open http://localhost:4000 in your browser to interact with the model.
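Equivalently, the container above can be declared in a Compose file (a sketch: the image tag, port, and volume mappings mirror the docker run command, with relative host paths assumed; adjust them to your setup):

```yaml
services:
  teapot-webui:
    image: ghcr.io/open-webui/open-webui:teapot
    ports:
      - "4000:8080"           # host port 4000 -> container port 8080
    volumes:
      - ./ollama-files:/root/.ollama
      - ./open-webui-files:/app/backend/data
    restart: always
```

Start it with `docker compose up -d`; this keeps the run configuration in version control instead of a one-off command.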
For users preferring a direct installation without containers:
Clone the Teapot Repository:
git clone https://github.com/teapot-ai/teapot.git
cd teapot
Install Dependencies using PowerShell:
./setup_env.ps1
Run the Model:
python main.py --model teapot --port 8080
Llamafile simplifies running LLMs by bundling them into single executables.
Run the .exe file to start the application.
To use Teapot LLM programmatically, you can leverage the teapotai library, which simplifies model integration into production environments. Here's a basic example of using Teapot LLM for general question answering:
from teapotai import TeapotAI
# Sample context
context = """
The Eiffel Tower is a wrought iron lattice tower in Paris, France. It was designed by Gustave Eiffel and completed in 1889.
It stands at a height of 330 meters and is one of the most recognizable structures in the world.
"""
teapot_ai = TeapotAI()
answer = teapot_ai.query(
    query="What is the height of the Eiffel Tower?",
    context=context
)
print(answer) # Output: "The Eiffel Tower stands at a height of 330 meters."
For more advanced use cases, such as Retrieval-Augmented Generation, Teapot LLM can be used with multiple documents to answer questions based on the most relevant information.
Leverage NVIDIA TensorRT for faster inference. Configure Teapot to use the GPU:
python main.py --model teapot --gpu
Reduce model size by quantizing weights (e.g., converting to INT8). This process can greatly improve performance on machines with limited resources while maintaining acceptable accuracy.
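To illustrate the idea, the sketch below applies PyTorch dynamic INT8 quantization to a small stand-in model (this is a generic example, not Teapot-specific; the layer sizes are arbitrary). Linear-layer weights are stored as INT8, which shrinks the serialized model roughly fourfold:

```python
import io

import torch
import torch.nn as nn

# A small stand-in model; the same call applies to Linear layers in transformers.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))

# Convert Linear weights to INT8; activations are quantized on the fly at inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def serialized_bytes(m: nn.Module) -> int:
    """Size of the model's saved state_dict, in bytes."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes

print(serialized_bytes(model) > serialized_bytes(quantized))  # True: INT8 copy is smaller
```

Accuracy should be validated after quantization, since the INT8 approximation can degrade outputs on some inputs.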
Increase batch sizes for tasks like text generation to improve throughput and overall efficiency.
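A minimal way to apply this is to group prompts before calling the model. The helper below is plain Python; the per-batch model call in the comment is hypothetical, since this guide's teapotai examples query one prompt at a time:

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` of length `batch_size` (last may be shorter)."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

prompts = ["q1", "q2", "q3", "q4", "q5"]
for batch in batched(prompts, batch_size=2):
    # Process each batch in a single call instead of one prompt at a time,
    # e.g. results = model.generate(batch)  # hypothetical batch API
    print(batch)
```

Larger batches amortize per-call overhead, at the cost of higher peak memory use.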
In this example, we showcase how to use Teapot LLM to answer questions based on a provided context. The model is optimized for conversational responses and is trained to avoid answering questions beyond the given context, thereby reducing hallucinations.
from teapotai import TeapotAI
# Sample context about the Eiffel Tower
context = """
The Eiffel Tower is a wrought iron lattice tower in Paris, France. It was designed by Gustave Eiffel and completed in 1889.
It stands at a height of 330 meters and is one of the most recognizable structures in the world.
"""
# Initialize TeapotAI
teapot_ai = TeapotAI()
# Get the answer using the provided context
answer = teapot_ai.query(
    query="What is the height of the Eiffel Tower?",
    context=context
)
print(answer) # Expected Output: "The Eiffel Tower stands at a height of 330 meters."
# Example demonstrating hallucination resistance:
context_without_height = """
The Eiffel Tower is a wrought iron lattice tower in Paris, France. It was designed by Gustave Eiffel and completed in 1889.
"""
answer = teapot_ai.query(
    query="What is the height of the Eiffel Tower?",
    context=context_without_height
)
print(answer) # Expected Output: "I don't have information on the height of the Eiffel Tower."
This example illustrates how to use Teapot LLM with Retrieval-Augmented Generation (RAG) to automatically select the most relevant documents before generating an answer. This approach is particularly useful when you have multiple documents and need the model to extract the most pertinent information.
from teapotai import TeapotAI
# Sample documents about various famous landmarks
documents = [
"The Eiffel Tower is located in Paris, France. It was built in 1889 and stands 330 meters tall.",
"The Great Wall of China is a historic fortification that stretches over 13,000 miles.",
"The Amazon Rainforest is the largest tropical rainforest in the world, covering over 5.5 million square kilometers.",
"The Grand Canyon is a natural landmark located in Arizona, USA, carved by the Colorado River.",
"Mount Everest is the tallest mountain on Earth, located in the Himalayas along the border between Nepal and China.",
"The Colosseum in Rome, Italy, is an ancient amphitheater known for its gladiator battles.",
"The Sahara Desert is the largest hot desert in the world, located in North Africa.",
"The Nile River is the longest river in the world, flowing through northeastern Africa.",
"The Empire State Building is an iconic skyscraper in New York City that was completed in 1931 and stands at 1454 feet tall."
]
# Initialize TeapotAI with documents for RAG
teapot_ai = TeapotAI(documents=documents)
# Start a chat session with a retrieval prompt
answer = teapot_ai.chat([
    {
        "role": "system",
        "content": "You are an agent designed to answer facts about famous landmarks."
    },
    {
        "role": "user",
        "content": "What landmark was constructed in the 1800s?"
    }
])
print(answer) # Expected Output: "The Eiffel Tower was constructed in the 1800s."
Creating a virtual environment is a best practice to manage project dependencies effectively. Use the following commands to set up a virtual environment:
python3 -m venv teapot-env
source teapot-env/bin/activate # On Windows use: teapot-env\Scripts\activate
Always ensure you have the latest version of TeapotAI to take advantage of new features and improvements:
pip install --upgrade teapotai
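To confirm which version is installed, the standard library can query package metadata (this works for any pip-installed package, not just teapotai):

```python
from importlib import metadata

def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

print(installed_version("teapotai"))  # e.g. a version string, or None if not installed
```

This avoids shelling out to `pip show` and is handy for logging the version at application startup.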
To reduce loading times, you can save a TeapotAI instance with precomputed embeddings using Python’s pickle module:
import pickle
# Save the TeapotAI model to a file
with open("teapot_ai.pkl", "wb") as f:
    pickle.dump(teapot_ai, f)

# Load the saved TeapotAI model
with open("teapot_ai.pkl", "rb") as f:
    loaded_teapot_ai = pickle.load(f)
# Verify the loaded model works as expected
print(len(loaded_teapot_ai.documents)) # Expected Output: Number of documents, e.g., 9
print(loaded_teapot_ai.query("What city is the Eiffel Tower in?"))  # Expected Output: "The Eiffel Tower is located in Paris, France."
Teapot LLM is particularly useful for hallucination-resistant question answering over a provided context, Retrieval-Augmented Generation across local documents, and structured information extraction.
While Teapot LLM excels in question answering and information extraction, it is not intended for code generation, creative writing, or critical decision-making applications. Additionally, Teapot LLM has been trained primarily on English and may not perform well in other languages.
Running Teapot LLM locally on Windows offers unparalleled flexibility, enhanced privacy, and significant cost savings for developers and AI enthusiasts alike. Whether you choose Docker containers, native installation, or executables like Llamafile, this guide provides the steps needed for a smooth setup process.
Need expert guidance? Connect with a top Codersera professional today!