OpenClaw + Ollama Local Setup Guide - Fully Offline AI Agent
Want to run OpenClaw without any cloud dependencies or API costs? This guide shows you how to set up OpenClaw with Ollama for a fully offline, privacy-first AI agent experience.
Why Go Local with Ollama?
While OpenClaw works best with Claude Opus 4.5, not everyone wants to send data to the cloud. With Ollama, you can run powerful open-source models like Llama 3.1, Mistral, and Qwen 2.5 locally on your machine. Benefits include:
- Zero API costs—completely free to run
- Complete data privacy—nothing leaves your machine
- No internet required after initial setup
- Full control over model selection and parameters
- Great for sensitive enterprise or personal data
1. Prerequisites
You'll need a machine with decent specs for local model inference. Here's what's recommended:
# Minimum requirements:
# - 16GB RAM (32GB+ recommended)
# - Modern CPU or GPU with CUDA/Metal support
# - 20GB+ free disk space for models
# Check your system:
node --version # Requires v22+
ollama --version # Install from ollama.com if missing
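If you want to script the hardware checks above, here's a minimal Python sketch. The thresholds mirror the minimums listed in this section; the RAM check uses POSIX `sysconf`, so it works on Linux and macOS but not Windows:

```python
import os
import shutil

def check_system(min_ram_gb=16, min_disk_gb=20, path="/"):
    """Report whether this machine meets the minimum specs above."""
    # Total physical RAM via POSIX sysconf (Linux/macOS only).
    ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    # Free disk space on the volume that will hold the model files.
    disk_gb = shutil.disk_usage(path).free / 1e9
    ok = ram_gb >= min_ram_gb and disk_gb >= min_disk_gb
    print(f"RAM: {ram_gb:.1f} GB, free disk: {disk_gb:.1f} GB -> "
          f"{'OK' if ok else 'below minimum'}")
    return ok

check_system()
```

Run it once before pulling any models; pulling a 14B model onto a nearly full disk is a common first stumble.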
2. Install Ollama
Ollama makes it trivial to run local LLMs. Install it with a single command, then pull your preferred model.
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh
# Pull a recommended model
ollama pull llama3.1:8b # Good balance of speed & quality
# Or for more capable responses:
ollama pull qwen2.5:14b # Larger, slower, better reasoning
# Verify Ollama is running
ollama list
Tip: For the best experience, we recommend qwen2.5:14b or llama3.1:8b. Smaller models (7B) work but may struggle with complex multi-step tasks.
3. Configure OpenClaw for Ollama
Point OpenClaw at your local Ollama instance. The configuration is straightforward—just change the model provider and endpoint.
# In ~/.openclaw/openclaw.json:
{
  "agent": {
    "model": "ollama/llama3.1:8b",
    "baseUrl": "http://localhost:11434"
  }
}
# Or configure via CLI:
openclaw config set agent.model ollama/llama3.1:8b
openclaw config set agent.baseUrl http://localhost:11434
Tip: You can switch between Ollama and cloud models anytime by updating the config. OpenClaw supports hot-swapping models without restart.
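If you prefer to script that switch (for example, to toggle between local and cloud models from a helper tool), here's a small Python sketch that edits the same file shown above. The config shape is taken from the snippet in this section; any other keys already in the file are preserved:

```python
import json
from pathlib import Path

def write_ollama_config(path, model="ollama/llama3.1:8b",
                        base_url="http://localhost:11434"):
    """Set agent.model and agent.baseUrl in an openclaw.json-style file."""
    path = Path(path)
    # Load the existing config if present so other settings survive.
    config = json.loads(path.read_text()) if path.exists() else {}
    agent = config.setdefault("agent", {})
    agent["model"] = model
    agent["baseUrl"] = base_url
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config

# Example (real path from this guide):
# write_ollama_config(Path.home() / ".openclaw" / "openclaw.json")
```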
4. Start and Verify
Launch OpenClaw and verify it's using your local Ollama model. The Gateway dashboard shows the active model connection.
# Start OpenClaw
openclaw start
# Check status (should show Ollama connection)
openclaw status
# Visit dashboard
# http://127.0.0.1:18789/
# Send a test message via your configured channel
# or use the WebChat interface
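Before blaming OpenClaw for a failed connection, it helps to confirm Ollama itself is answering. This sketch hits Ollama's `/api/tags` endpoint (the same one `ollama list` uses) and returns the locally installed model names, or `None` if the server isn't reachable:

```python
import json
import urllib.error
import urllib.request

def ollama_reachable(base_url="http://localhost:11434", timeout=2):
    """Return the list of local model names if Ollama answers, else None."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags",
                                    timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

models = ollama_reachable()
print("Ollama models:", models if models is not None else "not reachable")
```

If this prints "not reachable", fix Ollama first (usually `ollama serve`); no OpenClaw setting will help until it does.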
Performance Tips
Local models are typically slower than cloud APIs. Here are some tips to optimize your experience:
- Use GPU acceleration (CUDA for NVIDIA, Metal for Apple Silicon) for 5-10x faster inference
- Close other memory-intensive apps while running large models
- Use smaller models (8B) for quick tasks, larger models (14B+) for complex reasoning
- Enable model caching to avoid reloading between sessions
- Consider using quantized models (Q4_K_M) for better speed/memory tradeoff
Limitations of Local Models
While powerful, local models have some limitations compared to cloud-hosted Claude:
- Complex multi-step tasks may require more guidance
- Browser automation accuracy may be reduced
- Trading features are not recommended with local models
- Slower response times, especially on CPU-only systems
- No vision capabilities with most local models
Troubleshooting Common Issues
Here are solutions to the most common problems when running OpenClaw with Ollama:
- "Unknown model ollama/llama3.1:8b" — Make sure Ollama is running (ollama serve) and the model is pulled (ollama pull llama3.1:8b). Check that the model name in config matches exactly.
- Slow responses — Use GPU acceleration if available. On Mac, Metal is enabled by default. On Linux, install CUDA drivers for NVIDIA GPUs. Fall back to a smaller model (8B instead of 14B).
- Out of memory errors — Close other applications, use a quantized model (Q4_K_M), or switch to a smaller model. 8B models need ~8GB RAM, 14B models need ~16GB.
- Ollama not connecting — Verify Ollama is running on port 11434: curl http://localhost:11434/api/tags. If using Docker, ensure the Ollama container is accessible.
- Poor quality responses — Upgrade to a larger model (qwen2.5:14b or llama3.1:70b if you have the hardware). Smaller models work for simple tasks but struggle with complex reasoning.
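The "Unknown model" error above usually comes down to a mismatch between the configured name and what `ollama list` actually reports. A small parser can catch that early; the sample output below is an assumption based on current Ollama releases (first column is the model tag):

```python
def installed_models(ollama_list_output):
    """Parse `ollama list` output: first column of each row after the header."""
    lines = ollama_list_output.strip().splitlines()
    return {line.split()[0] for line in lines[1:] if line.split()}

def config_model_installed(config_model, ollama_list_output):
    """Strip the 'ollama/' provider prefix and check against installed tags."""
    name = config_model.removeprefix("ollama/")
    return name in installed_models(ollama_list_output)

# Hypothetical `ollama list` output for illustration:
sample = """NAME            ID            SIZE    MODIFIED
llama3.1:8b     42182419e950  4.7 GB  2 days ago
qwen2.5:14b     7cdf5a0187d5  9.0 GB  5 hours ago
"""
print(config_model_installed("ollama/llama3.1:8b", sample))  # True
```

Feed it the real output of `ollama list` and your configured `agent.model` value to rule out a simple typo before digging deeper.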
Model Compatibility Guide
OpenClaw works with any Ollama-compatible model. Here is a quick reference for choosing the right model based on your hardware and use case:
# Recommended models by hardware:
# 8GB RAM (minimum):
ollama pull llama3.1:8b # General purpose
ollama pull mistral:7b # Fast, good for chat
# 16GB RAM (recommended):
ollama pull qwen2.5:14b # Best reasoning at this size
ollama pull llama3.1:8b # Faster alternative
# 32GB+ RAM (power user):
ollama pull llama3.1:70b # Near cloud-model quality
ollama pull qwen2.5:32b # Excellent reasoning
# Apple Silicon (M1/M2/M3):
# All models run with Metal GPU acceleration
# M1: 8B models comfortable, 14B possible
# M2/M3 Pro: 14B smooth, 32B possible
# M2/M3 Max: 70B models run well
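The hardware tiers above can be folded into a small picker. This sketch simply mirrors the table in this section (it is not a benchmark-derived rule), returning the recommended tags for a given amount of RAM, best option first:

```python
def recommend_model(ram_gb):
    """Map available RAM to the model tiers listed above (largest first)."""
    tiers = [
        (32, ["llama3.1:70b", "qwen2.5:32b"]),   # power user
        (16, ["qwen2.5:14b", "llama3.1:8b"]),    # recommended
        (8,  ["llama3.1:8b", "mistral:7b"]),     # minimum
    ]
    for min_ram, models in tiers:
        if ram_gb >= min_ram:
            return models
    return []  # below the 8 GB minimum in this guide

print(recommend_model(16))  # ['qwen2.5:14b', 'llama3.1:8b']
```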
What's Next?
Your offline AI agent is ready. Explore what you can build with it.