# Ollama Integration
Run AI models locally with Ollama for complete privacy and offline capability.
## Why Ollama?
Ollama enables you to run AI models locally on your machine, providing:
- Privacy: Your data never leaves your computer
- No API costs: Free local inference (no OpenAI/Anthropic subscription needed)
- Offline capability: Works without internet connection
- Fast inference: Optimized for local hardware (CPU/GPU)
## Installation

### Windows

- Download the Ollama installer from ollama.com/download, or use winget:

  ```
  winget install Ollama.Ollama
  ```

- Run the installer and follow the prompts

- Verify installation:

  ```
  ollama --version
  ```
### macOS

```
# Using Homebrew (recommended)
brew install ollama

# Or download from https://ollama.com/download
```
### Linux

```
# Install via curl
curl -fsSL https://ollama.ai/install.sh | sh

# Verify installation
ollama --version
```
## Download the Recommended Model
CommandLane recommends the Phi-3 Mini 4K Instruct model for optimal performance:
```
ollama pull phi3:3.8b-mini-4k-instruct-q4_K_M
```
Why this model?
- Size: ~2.4 GB (fits on most systems)
- Performance: Excellent balance of speed and quality
- Optimized: Quantized for efficient CPU inference
- Context: 4096 token context window
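Once the pull finishes, you can run a quick one-off prompt to confirm the model loads and responds (the prompt text here is just an example):

```bash
# Load the model, generate a short reply, then exit
ollama run phi3:3.8b-mini-4k-instruct-q4_K_M "Say hello in one sentence."
```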
### Alternative Models

Faster (smaller, less capable):

```
ollama pull phi3:mini    # 2.2 GB, faster inference
```

More capable (larger, slower):

```
ollama pull phi3:3.8b-mini-4k-instruct-q6_K    # 3.1 GB, higher quality
```
## Verify Ollama is Running

```
# Check Ollama status
curl http://localhost:11434

# List installed models
ollama list
```
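For a machine-readable check, Ollama also exposes an HTTP API: `GET /api/tags` returns the installed models as JSON. The example below assumes `jq` is available for pretty-printing, but it is optional:

```bash
# List installed model names via the HTTP API (equivalent to `ollama list`)
curl -s http://localhost:11434/api/tags | jq '.models[].name'
```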
## Configure CommandLane

### Automatic Detection (Default)
CommandLane automatically detects Ollama if it's running on http://localhost:11434. No configuration needed!
### Custom Base URL
If you're running Ollama on a different port or remote server:
- Open Dashboard (Ctrl + Shift + D)
- Go to Integrations
- Find the Ollama card
- Click Configure
- Set your custom base URL (e.g., `http://192.168.1.100:11434`)
- Click Save
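To confirm the custom URL is reachable from the machine running CommandLane, you can query it directly (the address below is just the example from the step above):

```bash
# Should return the Ollama model list if the base URL is correct
curl -s http://192.168.1.100:11434/api/tags
```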
## Select Ollama as Provider
In Dashboard > Settings > AI Settings:
### For Classification
- Under Classification, select Provider: Ollama (Local)
- Model will be auto-selected from your installed models
### For Planning
- Under Planning, select Provider: Ollama (Local)
- Model will be auto-selected
### For Chat
- Under Ask Feature, select Provider: Ollama (Local)
- Choose your model from the dropdown
### Recommended Models
| Model | Size | Speed | Quality |
|---|---|---|---|
| phi3:mini | 2.2 GB | Fastest | Good |
| phi3:3.8b-mini-4k-instruct-q4_K_M | 2.4 GB | Fast | Better |
| phi3:3.8b-mini-4k-instruct-q6_K | 3.1 GB | Medium | Best |
## Troubleshooting

### "Ollama integration not available"

Cause: Ollama server is not running or not reachable.

Solutions:

- Verify Ollama is running:

  ```
  curl http://localhost:11434
  ```

- Start the Ollama service:
  - Windows: Ollama runs as a service (check the system tray)
  - macOS/Linux: Run `ollama serve`

- Check firewall settings (ensure port 11434 is not blocked)
"No phi3 models found"
Cause: Recommended model not installed.
Solution:
ollama pull phi3:3.8b-mini-4k-instruct-q4_K_M
### Model inference is slow

Solutions:

- Use a GPU: Ollama automatically detects NVIDIA GPUs
  - Verify GPU usage: `ollama ps` (check the PROCESSOR column)
  - Update GPU drivers if the GPU is not detected

- Use a smaller model:

  ```
  ollama pull phi3:mini
  ```
"Connection failed" with custom base URL
Checklist:
- Ensure Ollama is running on the target machine
- Verify network connectivity:
curl http://<ip>:11434 - Check firewall rules allow incoming connections
- Make sure URL has no trailing slash
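To work through the checklist quickly, check that the server is actually listening on the port, then test reachability from the CommandLane machine. This is a sketch for a Linux server; `ss` comes from iproute2 and may not be present everywhere:

```bash
# On the Ollama server: confirm something is listening on port 11434.
# If it shows only 127.0.0.1:11434, Ollama is bound to localhost;
# set OLLAMA_HOST=0.0.0.0:11434 (see "Remote Ollama Setup" below).
ss -tlnp | grep 11434

# From the CommandLane machine: test reachability with a short timeout
curl --connect-timeout 5 http://<ip>:11434
```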
## Performance Tips

### For Fast Classification

- Model: `phi3:mini` or `phi3:3.8b-mini-4k-instruct-q4_K_M`
- Temperature: 0.05 (deterministic)

### For Planning

- Model: `phi3:3.8b-mini-4k-instruct-q4_K_M`
- Temperature: 0.1

### For Chat

- Model: Any phi3 variant or larger models
- Temperature: 0.7 (balanced creativity)
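These temperatures are set in CommandLane's AI Settings, but you can try the effect directly against Ollama's standard `/api/generate` endpoint, which accepts a temperature in the `options` field (the model and prompt below are only examples):

```bash
# Low temperature (near-deterministic), as suggested for classification
curl -s http://localhost:11434/api/generate -d '{
  "model": "phi3:3.8b-mini-4k-instruct-q4_K_M",
  "prompt": "Classify this command as read-only or destructive: rm -rf /tmp/cache",
  "stream": false,
  "options": { "temperature": 0.05 }
}'
```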
## GPU Acceleration

If you have an NVIDIA GPU:

- Verify the GPU is detected:

  ```
  ollama ps    # Check that the PROCESSOR column shows "gpu"
  ```

- If running CPU-only, install or update CUDA drivers from NVIDIA
- GPU memory should be 4 GB+ for phi3 models
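If `ollama ps` reports CPU only, `nvidia-smi` (installed with the NVIDIA driver) is a quick way to confirm the driver works and to check available VRAM:

```bash
# Shows driver/CUDA version and GPU memory; failure here means the
# NVIDIA driver is not installed correctly
nvidia-smi

# While a model is loaded, the ollama process should appear here using VRAM
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
```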
## Managing Disk Space

Models can consume significant disk space:

```
# List installed models with sizes
ollama list

# Remove unused models
ollama rm <model-name>
```
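Models live in Ollama's model directory, which defaults to `~/.ollama/models` for per-user installs (the Linux install script runs Ollama as a service and typically stores models under `/usr/share/ollama/.ollama/models`). To see the total footprint, assuming the per-user default:

```bash
# Total disk usage of the Ollama model store (per-user default path)
du -sh ~/.ollama/models
```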
## Remote Ollama Setup (Advanced)

### Server Setup

On your server machine:

```
# Allow external connections
export OLLAMA_HOST=0.0.0.0:11434

# Start Ollama
ollama serve
```
Only expose Ollama on trusted networks. It has no authentication.
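If you do expose it, consider restricting the port to your LAN at the firewall. This is a sketch for Ubuntu's `ufw`, assuming a 192.168.1.0/24 subnet; adjust for your network and firewall tool:

```bash
# Allow Ollama's port only from the local subnet (example subnet)
sudo ufw allow from 192.168.1.0/24 to any port 11434 proto tcp

# Verify the rule
sudo ufw status numbered
```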
### Client Configuration

On your CommandLane machine:

- Open Dashboard > Integrations
- Configure Ollama with Base URL: `http://<server-ip>:11434`
- Save and verify the connection
## Privacy
With Ollama:
- 100% local: All AI processing happens on your machine
- No data sent: Nothing is transmitted to external servers
- Offline ready: Works without any internet connection
This is the most private option for AI features in CommandLane.