
Ollama Integration

Run AI models locally with Ollama for complete privacy and offline capability.

Why Ollama?

Ollama enables you to run AI models locally on your machine, providing:

  • Privacy: Your data never leaves your computer
  • No API costs: Free local inference (no OpenAI or Anthropic API key required)
  • Offline capability: Works without internet connection
  • Fast inference: Optimized for local hardware (CPU/GPU)

Installation

Windows

  1. Download the Ollama installer from ollama.com/download, or use winget:

    winget install Ollama.Ollama
  2. Run the installer and follow the prompts

  3. Verify installation:

    ollama --version

macOS

# Using Homebrew (recommended)
brew install ollama

# Or download from https://ollama.com/download

Linux

# Install via curl
curl -fsSL https://ollama.com/install.sh | sh

# Verify installation
ollama --version

Recommended Model

CommandLane recommends the Phi-3 Mini 4K Instruct model for optimal performance:

ollama pull phi3:3.8b-mini-4k-instruct-q4_K_M

Why this model?

  • Size: ~2.4 GB (fits on most systems)
  • Performance: Excellent balance of speed and quality
  • Optimized: Quantized for efficient CPU inference
  • Context: 4096 token context window
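
To confirm the model responds after pulling it, you can send a one-off prompt through Ollama's /api/generate endpoint. This is just a quick sanity check; the prompt below is illustrative:

# Quick sanity check: ask the model a trivial question (non-streaming)
curl http://localhost:11434/api/generate -d '{
  "model": "phi3:3.8b-mini-4k-instruct-q4_K_M",
  "prompt": "Reply with the single word: ready",
  "stream": false
}'

The JSON reply includes a "response" field containing the model's output.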

Alternative Models

Faster (smaller, less capable):

ollama pull phi3:mini  # 2.2 GB, faster inference

More capable (larger, slower):

ollama pull phi3:3.8b-mini-4k-instruct-q6_K  # 3.1 GB, higher quality

Verify Ollama is Running

# Check Ollama status
curl http://localhost:11434

# List installed models
ollama list
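
For a scripted check, the /api/tags endpoint returns the installed models as JSON. For example, to confirm a phi3 model is present:

# List installed model names from the API (compact JSON, so a simple grep works)
curl -s http://localhost:11434/api/tags | grep -o '"name":"[^"]*"'

If you have jq installed, piping through jq '.models[].name' gives cleaner output.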

Configure CommandLane

Automatic Detection (Default)

CommandLane automatically detects Ollama if it's running on http://localhost:11434. No configuration needed!

Custom Base URL

If you're running Ollama on a different port or remote server:

  1. Open Dashboard (Ctrl + Shift + D)
  2. Go to Integrations
  3. Find the Ollama card
  4. Click Configure
  5. Set your custom base URL (e.g., http://192.168.1.100:11434)
  6. Click Save
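
Before saving, it can help to confirm the custom endpoint is reachable from the machine running CommandLane (using the example address above):

# Should print "Ollama is running" if the endpoint is reachable
curl http://192.168.1.100:11434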

Select Ollama as Provider

In Dashboard > Settings > AI Settings:

For Classification

  • Under Classification, select Provider: Ollama (Local)
  • Model will be auto-selected from your installed models

For Planning

  • Under Planning, select Provider: Ollama (Local)
  • Model will be auto-selected

For Chat

  • Under Ask Feature, select Provider: Ollama (Local)
  • Choose your model from the dropdown

Model Comparison

Model                              | Size   | Speed   | Quality
phi3:mini                          | 2.2 GB | Fastest | Good
phi3:3.8b-mini-4k-instruct-q4_K_M  | 2.4 GB | Fast    | Better
phi3:3.8b-mini-4k-instruct-q6_K    | 3.1 GB | Medium  | Best

Troubleshooting

"Ollama integration not available"

Cause: Ollama server is not running or not reachable.

Solutions:

  1. Verify Ollama is running:

    curl http://localhost:11434
  2. Start Ollama service:

    • Windows: Ollama runs automatically in the background (check the system tray for the Ollama icon)
    • macOS/Linux: Run ollama serve
  3. Check firewall settings (ensure port 11434 is not blocked)
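
On macOS/Linux, a small sketch that starts the server only when the port is not already answering (assumes ollama is on your PATH):

# Start Ollama only if the API is not already responding
if ! curl -sf http://localhost:11434 >/dev/null; then
  ollama serve &
  sleep 2   # give the server a moment to bind port 11434
fi
curl http://localhost:11434   # should print "Ollama is running"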

"No phi3 models found"

Cause: Recommended model not installed.

Solution:

ollama pull phi3:3.8b-mini-4k-instruct-q4_K_M

Model inference is slow

Solutions:

  1. Use GPU: Ollama automatically detects NVIDIA GPUs

    • Verify GPU usage: ollama ps (check PROCESSOR column)
    • Update GPU drivers if not detected
  2. Use smaller model:

    ollama pull phi3:mini

"Connection failed" with custom base URL

Checklist:

  • Ensure Ollama is running on the target machine
  • Verify network connectivity: curl http://<ip>:11434
  • Check firewall rules allow incoming connections
  • Make sure URL has no trailing slash
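
To tell a firewall block apart from a server that simply isn't listening, run a verbose curl with a short timeout from the CommandLane machine (replace <ip> with your server's address):

# Verbose output shows whether the connection is refused, times out, or succeeds
curl -v --connect-timeout 5 http://<ip>:11434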

Performance Tips

For Fast Classification

  • Model: phi3:mini or phi3:3.8b-mini-4k-instruct-q4_K_M
  • Temperature: 0.05 (near-deterministic)

For Planning

  • Model: phi3:3.8b-mini-4k-instruct-q4_K_M
  • Temperature: 0.1

For Chat

  • Model: Any phi3 variant or larger models
  • Temperature: 0.7 (balanced creativity)
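
These values correspond to the standard options.temperature parameter in Ollama's API. CommandLane applies the setting for you; the request below is only a sketch of what the parameter means at the API level, with an illustrative prompt:

# Low temperature keeps planning-style output focused and repeatable
curl http://localhost:11434/api/generate -d '{
  "model": "phi3:3.8b-mini-4k-instruct-q4_K_M",
  "prompt": "List three steps to archive old log files.",
  "stream": false,
  "options": { "temperature": 0.1 }
}'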

GPU Acceleration

If you have an NVIDIA GPU:

  1. Verify GPU is detected:

    ollama ps  # Check PROCESSOR column shows "gpu"
  2. If CPU-only, install/update CUDA drivers from NVIDIA

    • GPU memory should be 4GB+ for phi3 models
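
If you're unsure whether the driver is installed at all, nvidia-smi (shipped with the NVIDIA driver) reports the detected GPU and its memory. This is a generic driver check, not specific to Ollama:

# Show detected GPU, total and free memory
nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv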

Managing Disk Space

Models can consume significant disk space:

# List installed models with sizes
ollama list

# Remove unused models
ollama rm <model-name>
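
To see how much space the model store itself is using: by default on macOS/Linux, Ollama keeps model data under ~/.ollama/models (the location can be changed with the OLLAMA_MODELS environment variable):

# Total disk usage of the local model store (default location)
du -sh ~/.ollama/models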

Remote Ollama Setup (Advanced)

Server Setup

On your server machine:

# Allow external connections
export OLLAMA_HOST=0.0.0.0:11434

# Start Ollama
ollama serve

Security Note

Only expose Ollama on trusted networks. It has no authentication.
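
If you prefer not to expose the port at all, one alternative is an SSH tunnel from the CommandLane machine to the server; CommandLane can then keep the default localhost base URL (user and <server-ip> are placeholders):

# Forward the remote Ollama port to localhost over SSH
ssh -N -L 11434:localhost:11434 user@<server-ip>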

Client Configuration

On your CommandLane machine:

  1. Open Dashboard > Integrations
  2. Configure Ollama with Base URL: http://<server-ip>:11434
  3. Save and verify connection
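
To verify the connection from the client side, you can request the server's model list through the same base URL CommandLane will use:

# Should return JSON listing the models installed on the server
curl http://<server-ip>:11434/api/tags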

Privacy

With Ollama:

  • 100% local: All AI processing happens on your machine
  • No data sent: Nothing is transmitted to external servers
  • Offline ready: Works without any internet connection

This is the most private option for AI features in CommandLane.

Additional Resources