Ollama Setup Guide

This guide walks you through installing Ollama and setting up local LLM features in CommandLane.

Why Ollama?

Ollama enables you to run AI models locally on your machine, providing:

  • Privacy: Your data never leaves your computer
  • No API costs: Free local inference (no OpenAI/Anthropic subscription needed)
  • Offline capability: Works without internet connection
  • Fast inference: Optimized for local hardware (CPU/GPU)

Installation

Windows

  1. Download the Ollama installer:

    # Visit https://ollama.com/download and download the Windows installer
    # Or use winget:
    winget install Ollama.Ollama
  2. Run the installer and follow the prompts

  3. Verify installation:

    ollama --version

macOS

# Using Homebrew (recommended)
brew install ollama

# Or download from https://ollama.com/download

Linux

# Install via curl
curl -fsSL https://ollama.ai/install.sh | sh

# Verify installation
ollama --version

Download the Recommended Model

CommandLane recommends the Phi-3 Mini 4K Instruct model (Q4_K_M quantization) for optimal performance:

ollama pull phi3:3.8b-mini-4k-instruct-q4_K_M

Why this model?

  • Size: ~2.4 GB (fits on most systems)
  • Performance: Excellent balance of speed and quality
  • Optimized: Quantized for efficient CPU inference
  • Context: 4096 token context window
  • Accuracy: Performs well on classification and planning tasks
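
If you want to sanity-check the model before wiring it into CommandLane, you can prompt it directly from the command line (the prompt below is only an illustration):

# One-off prompt to confirm the model loads and responds
ollama run phi3:3.8b-mini-4k-instruct-q4_K_M "Reply with the single word OK"

# Or start an interactive session (type /bye to exit)
ollama run phi3:3.8b-mini-4k-instruct-q4_K_M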

Alternative Models

If you prefer different tradeoffs:

Faster (smaller, less capable):

ollama pull phi3:mini  # 2.2 GB, faster inference

More capable (larger, slower):

ollama pull phi3:3.8b-mini-4k-instruct-q6_K  # 3.1 GB, higher quality

Verify Ollama is Running

# Check Ollama status
curl http://localhost:11434

# List installed models
ollama list

Expected output from the curl command:

Ollama is running
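
If you prefer a machine-readable check, the same port also serves Ollama's REST API:

# Report the server version as JSON
curl http://localhost:11434/api/version

# List installed models as JSON (same information as `ollama list`)
curl http://localhost:11434/api/tags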

Configure CommandLane

Option 1: Automatic Detection (Default)

CommandLane automatically detects Ollama if it's running on http://localhost:11434. No configuration needed!

Option 2: Custom Base URL

If you're running Ollama on a different port or remote server:

  1. Open the CommandLane dashboard
  2. Navigate to Settings → AI Settings → Advanced
  3. Set Ollama Base URL (e.g., http://192.168.1.100:11434)
  4. Click Save
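
Before saving, it helps to confirm the URL is reachable from the machine running CommandLane. A quick check against Ollama's API, substituting your own host:

# A JSON list of models means the base URL is correct and reachable
curl http://192.168.1.100:11434/api/tags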

Option 3: Configuration File

Edit pkb.config.json:

{
  "integrations": {
    "ollama": {
      "config": {
        "base_url": "http://localhost:11434"
      },
      "connected": true
    }
  }
}
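
After editing, you can confirm the file is still valid JSON (one option, assuming Python is installed):

# Pretty-prints the file if it parses, or reports the location of the syntax error
python -m json.tool pkb.config.json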

Select Ollama as Provider

For Classification

  1. Go to Settings → AI Settings
  2. Under Classification, select Provider: Ollama (Local)
  3. Model will be auto-selected (uses your installed phi3 model)

For Planning

  1. Go to Settings → AI Settings
  2. Under Planning, select Provider: Ollama (Local)
  3. Model will be auto-selected

For Chat

  1. Go to Settings → AI Settings
  2. Under Ask Feature, select Provider: Ollama (Local)
  3. Choose your model from the dropdown (shows all installed models)

Troubleshooting

"Ollama integration not available"

Cause: Ollama server is not running or not reachable.

Solutions:

  1. Verify Ollama is running:

    curl http://localhost:11434
  2. Start Ollama service:

    # Windows: Ollama runs as a service (check system tray)
    # macOS/Linux:
    ollama serve
  3. Check firewall settings (ensure port 11434 is not blocked)
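
If curl fails even though Ollama appears to be running, check whether anything is actually listening on port 11434 (commands vary by OS; these are common options):

# Linux
ss -ltn | grep 11434

# macOS (also works on Linux)
lsof -i :11434

# Windows
netstat -ano | findstr 11434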

"No phi3 models found"

Cause: Recommended model not installed.

Solution:

ollama pull phi3:3.8b-mini-4k-instruct-q4_K_M

Model inference is slow

Solutions:

  1. Use GPU: Ollama automatically detects NVIDIA GPUs

    • Verify GPU usage: ollama ps (check PROCESSOR column)
    • Update GPU drivers if not detected
  2. Use smaller model:

    ollama pull phi3:mini
  3. Reduce context window: In advanced settings, lower num_ctx
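
If you would rather bake a smaller context window into the model itself (instead of, or in addition to, the CommandLane setting), one option is a custom Modelfile; the name phi3-fast below is just an example:

# Create a variant of the recommended model with a 2048-token context window
cat > Modelfile <<'EOF'
FROM phi3:3.8b-mini-4k-instruct-q4_K_M
PARAMETER num_ctx 2048
EOF
ollama create phi3-fast -f Modelfile

# The new variant then appears in `ollama list` and in CommandLane's model dropdown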

"Connection failed" with custom base URL

Checklist:

  • Ensure Ollama is running on the target machine
  • Verify network connectivity: curl http://<ip>:11434
  • Check firewall rules allow incoming connections
  • Update CommandLane settings with correct URL (no trailing slash)
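
Two quick checks can narrow down whether the problem is reachability or a firewall rule (the ufw example assumes a Linux server using ufw):

# From the CommandLane machine: fail fast if the host is unreachable
curl --connect-timeout 5 http://<server-ip>:11434/api/tags

# On the Ollama server (Linux with ufw): allow inbound connections on port 11434
sudo ufw allow 11434/tcp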

Performance Tips

Optimal Settings for Phi3

For fast classification:

  • Model: phi3:mini or phi3:3.8b-mini-4k-instruct-q4_K_M
  • Temperature: 0.05 (near-deterministic output)
  • Tokens: 150 (short responses)

For planning:

  • Model: phi3:3.8b-mini-4k-instruct-q4_K_M
  • Temperature: 0.1
  • Tokens: 300 (structured JSON output)

For chat:

  • Model: Any phi3 variant or larger models
  • Temperature: 0.7 (balanced creativity)
  • Tokens: 2000 (longer conversations)
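
These settings map onto standard options in Ollama's generate API (num_predict is Ollama's name for the maximum number of output tokens), so you can experiment with them outside CommandLane as well; the prompt and values below are illustrative:

# Planning-style request: low temperature, capped output length
curl http://localhost:11434/api/generate -d '{
  "model": "phi3:3.8b-mini-4k-instruct-q4_K_M",
  "prompt": "Return a JSON plan with three steps for organizing notes.",
  "stream": false,
  "options": { "temperature": 0.1, "num_predict": 300 }
}'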

GPU Acceleration

If you have an NVIDIA GPU:

  1. Verify GPU is detected:

    ollama ps  # Check PROCESSOR column shows "gpu"
  2. If the PROCESSOR column shows CPU, install or update your NVIDIA drivers, then restart Ollama (see the check below)
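
On systems with an NVIDIA card, nvidia-smi is a quick way to confirm the driver is installed and the GPU is visible:

# Should print the driver/CUDA version and list your GPU
nvidia-smi

# After restarting Ollama, confirm the model runs on the GPU
ollama ps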

Managing Disk Space

Models can consume significant disk space:

# List installed models with sizes
ollama list

# Remove unused models
ollama rm <model-name>

# Example: Remove old model
ollama rm gemma2:2b

Remote Ollama Setup (Advanced)

Server Setup

On your server machine:

# Allow external connections
export OLLAMA_HOST=0.0.0.0:11434

# Start Ollama
ollama serve

Security Note

Only expose Ollama on trusted networks. It has no authentication.
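
On Linux installs that use the bundled systemd service, exporting OLLAMA_HOST in a shell only affects that shell. To make the setting persistent, add it to the service instead (a sketch based on Ollama's systemd setup):

# Open an override file for the Ollama service
sudo systemctl edit ollama.service

# Add these lines in the editor that opens:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0:11434"

# Reload and restart so the new environment takes effect
sudo systemctl daemon-reload
sudo systemctl restart ollama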

Client Configuration

On CommandLane machine:

  1. Settings → AI Settings → Advanced
  2. Set Ollama Base URL: http://<server-ip>:11434
  3. Save and verify connection

Next Steps

  • Explore the Dashboard → Integrations page for connection status
  • Test classification by capturing some text
  • Try the planning feature with Ctrl+Shift+P (customize tasks)
  • Use the Ask feature for chat interactions

Additional Resources