Ollama Setup for GrepAI
This skill covers installing and configuring Ollama as the local embedding provider for GrepAI. Ollama enables 100% private code search where your code never leaves your machine.
When to Use This Skill
- Setting up GrepAI with local, private embeddings
- Installing Ollama for the first time
- Choosing and downloading embedding models
- Troubleshooting Ollama connection issues
Why Ollama?
| Benefit | Description |
|---------|-------------|
| Privacy | Code never leaves your machine |
| Free | No API costs |
| Fast | Local processing, no network latency |
| Offline | Works without internet |
Installation
macOS (Homebrew)
# Install Ollama
brew install ollama
# Start the Ollama service
ollama serve
macOS (Direct Download)
- Download from ollama.com
- Open the .dmg and drag Ollama to Applications
- Launch Ollama from Applications
Linux
# One-line installer
curl -fsSL https://ollama.com/install.sh | sh
# Start the service
ollama serve
Windows
- Download installer from ollama.com
- Run the installer
- Ollama starts automatically as a service
Downloading Embedding Models
GrepAI requires an embedding model to convert code into vectors.
Recommended Model: nomic-embed-text
# Download the recommended model (768 dimensions)
ollama pull nomic-embed-text
Specifications:
- Dimensions: 768
- Size: ~274 MB
- Performance: Excellent for code search
- Language: English-optimized
Alternative Models
# Multilingual support (better for non-English code/comments)
ollama pull nomic-embed-text-v2-moe
# Larger, more accurate
ollama pull bge-m3
# Maximum quality
ollama pull mxbai-embed-large
| Model | Dimensions | Size | Best For |
|-------|------------|------|----------|
| nomic-embed-text | 768 | 274 MB | General code search |
| nomic-embed-text-v2-moe | 768 | 500 MB | Multilingual codebases |
| bge-m3 | 1024 | 1.2 GB | Large codebases |
| mxbai-embed-large | 1024 | 670 MB | Maximum accuracy |
Verifying Installation
Check Ollama is Running
# Check if Ollama server is responding
curl http://localhost:11434/api/tags
# Expected output: JSON with available models
List Downloaded Models
ollama list
# Output:
# NAME ID SIZE MODIFIED
# nomic-embed-text:latest abc123... 274 MB 2 hours ago
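The listing above can also be checked from a script. The following is a minimal sketch, assuming the first column of `ollama list` is `NAME[:TAG]` and the first line is a header row; `has_model` is a hypothetical helper name, not part of Ollama or GrepAI.

```shell
# Hypothetical helper: succeeds if the model named in $1 appears in
# `ollama list` output supplied on stdin.
# Assumption: column 1 is NAME[:TAG] and line 1 is the header row.
has_model() {
  awk -v want="$1" '
    NR > 1 {
      split($1, parts, ":")   # parts[1] is the name without its tag
      if ($1 == want || parts[1] == want) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage (needs a working Ollama install, so it is left commented out):
#   ollama list | has_model nomic-embed-text || ollama pull nomic-embed-text
```

Reading the listing from stdin keeps the helper independent of a running server, so it can be tried against saved output.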
Test Embedding Generation
# Quick test (should return embedding vector)
curl http://localhost:11434/api/embeddings -d '{
"model": "nomic-embed-text",
"prompt": "function hello() { return world; }"
}'
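To confirm that the returned vector has the expected length, the response can be piped through a small counter. This is a rough sketch, assuming the response carries a single flat `"embedding"` array of numbers; `embedding_dims` is a hypothetical helper, and the text processing is a stand-in for a real JSON parser.

```shell
# Hypothetical helper: prints how many numbers the "embedding" array of a
# JSON response (read from stdin) contains; prints 0 if no array is found.
# Assumption: the array is flat and holds only numbers.
embedding_dims() {
  tr -d ' \n' |
    sed -n 's/.*"embedding":\[\([^]]*\)].*/\1/p' |
    tr ',' '\n' |
    grep -c .
}

# Usage (needs a running Ollama server, so it is left commented out):
#   curl -s http://localhost:11434/api/embeddings \
#     -d '{"model": "nomic-embed-text", "prompt": "hello"}' | embedding_dims
```

If the model matches the specifications listed earlier, the count should equal the model's dimension figure (768 for nomic-embed-text).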
Configuring GrepAI for Ollama
After installing Ollama, configure GrepAI to use it:
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
This is the default configuration when you run grepai init, so no changes are needed if using nomic-embed-text.
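Because the configured model must already be pulled, it can help to read the model name straight from the config file. The sketch below assumes the file uses plain `key: value` lines as in the example above; `grepai_config_value` is a hypothetical helper, not a GrepAI command, and it is not a real YAML parser.

```shell
# Hypothetical helper: prints the value of the first `key: value` line
# matching key $1 in file $2.
# Assumption: one scalar per line, no quoting or nesting tricks.
grepai_config_value() {
  local key="$1" file="$2"
  sed -n "s/^[[:space:]]*${key}:[[:space:]]*//p" "$file" | head -n 1
}

# Usage (left commented out because it needs Ollama and a GrepAI project):
#   model=$(grepai_config_value model .grepai/config.yaml)
#   ollama pull "$model"
```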
Running Ollama
Foreground (Development)
# Run in current terminal (see logs)
ollama serve
Background (macOS/Linux)
# Using nohup
nohup ollama serve &
# Or as a systemd service (Linux)
sudo systemctl enable ollama
sudo systemctl start ollama
Check Status
# Check if running
pgrep -f ollama
# Or test the API
curl -s http://localhost:11434/api/tags | head -1
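When Ollama is started in the background, the API may take a moment to come up. One way to handle this in a script is to poll the same endpoint until it answers. This is a minimal sketch; `wait_for_ollama` and its default values are assumptions made for illustration.

```shell
# Hypothetical helper: polls <base-url>/api/tags until it responds or the
# attempts run out.
# Arguments: base URL, max attempts, seconds to sleep between attempts.
wait_for_ollama() {
  local url="${1:-http://localhost:11434}" attempts="${2:-10}" delay="${3:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsS --max-time 2 "$url/api/tags" >/dev/null 2>&1; then
      return 0                      # server answered
    fi
    i=$((i + 1))
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1                          # never answered
}

# Usage (left commented out because it needs a GrepAI project):
#   wait_for_ollama http://localhost:11434 15 2 && grepai init
```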
Resource Considerations
Memory Usage
Embedding models load into RAM:
nomic-embed-text: ~500 MB RAM
bge-m3: ~1.5 GB RAM
mxbai-embed-large: ~1 GB RAM
CPU vs GPU
Ollama falls back to CPU when no supported GPU is available. For faster embeddings:
- macOS: Uses Metal (Apple Silicon) automatically
- Linux/Windows: Install CUDA for NVIDIA GPU support
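To see whether a loaded model is actually running on the GPU, the output of `ollama ps` can be inspected. The sketch below assumes that each row starts with `NAME[:TAG]` and that the processor column contains text such as `100% GPU` or `100% CPU`; the exact layout may differ between Ollama versions, and `model_on_gpu` is a hypothetical helper.

```shell
# Hypothetical helper: reads `ollama ps` output on stdin and succeeds if the
# row for model $1 mentions GPU (fully or partially offloaded).
# Assumption: line 1 is a header and column 1 is NAME[:TAG].
model_on_gpu() {
  awk -v want="$1" '
    NR > 1 {
      split($1, parts, ":")   # parts[1] is the name without its tag
      if (($1 == want || parts[1] == want) && /GPU/) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage (needs a running Ollama server with the model loaded):
#   ollama ps | model_on_gpu nomic-embed-text && echo "GPU in use"
```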
Common Issues
Problem: connection refused to localhost:11434
Solution: Start Ollama:
ollama serve
Problem: Model not found
Solution: Pull the model first:
ollama pull nomic-embed-text
Problem: Slow embedding generation
Solution:
- Use a smaller model
- Ensure Ollama is using GPU (check ollama ps)
- Close other memory-intensive applications
Problem: Out of memory
Solution: Use a smaller model or increase system RAM
Best Practices
- Start Ollama before GrepAI: Ensure ollama serve is running
- Use recommended model: nomic-embed-text offers the best balance of size and accuracy
- Keep Ollama running: Leave it as a background service
- Update periodically: Run ollama pull nomic-embed-text for updates
Output Format
After successful setup:
Ollama Setup Complete
Ollama Version: 0.1.x
Endpoint: http://localhost:11434
Model: nomic-embed-text (768 dimensions)
Status: Running
GrepAI is ready to use with local embeddings.
Your code will never leave your machine.