---
title: 'Ollama Setup'
description: 'Use local LLM models with Ollama instead of OpenAI'
---
# Ollama Setup Guide
T6 Mem0 v2 supports both OpenAI and Ollama as LLM providers. Use Ollama to run models entirely locally, with no OpenAI API credits required.
## Why Ollama?
- **Cost-effective**: No API costs, run models locally
- **Privacy**: All data stays on your infrastructure
- **Offline capability**: Works without internet connection
- **Model variety**: Access to Llama, Mistral, and other open-source models
## Prerequisites
- Docker and Docker Compose (for containerized deployment), or Ollama installed locally on the host
- Sufficient RAM (8GB+ for smaller models, 16GB+ recommended)
- GPU optional but recommended for better performance
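A quick way to sanity-check RAM and GPU on a Linux host before pulling models (commands differ on macOS):
```bash
free -h       # total and available RAM
nvidia-smi    # GPU model and VRAM (NVIDIA only; fails if no GPU or driver is present)
```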
## Installation
### Option 1: Ollama on Host Machine
**Install Ollama:**
```bash
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# macOS
brew install ollama
# Or download from https://ollama.com/download
```
**Start Ollama service:**
```bash
ollama serve
```
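Ollama listens on port 11434 by default; a quick check that the service is up:
```bash
curl http://localhost:11434/api/tags
```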
**Pull required models:**
```bash
# LLM model (choose one)
ollama pull llama3.1:8b # 8B parameters, 4.7GB
ollama pull llama3.1:70b # 70B parameters, 40GB (requires 48GB RAM)
ollama pull mistral:7b # 7B parameters, 4.1GB
# Embedding model (required)
ollama pull nomic-embed-text # 274MB
```
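Confirm both models were pulled:
```bash
ollama list
```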
### Option 2: Ollama in Docker
**Add to docker-compose.yml:**
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    container_name: t6-ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    networks:
      - localai
    restart: unless-stopped

volumes:
  ollama_data:
```
**Pull models inside container:**
```bash
docker exec -it t6-ollama ollama pull llama3.1:8b
docker exec -it t6-ollama ollama pull nomic-embed-text
```
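Confirm the models are available inside the container:
```bash
docker exec -it t6-ollama ollama list
```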
## Configuration
### Environment Variables
Update your `.env` file:
```bash
# Switch to Ollama
LLM_PROVIDER=ollama
EMBEDDER_PROVIDER=ollama
# Ollama configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.1:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# OpenAI key no longer required
# OPENAI_API_KEY= # Can be left empty
```
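Before restarting mem0, it is worth confirming that the configured models are actually available at `OLLAMA_BASE_URL` (a minimal check, assuming the defaults above):
```bash
curl -s http://localhost:11434/api/tags | grep -E 'llama3.1:8b|nomic-embed-text'
```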
### Docker Network Configuration
If running Ollama in Docker on the same network as mem0:
```bash
# Find Ollama container IP
docker inspect t6-ollama --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'
# Update .env
OLLAMA_BASE_URL=http://172.21.0.15:11434 # Use actual container IP
```
Or use the Docker service name (preferred, since container IPs can change between restarts):
```bash
OLLAMA_BASE_URL=http://ollama:11434 # If on same Docker network
```
## Model Selection
### LLM Models
| Model | Size | RAM Required | Use Case |
|-------|------|--------------|----------|
| `llama3.1:8b` | 4.7GB | 8GB | General purpose, fast |
| `llama3.1:70b` | 40GB | 48GB | High quality responses |
| `mistral:7b` | 4.1GB | 8GB | Fast, efficient |
| `codellama:7b` | 3.8GB | 8GB | Code generation |
| `phi3:3.8b` | 2.3GB | 4GB | Smallest viable model |
### Embedding Models
| Model | Size | Dimensions | Use Case |
|-------|------|------------|----------|
| `nomic-embed-text` | 274MB | 768 | Recommended, fast |
| `mxbai-embed-large` | 669MB | 1024 | Higher quality |
| `all-minilm` | 46MB | 384 | Smallest option |
**Important**: Update `MEM0_EMBEDDING_DIMS` to match your embedding model:
```bash
# For nomic-embed-text
MEM0_EMBEDDING_DIMS=768
# For mxbai-embed-large
MEM0_EMBEDDING_DIMS=1024
# For all-minilm
MEM0_EMBEDDING_DIMS=384
```
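If you are unsure of a model's dimensions, you can ask Ollama for an embedding and count the vector length yourself (a quick sketch, assuming `jq` is installed):
```bash
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "dimension check"}' \
  | jq '.embedding | length'
# Expected output for nomic-embed-text: 768
```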
## Switching Between OpenAI and Ollama
### Full Ollama Configuration
```bash
LLM_PROVIDER=ollama
EMBEDDER_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.1:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
MEM0_EMBEDDING_DIMS=768
```
### Hybrid Configuration
Use Ollama for the LLM but OpenAI for embeddings:
```bash
LLM_PROVIDER=ollama
EMBEDDER_PROVIDER=openai
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.1:8b
OPENAI_API_KEY=sk-your-key
MEM0_EMBEDDING_DIMS=1536 # OpenAI dimensions
```
Or use OpenAI for the LLM but Ollama for embeddings:
```bash
LLM_PROVIDER=openai
EMBEDDER_PROVIDER=ollama
OPENAI_API_KEY=sk-your-key
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
MEM0_EMBEDDING_DIMS=768 # Ollama dimensions
```
### Back to OpenAI
```bash
LLM_PROVIDER=openai
EMBEDDER_PROVIDER=openai
OPENAI_API_KEY=sk-your-key
MEM0_EMBEDDING_DIMS=1536
```
## Deployment
### Docker Deployment with Ollama
**Complete docker-compose.yml:**
```yaml
version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    container_name: t6-ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    networks:
      - localai
    restart: unless-stopped

  mcp-server:
    build:
      context: .
      dockerfile: docker/Dockerfile.mcp
    container_name: t6-mem0-mcp
    restart: unless-stopped
    ports:
      - "8765:8765"
    environment:
      - LLM_PROVIDER=ollama
      - EMBEDDER_PROVIDER=ollama
      - OLLAMA_BASE_URL=http://ollama:11434
      - OLLAMA_LLM_MODEL=llama3.1:8b
      - OLLAMA_EMBEDDING_MODEL=nomic-embed-text
      - MEM0_EMBEDDING_DIMS=768
      - SUPABASE_CONNECTION_STRING=${SUPABASE_CONNECTION_STRING}
      - NEO4J_URI=neo4j://neo4j:7687
      - NEO4J_USER=${NEO4J_USER}
      - NEO4J_PASSWORD=${NEO4J_PASSWORD}
    depends_on:
      - ollama
      - neo4j
    networks:
      - localai

  neo4j:
    image: neo4j:5.26.0
    container_name: t6-neo4j
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      - NEO4J_AUTH=neo4j/${NEO4J_PASSWORD}
    volumes:
      - neo4j_data:/data
    networks:
      - localai

volumes:
  ollama_data:
  neo4j_data:

networks:
  localai:
    external: true
```
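The compose file declares `localai` as an external network, so it must already exist before the first `docker compose up`. Create it once if needed:
```bash
docker network create localai
```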
**Startup sequence:**
```bash
# Start services
docker compose up -d
# Pull models
docker exec -it t6-ollama ollama pull llama3.1:8b
docker exec -it t6-ollama ollama pull nomic-embed-text
# Verify Ollama is working
curl http://localhost:11434/api/tags
# Restart mem0 services to pick up models
docker compose restart mcp-server
```
## Testing
### Test Ollama Connection
```bash
# List available models
curl http://localhost:11434/api/tags
# Test generation
curl http://localhost:11434/api/generate -d '{
"model": "llama3.1:8b",
"prompt": "Hello, world!",
"stream": false
}'
```
### Test Memory Operations
```bash
# Add memory via REST API
curl -X POST http://localhost:8080/v1/memories/ \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "I love local AI models"},
{"role": "assistant", "content": "Noted!"}
],
"user_id": "test_user"
}'
# Check logs for Ollama usage
docker logs t6-mem0-mcp --tail 50
```
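The exact log lines depend on the mem0/MCP server version; grepping for the provider name is a quick sanity check that requests are going to Ollama rather than OpenAI:
```bash
docker logs t6-mem0-mcp 2>&1 | grep -i ollama
```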
## Performance Tuning
### GPU Acceleration
If you have an NVIDIA GPU:
```yaml
ollama:
  image: ollama/ollama:latest
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
```
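Assuming the NVIDIA Container Toolkit is installed on the host, you can confirm the GPU is visible from inside the container:
```bash
docker exec -it t6-ollama nvidia-smi
```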
### Model Caching
Models are cached in the `ollama_data` volume. To clear the cache:
```bash
docker volume rm ollama_data
```
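Note that Docker Compose typically prefixes named volumes with the project name, so the volume may not be called exactly `ollama_data`; list volumes first to find the actual name:
```bash
docker volume ls | grep ollama
# then remove the matching volume, e.g.
# docker volume rm <project>_ollama_data
```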
### Concurrent Requests
Ollama queues and serves concurrent requests on its own; for higher throughput under load, tune the parallelism settings:
```yaml
ollama:
  environment:
    - OLLAMA_NUM_PARALLEL=4        # Number of parallel requests per model
    - OLLAMA_MAX_LOADED_MODELS=2   # Keep up to two models loaded in memory
```
## Troubleshooting
### Ollama Not Responding
```bash
# Check Ollama status
curl http://localhost:11434/api/tags
# Check logs
docker logs t6-ollama
# Restart Ollama
docker restart t6-ollama
```
### Model Not Found
```bash
# List pulled models
docker exec -it t6-ollama ollama list
# Pull missing model
docker exec -it t6-ollama ollama pull llama3.1:8b
```
### Out of Memory
Try a smaller model:
```bash
# Switch to smaller model in .env
OLLAMA_LLM_MODEL=phi3:3.8b
# Or use quantized version
OLLAMA_LLM_MODEL=llama3.1:8b-q4_0 # 4-bit quantization (check https://ollama.com/library/llama3.1 for the exact quantized tags available)
```
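To see how much memory the Ollama container actually uses before and after switching models:
```bash
docker stats t6-ollama --no-stream
```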
### Slow Response Times
- Use GPU acceleration
- Use smaller models (phi3:3.8b)
- Reduce concurrent requests
- Check system resources (RAM, CPU)
### Connection Refused
If mem0 can't connect to Ollama:
```bash
# Test from mem0 container
docker exec -it t6-mem0-mcp curl http://ollama:11434/api/tags
# Check both containers on same network
docker network inspect localai
```
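If one of the containers is missing from the `localai` network, attaching it manually usually resolves the issue (a sketch using the container names from this guide):
```bash
docker network connect localai t6-mem0-mcp
```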
## Migration from OpenAI
### 1. Pull Models
```bash
ollama pull llama3.1:8b
ollama pull nomic-embed-text
```
### 2. Update Configuration
```bash
# Backup current .env
cp .env .env.openai.backup
# Update .env
LLM_PROVIDER=ollama
EMBEDDER_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.1:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
MEM0_EMBEDDING_DIMS=768 # Changed from 1536
```
### 3. Clear Existing Embeddings (Important!)
When switching embedding models you must clear existing embeddings, because the vector dimensions change from 1536 (OpenAI) to 768 (Ollama's `nomic-embed-text`) and old vectors are incompatible with the new model.
```bash
# Clear Supabase embeddings
psql $SUPABASE_CONNECTION_STRING -c "DELETE FROM t6_memories;"
# Clear Neo4j graph
docker exec -it t6-neo4j cypher-shell -u neo4j -p YOUR_PASSWORD \
"MATCH (n) DETACH DELETE n"
```
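To confirm both stores are empty before re-adding memories (using the table and container names from this guide):
```bash
psql $SUPABASE_CONNECTION_STRING -c "SELECT count(*) FROM t6_memories;"
docker exec -it t6-neo4j cypher-shell -u neo4j -p YOUR_PASSWORD \
  "MATCH (n) RETURN count(n)"
```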
### 4. Restart Services
```bash
docker compose restart
```
### 5. Test
Add new memories and verify they are stored and retrieved correctly with Ollama.
## Next Steps
- Deploy with Ollama in Docker
- Compare OpenAI vs Ollama performance