---
title: 'Ollama Setup'
description: 'Use local LLM models with Ollama instead of OpenAI'
---

# Ollama Setup Guide

T6 Mem0 v2 supports both OpenAI and Ollama as LLM providers. Use Ollama to run completely local models without requiring OpenAI API credits.

## Why Ollama?

- **Cost-effective**: No API costs, run models locally
- **Privacy**: All data stays on your infrastructure
- **Offline capability**: Works without internet connection
- **Model variety**: Access to Llama, Mistral, and other open-source models

## Prerequisites

- Docker and Docker Compose (if using containerized deployment)
- Or Ollama installed locally
- Sufficient RAM (8GB+ for smaller models, 16GB+ recommended)
- GPU optional but recommended for better performance

## Installation

### Option 1: Ollama on Host Machine

**Install Ollama:**

```bash
# Linux
curl -fsSL https://ollama.com/install.sh | sh

# macOS
brew install ollama
# Or download from https://ollama.com/download
```

**Start Ollama service:**

```bash
ollama serve
```

**Pull required models:**

```bash
# LLM model (choose one)
ollama pull llama3.1:8b      # 8B parameters, 4.7GB
ollama pull llama3.1:70b     # 70B parameters, 40GB (requires 48GB RAM)
ollama pull mistral:7b       # 7B parameters, 4.1GB

# Embedding model (required)
ollama pull nomic-embed-text # 274MB
```

### Option 2: Ollama in Docker

**Add to docker-compose.yml:**

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    container_name: t6-ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    networks:
      - localai
    restart: unless-stopped

volumes:
  ollama_data:
```

**Pull models inside container:**

```bash
docker exec -it t6-ollama ollama pull llama3.1:8b
docker exec -it t6-ollama ollama pull nomic-embed-text
```

## Configuration

### Environment Variables

Update your `.env` file:

```bash
# Switch to Ollama
LLM_PROVIDER=ollama
EMBEDDER_PROVIDER=ollama

# Ollama configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.1:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text

# OpenAI key no longer required
# OPENAI_API_KEY=   # Can be left empty
```

### Docker Network Configuration

If running Ollama in Docker on the same network as mem0:

```bash
# Find Ollama container IP
docker inspect t6-ollama --format='{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}'

# Update .env
OLLAMA_BASE_URL=http://172.21.0.15:11434  # Use actual container IP
```

Or use the Docker service name:

```bash
OLLAMA_BASE_URL=http://ollama:11434  # If on same Docker network
```

## Model Selection

### LLM Models

| Model | Size | RAM Required | Use Case |
|-------|------|--------------|----------|
| `llama3.1:8b` | 4.7GB | 8GB | General purpose, fast |
| `llama3.1:70b` | 40GB | 48GB | High quality responses |
| `mistral:7b` | 4.1GB | 8GB | Fast, efficient |
| `codellama:7b` | 3.8GB | 8GB | Code generation |
| `phi3:3.8b` | 2.3GB | 4GB | Smallest viable model |

### Embedding Models

| Model | Size | Dimensions | Use Case |
|-------|------|------------|----------|
| `nomic-embed-text` | 274MB | 768 | Recommended, fast |
| `mxbai-embed-large` | 669MB | 1024 | Higher quality |
| `all-minilm` | 46MB | 384 | Smallest option |

**Important**: Update `MEM0_EMBEDDING_DIMS` to match your embedding model:

```bash
# For nomic-embed-text
MEM0_EMBEDDING_DIMS=768

# For mxbai-embed-large
MEM0_EMBEDDING_DIMS=1024

# For all-minilm
MEM0_EMBEDDING_DIMS=384
```

## Switching Between OpenAI and Ollama

### Full Ollama Configuration

```bash
LLM_PROVIDER=ollama
EMBEDDER_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.1:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
MEM0_EMBEDDING_DIMS=768
```
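A mismatched `MEM0_EMBEDDING_DIMS` typically surfaces only later as vector-store errors, so it is worth confirming the value against what the embedding model actually returns. A quick check against Ollama's `/api/embeddings` endpoint (illustrative only; assumes `jq` is installed and Ollama is reachable on `localhost:11434`):

```bash
# Ask Ollama for one embedding and print its length
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "dimension check"}' \
  | jq '.embedding | length'
```

The printed number should match `MEM0_EMBEDDING_DIMS` (768 for `nomic-embed-text`, 1024 for `mxbai-embed-large`, 384 for `all-minilm`).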
### Hybrid Configuration

Use Ollama for LLM but OpenAI for embeddings:

```bash
LLM_PROVIDER=ollama
EMBEDDER_PROVIDER=openai
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.1:8b
OPENAI_API_KEY=sk-your-key
MEM0_EMBEDDING_DIMS=1536  # OpenAI dimensions
```

Or use OpenAI for LLM but Ollama for embeddings:

```bash
LLM_PROVIDER=openai
EMBEDDER_PROVIDER=ollama
OPENAI_API_KEY=sk-your-key
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
MEM0_EMBEDDING_DIMS=768  # Ollama dimensions
```

### Back to OpenAI

```bash
LLM_PROVIDER=openai
EMBEDDER_PROVIDER=openai
OPENAI_API_KEY=sk-your-key
MEM0_EMBEDDING_DIMS=1536
```

## Deployment

### Docker Deployment with Ollama

**Complete docker-compose.yml:**

```yaml
version: '3.8'

services:
  ollama:
    image: ollama/ollama:latest
    container_name: t6-ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    networks:
      - localai
    restart: unless-stopped

  mcp-server:
    build:
      context: .
      dockerfile: docker/Dockerfile.mcp
    container_name: t6-mem0-mcp
    restart: unless-stopped
    ports:
      - "8765:8765"
    environment:
      - LLM_PROVIDER=ollama
      - EMBEDDER_PROVIDER=ollama
      - OLLAMA_BASE_URL=http://ollama:11434
      - OLLAMA_LLM_MODEL=llama3.1:8b
      - OLLAMA_EMBEDDING_MODEL=nomic-embed-text
      - MEM0_EMBEDDING_DIMS=768
      - SUPABASE_CONNECTION_STRING=${SUPABASE_CONNECTION_STRING}
      - NEO4J_URI=neo4j://neo4j:7687
      - NEO4J_USER=${NEO4J_USER}
      - NEO4J_PASSWORD=${NEO4J_PASSWORD}
    depends_on:
      - ollama
      - neo4j
    networks:
      - localai

  neo4j:
    image: neo4j:5.26.0
    container_name: t6-neo4j
    ports:
      - "7474:7474"
      - "7687:7687"
    environment:
      - NEO4J_AUTH=neo4j/${NEO4J_PASSWORD}
    volumes:
      - neo4j_data:/data
    networks:
      - localai

volumes:
  ollama_data:
  neo4j_data:

networks:
  localai:
    external: true
```

**Startup sequence:**

```bash
# Start services
docker compose up -d

# Pull models
docker exec -it t6-ollama ollama pull llama3.1:8b
docker exec -it t6-ollama ollama pull nomic-embed-text

# Verify Ollama is working
curl http://localhost:11434/api/tags

# Restart mem0 services to pick up models
docker compose restart mcp-server
```

## Testing

### Test Ollama Connection

```bash
# List available models
curl http://localhost:11434/api/tags

# Test generation
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Hello, world!",
  "stream": false
}'
```

### Test Memory Operations

```bash
# Add memory via REST API
curl -X POST http://localhost:8080/v1/memories/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "I love local AI models"},
      {"role": "assistant", "content": "Noted!"}
    ],
    "user_id": "test_user"
  }'

# Check logs for Ollama usage
docker logs t6-mem0-mcp --tail 50
```

## Performance Tuning

### GPU Acceleration

If you have an NVIDIA GPU:

```yaml
ollama:
  image: ollama/ollama:latest
  deploy:
    resources:
      reservations:
        devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
```

### Model Caching

Models are cached in the `ollama_data` volume. To clear the cache:

```bash
docker volume rm ollama_data
```

### Concurrent Requests

Ollama handles concurrent requests by default. For high load:

```yaml
ollama:
  environment:
    - OLLAMA_NUM_PARALLEL=4       # Number of parallel requests
    - OLLAMA_MAX_LOADED_MODELS=2  # Keep models in memory
```
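To check whether the parallel settings take effect, a rough smoke test is to fire several requests at once and compare the total wall time with running them one after another (plain shell and `curl`; an illustrative check, not a proper benchmark):

```bash
# Send 4 generation requests concurrently and measure total wall time
time (
  for i in 1 2 3 4; do
    curl -s http://localhost:11434/api/generate \
      -d '{"model": "llama3.1:8b", "prompt": "Reply with one word.", "stream": false}' \
      > /dev/null &
  done
  wait
)
```

If the total is roughly four times the duration of a single request, the requests are being queued rather than processed in parallel.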
## Troubleshooting

### Ollama Not Responding

```bash
# Check Ollama status
curl http://localhost:11434/api/tags

# Check logs
docker logs t6-ollama

# Restart Ollama
docker restart t6-ollama
```

### Model Not Found

```bash
# List pulled models
docker exec -it t6-ollama ollama list

# Pull missing model
docker exec -it t6-ollama ollama pull llama3.1:8b
```

### Out of Memory

Try a smaller model:

```bash
# Switch to smaller model in .env
OLLAMA_LLM_MODEL=phi3:3.8b

# Or use a quantized version
OLLAMA_LLM_MODEL=llama3.1:8b-q4_0  # 4-bit quantization
```

### Slow Response Times

- Use GPU acceleration
- Use smaller models (phi3:3.8b)
- Reduce concurrent requests
- Check system resources (RAM, CPU)

### Connection Refused

If mem0 can't connect to Ollama:

```bash
# Test from mem0 container
docker exec -it t6-mem0-mcp curl http://ollama:11434/api/tags

# Check that both containers are on the same network
docker network inspect localai
```

## Migration from OpenAI

### 1. Pull Models

```bash
ollama pull llama3.1:8b
ollama pull nomic-embed-text
```

### 2. Update Configuration

```bash
# Backup current .env
cp .env .env.openai.backup

# Update .env
LLM_PROVIDER=ollama
EMBEDDER_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.1:8b
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
MEM0_EMBEDDING_DIMS=768  # Changed from 1536
```

### 3. Clear Existing Embeddings (Important!)

When switching embedding models, you must clear existing embeddings because the vector dimensions change from 1536 (OpenAI) to 768 (nomic-embed-text).

```bash
# Clear Supabase embeddings
psql $SUPABASE_CONNECTION_STRING -c "DELETE FROM t6_memories;"

# Clear Neo4j graph
docker exec -it t6-neo4j cypher-shell -u neo4j -p YOUR_PASSWORD \
  "MATCH (n) DETACH DELETE n"
```

### 4. Restart Services

```bash
docker compose restart
```

### 5. Test

Add new memories and verify they work with Ollama.

## Next Steps

- Deploy with Ollama in Docker
- Compare OpenAI vs Ollama performance