# T6 Mem0 v2 - System Architecture

## Executive Summary

Comprehensive memory system for LLM applications based on mem0.ai, featuring MCP server integration, a REST API, hybrid storage (Supabase + Neo4j), and OpenAI embeddings.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
├──────────────────┬──────────────────┬───────────────────────┤
│ Claude Code (MCP)│  N8N Workflows   │    External Apps      │
└──────────────────┴──────────────────┴───────────────────────┘
         │                  │                    │
         ▼                  ▼                    ▼
┌─────────────────────────────────────────────────────────────┐
│                       Interface Layer                        │
├──────────────────────────────┬──────────────────────────────┤
│   MCP Server (Port 8765)     │    REST API (Port 8080)      │
│   - SSE Connections          │    - FastAPI                 │
│   - MCP Protocol             │    - OpenAPI Spec            │
│   - Tool Registration        │    - Auth Middleware         │
└──────────────────────────────┴──────────────────────────────┘
               │                              │
               └──────────────┬───────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                          Core Layer                          │
│                      Mem0 Core Library                       │
│   - Memory Management        - Embedding Generation          │
│   - Semantic Search          - Relationship Extraction       │
│   - Multi-Agent Support      - Deduplication                 │
└─────────────────────────────────────────────────────────────┘
                              │
         ┌────────────────────┼────────────────────┐
         ▼                    ▼                    ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│  Vector Store   │ │   Graph Store   │ │  External LLM   │
│    Supabase     │ │      Neo4j      │ │     OpenAI      │
│   (pgvector)    │ │    (Cypher)     │ │  (Embeddings)   │
│  172.21.0.12    │ │   172.21.0.x    │ │   API Cloud     │
└─────────────────┘ └─────────────────┘ └─────────────────┘
```

## Design Decisions

### 1. MCP Server Approach: Custom Implementation ✅

**Decision**: Build a custom MCP server using the mem0 core library with Supabase + Neo4j

**Rationale**:
- Official OpenMemory MCP uses Qdrant (requirement is Supabase)
- Community implementations provide good templates but need customization
- Custom build ensures exact stack matching and full control

**Implementation**:
- Python-based MCP server using the `mcp` library
- SSE (Server-Sent Events) for MCP protocol communication
- Shares mem0 configuration with the REST API

### 2. Storage Architecture: Hybrid Multi-Store ✅

**Vector Storage** (Supabase + pgvector):
- Semantic search via cosine similarity
- 1536-dimensional embeddings (OpenAI text-embedding-3-small)
- PostgreSQL with pgvector extension
- Connection: `172.21.0.12:5432`

**Graph Storage** (Neo4j):
- Relationship modeling between memory nodes
- Entity extraction and connection
- Visual exploration via Neo4j Browser
- New container on the localai network

**Key-Value Storage** (PostgreSQL JSONB):
- Metadata storage in Supabase
- Eliminates the need for a separate Redis
- Simplifies infrastructure

**Why This Works**:
- Mem0's hybrid architecture expects multiple stores
- Each store is optimized for specific query patterns
- Supabase handles both vector and structured data
- Neo4j specializes in relationship queries

### 3. API Layer Design: Dual Interface Pattern ✅

**REST API**:
- FastAPI framework
- Port: 8080
- Authentication: Bearer token
- OpenAPI documentation
- CRUD operations on memories

**MCP Server**:
- Port: 8765
- MCP protocol (SSE transport)
- Tool-based interface
- Compatible with Claude, Cursor, etc.

**Shared Core** (see the sketch below):
- Both use the same mem0 configuration
- Single source of truth for storage
- Consistent behavior across interfaces
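The shared-core pattern can be made concrete with a short sketch. This is a minimal illustration only, assuming the mem0 `Memory` class configured as shown later under Component Details; the request model, endpoint shapes, and the `build_config()` helper are placeholders rather than the final implementation.

```python
"""Illustrative sketch of the shared-core pattern: one mem0 Memory instance
backing the REST API. Endpoint shapes and build_config() are assumptions."""
from fastapi import FastAPI
from pydantic import BaseModel
from mem0 import Memory

from config import build_config  # hypothetical helper returning the mem0 config dict

# Single mem0 instance - the "single source of truth" for both interfaces
memory = Memory.from_config(config_dict=build_config())

app = FastAPI(title="T6 Mem0 v2 REST API")


class AddMemoryRequest(BaseModel):
    text: str
    user_id: str


@app.post("/v1/memories/")
def add_memory(req: AddMemoryRequest):
    # mem0 generates the embedding, writes to Supabase, and updates the Neo4j graph
    return memory.add(req.text, user_id=req.user_id)


@app.get("/v1/memories/search")
def search_memories(query: str, user_id: str, limit: int = 5):
    # Vector similarity search, enriched with graph context by mem0
    return memory.search(query, user_id=user_id, limit=limit)
```

The MCP server would construct (or import) the same `memory` object from the same configuration, which is what keeps behavior consistent across both interfaces.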
### 4. Docker Networking: LocalAI Network Integration ✅

**Network**: `localai` (172.21.0.0/16)

**Services**:
```yaml
Existing:
  - Supabase: 172.21.0.12:5432
  - N8N: (existing container)

New:
  - Neo4j: 172.21.0.x:7687 (Bolt) + :7474 (Browser)
  - REST API: 172.21.0.x:8080
  - MCP Server: 172.21.0.x:8765
```

**Benefits**:
- All services on the same network
- Direct container-to-container communication
- No host networking complications
- Persistent IPs via Docker Compose

### 5. Phase 1 vs Phase 2: Provider Abstraction ✅

**Phase 1** (OpenAI):
```python
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"
        }
    }
}
```

**Phase 2** (Ollama):
```python
config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.1:8b",
            "ollama_base_url": "http://172.21.0.1:11434"
        }
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text"
        }
    }
}
```

**Strategy**:
- Configuration-driven provider selection
- Environment variable overrides
- No code changes for provider swap
- Mem0 natively supports both providers

## Component Details

### Mem0 Configuration

```python
from mem0 import Memory

config = {
    # Vector Store
    "vector_store": {
        "provider": "supabase",
        "config": {
            "connection_string": "postgresql://user:pass@172.21.0.12:5432/postgres",
            "collection_name": "t6_memories",
            "embedding_model_dims": 1536,
            "index_method": "hnsw",
            "index_measure": "cosine_distance"
        }
    },

    # Graph Store
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://172.21.0.x:7687",
            "username": "neo4j",
            "password": "${NEO4J_PASSWORD}"
        }
    },

    # LLM Provider
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1,
            "max_tokens": 2000
        }
    },

    # Embedder
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "embedding_dims": 1536
        }
    },

    # Version
    "version": "v1.1"
}

memory = Memory.from_config(config_dict=config)
```

### REST API Endpoints

```
POST   /v1/memories/            - Add new memory
GET    /v1/memories/{id}        - Get specific memory
GET    /v1/memories/search      - Search memories
PATCH  /v1/memories/{id}        - Update memory
DELETE /v1/memories/{id}        - Delete memory
GET    /v1/memories/user/{id}   - Get user memories
GET    /v1/health               - Health check
GET    /v1/stats                - System statistics
```

### MCP Server Tools

```
add_memory          - Add new memory to system
search_memories     - Search memories by query
get_memory          - Retrieve specific memory
update_memory       - Update existing memory
delete_memory       - Remove memory
list_user_memories  - List all memories for user
get_memory_graph    - Visualize memory relationships
```

## Data Flow

### Adding a Memory

```
1. Client → MCP/REST API
   POST memory data with user_id

2. Interface Layer → Mem0 Core
   Validate and process request

3. Mem0 Core → OpenAI
   Generate embeddings (1536-dim vector)

4. Mem0 Core → Supabase
   Store vector + metadata in PostgreSQL

5. Mem0 Core → Neo4j
   Extract entities and relationships
   Create nodes and edges in graph

6. Response → Client
   Return memory_id and confirmation
```

### Searching Memories

```
1. Client → MCP/REST API
   Search query + filters (user_id, etc.)

2. Interface Layer → Mem0 Core
   Process search request

3. Mem0 Core → OpenAI
   Generate query embedding

4. Mem0 Core → Supabase
   Vector similarity search (cosine)
   Retrieve top-k matches

5. Mem0 Core → Neo4j
   Fetch related graph context
   Enrich results with relationships

6. Response → Client
   Ranked results with relevance scores
```
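To make the tool surface and data flow concrete, here is a minimal sketch of the MCP server side, assuming the official Python MCP SDK's `FastMCP` helper and the same shared mem0 instance. Tool names follow the list above, but the exact signatures and the `build_config()` helper are illustrative assumptions, not the final implementation.

```python
"""Illustrative MCP server sketch (not the final implementation).
Assumes the official Python MCP SDK (`mcp`) and the mem0 core library;
tool signatures and build_config() are placeholders."""
from mcp.server.fastmcp import FastMCP
from mem0 import Memory

from config import build_config  # hypothetical helper returning the config dict above

memory = Memory.from_config(config_dict=build_config())
mcp = FastMCP("t6-mem0")  # port 8765 would be set via server settings; default differs


@mcp.tool()
def add_memory(text: str, user_id: str) -> dict:
    """Add a new memory: embed the text, store it in Supabase, update the Neo4j graph."""
    return memory.add(text, user_id=user_id)


@mcp.tool()
def search_memories(query: str, user_id: str, limit: int = 5) -> dict:
    """Search memories: vector similarity in Supabase enriched with graph context."""
    return memory.search(query, user_id=user_id, limit=limit)


@mcp.tool()
def delete_memory(memory_id: str) -> dict:
    """Remove a memory from both the vector store and the graph."""
    return memory.delete(memory_id=memory_id)


if __name__ == "__main__":
    mcp.run(transport="sse")  # SSE transport so Claude Code and other clients connect over HTTP
```

Running with SSE transport matches the Interface Layer design, where MCP clients connect to the server over HTTP on port 8765.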
## Performance Characteristics

Based on mem0.ai research findings:

- **Accuracy**: 26% improvement over baseline OpenAI
- **Latency**: 91% lower p95 than full-context approaches
- **Token Efficiency**: 90% reduction via selective memory retrieval
- **Storage**: Hybrid approach optimal for different query patterns

## Security Considerations

### Authentication
- Bearer token authentication for the REST API
- MCP server uses client-specific SSE endpoints
- Tokens stored in environment variables

### Data Privacy
- All data stored locally (Supabase + Neo4j)
- No cloud sync or external storage
- User isolation via user_id filtering

### Network Security
- Services on a private Docker network
- No public exposure (use a reverse proxy if needed)
- Internal communication only

## Scalability Considerations

### Horizontal Scaling
- REST API: multiple containers behind a load balancer
- MCP Server: dedicated instances per client group
- Mem0 Core: stateless, scales with API containers

### Vertical Scaling
- Supabase: PostgreSQL connection pooling
- Neo4j: memory configuration tuning
- Vector indexing: HNSW for performance

## Monitoring & Observability

### Metrics
- Memory operations (add/search/delete) per second
- Average response time
- Vector store query latency
- Graph query complexity
- Token usage (OpenAI API)

### Logging
- Structured logging (JSON)
- Request/response tracking
- Error aggregation
- Performance profiling

## Migration Path to Phase 2 (Ollama)

### Changes Required
1. Update configuration to use the Ollama provider
2. Deploy an Ollama container on the localai network
3. Pull required models (llama3.1, nomic-embed-text)
4. Update embedding dimensions if needed
5. Test and validate performance

### No Changes Required
- Storage layer (Supabase + Neo4j)
- API interfaces (REST + MCP)
- Docker networking
- Client integrations

## Technology Stack Summary

| Layer | Technology | Version | Purpose |
|-------|-----------|---------|---------|
| Core | mem0ai | latest | Memory management |
| Vector DB | Supabase (pgvector) | existing | Semantic search |
| Graph DB | Neo4j | 5.x | Relationships |
| LLM | OpenAI API | latest | Embeddings + reasoning |
| REST API | FastAPI | 0.115+ | HTTP interface |
| MCP Server | Python MCP SDK | latest | MCP protocol |
| Containerization | Docker Compose | latest | Orchestration |
| Documentation | Mintlify | latest | Docs site |

## Next Steps

1. Initialize Git repository
2. Set up Docker Compose configuration
3. Configure Supabase migrations (pgvector + tables)
4. Deploy Neo4j container
5. Implement REST API with FastAPI
6. Build MCP server
7. Create Mintlify documentation site
8. Testing and validation
9. Push to Git repository

---

**Last Updated**: 2025-10-13
**Status**: Architecture Design Complete
**Next Phase**: Implementation