T6 Mem0 v2 - System Architecture
Executive Summary
A comprehensive memory system for LLM applications built on mem0.ai, featuring MCP server integration, a REST API, hybrid storage (Supabase + Neo4j), and OpenAI embeddings.
Architecture Overview
```
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
├──────────────────┬────────────────────┬─────────────────────┤
│ Claude Code (MCP)│   N8N Workflows    │    External Apps    │
└──────────────────┴────────────────────┴─────────────────────┘
         │                   │                     │
         ▼                   ▼                     ▼
┌─────────────────────────────────────────────────────────────┐
│                       Interface Layer                       │
├──────────────────────────────┬──────────────────────────────┤
│  MCP Server (Port 8765)      │  REST API (Port 8080)        │
│  - SSE Connections           │  - FastAPI                   │
│  - MCP Protocol              │  - OpenAPI Spec              │
│  - Tool Registration         │  - Auth Middleware           │
└──────────────────────────────┴──────────────────────────────┘
               │                               │
               └───────────────┬───────────────┘
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                         Core Layer                          │
│                      Mem0 Core Library                      │
│   - Memory Management          - Embedding Generation       │
│   - Semantic Search            - Relationship Extraction    │
│   - Multi-Agent Support        - Deduplication              │
└─────────────────────────────────────────────────────────────┘
                               │
           ┌───────────────────┼───────────────────┐
           ▼                   ▼                   ▼
  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
  │  Vector Store   │ │   Graph Store   │ │  External LLM   │
  │    Supabase     │ │      Neo4j      │ │     OpenAI      │
  │   (pgvector)    │ │    (Cypher)     │ │  (Embeddings)   │
  │  172.21.0.12    │ │   172.21.0.x    │ │   API Cloud     │
  └─────────────────┘ └─────────────────┘ └─────────────────┘
```
Design Decisions
1. MCP Server Approach: Custom Implementation ✅
Decision: Build a custom MCP server on the mem0 core library, backed by Supabase + Neo4j.
Rationale:
- The official OpenMemory MCP server uses Qdrant; this project requires Supabase
- Community implementations provide good templates but still need customization
- A custom build guarantees an exact match with the existing stack and full control
Implementation:
- Python-based MCP server using the `mcp` library
- SSE (Server-Sent Events) for MCP protocol communication
- Shares the mem0 configuration with the REST API (see the sketch below)
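For orientation, a minimal sketch of this approach, assuming the official `mcp` Python SDK's FastMCP helper; `get_mem0_config()` is a hypothetical shared-config helper, and the tool set shown is illustrative rather than final:

```python
# Minimal MCP server sketch (assumptions: `mcp` Python SDK with FastMCP,
# and a shared get_mem0_config() helper used by both MCP and REST layers).
from mcp.server.fastmcp import FastMCP
from mem0 import Memory

from config import get_mem0_config  # hypothetical shared config module

mcp = FastMCP("t6-mem0", port=8765)
memory = Memory.from_config(config_dict=get_mem0_config())

@mcp.tool()
def add_memory(text: str, user_id: str) -> dict:
    """Add a new memory for the given user."""
    return memory.add(text, user_id=user_id)

@mcp.tool()
def search_memories(query: str, user_id: str, limit: int = 5) -> dict:
    """Semantic search over the user's memories."""
    return memory.search(query, user_id=user_id, limit=limit)

if __name__ == "__main__":
    # SSE transport, as expected by the MCP clients listed above
    mcp.run(transport="sse")
```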
2. Storage Architecture: Hybrid Multi-Store ✅
Vector Storage (Supabase + pgvector):
- Semantic search via cosine similarity
- 1536-dimensional embeddings (OpenAI text-embedding-3-small)
- PostgreSQL with pgvector extension
- Connection: `172.21.0.12:5432` (a one-time pgvector setup sketch appears at the end of this section)
Graph Storage (Neo4j):
- Relationship modeling between memory nodes
- Entity extraction and connection
- Visual exploration via Neo4j Browser
- New container on localai network
Key-Value Storage (PostgreSQL JSONB):
- Metadata storage in Supabase
- Eliminates need for separate Redis
- Simplifies infrastructure
Why This Works:
- Mem0's hybrid architecture expects multiple stores
- Each store optimized for specific query patterns
- Supabase handles both vector and structured data
- Neo4j specializes in relationship queries
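One practical note on the vector store: the pgvector extension has to exist before mem0 first connects (mem0's Supabase provider is expected to create the collection itself). A one-time setup sketch, assuming psycopg2 and the placeholder connection string used elsewhere in this document:

```python
# One-time setup sketch: enable pgvector in the existing Supabase Postgres.
# Assumption: mem0's Supabase provider creates the t6_memories collection,
# so only the extension needs to exist beforehand.
import psycopg2

conn = psycopg2.connect("postgresql://user:pass@172.21.0.12:5432/postgres")
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
conn.close()
```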
3. API Layer Design: Dual Interface Pattern ✅
REST API:
- FastAPI framework
- Port: 8080
- Authentication: Bearer token
- OpenAPI documentation
- CRUD operations on memories
MCP Server:
- Port: 8765
- MCP protocol (SSE transport)
- Tool-based interface
- Compatible with Claude, Cursor, etc.
Shared Core:
- Both use same mem0 configuration
- Single source of truth for storage
- Consistent behavior across both interfaces (see the sketch below)
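As a sketch of how the REST side can wrap the same mem0 instance (endpoint shapes match the list later in this document; the bearer-token check, `API_TOKEN` variable, and `get_mem0_config()` helper are illustrative assumptions):

```python
# REST API sketch: FastAPI wrapping the shared mem0 instance.
# Assumptions: API_TOKEN env var for bearer auth; get_mem0_config() is the
# same hypothetical shared-config helper used in the MCP server sketch.
import os

from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from mem0 import Memory
from pydantic import BaseModel

from config import get_mem0_config

app = FastAPI(title="T6 Mem0 REST API")
memory = Memory.from_config(config_dict=get_mem0_config())
bearer = HTTPBearer()

def check_token(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> None:
    # Reject requests whose bearer token does not match the configured one
    if creds.credentials != os.environ["API_TOKEN"]:
        raise HTTPException(status_code=401, detail="Invalid token")

class MemoryIn(BaseModel):
    text: str
    user_id: str

@app.post("/v1/memories/", dependencies=[Depends(check_token)])
def add_memory(body: MemoryIn) -> dict:
    return memory.add(body.text, user_id=body.user_id)

@app.get("/v1/memories/search", dependencies=[Depends(check_token)])
def search_memories(query: str, user_id: str, limit: int = 5) -> dict:
    return memory.search(query, user_id=user_id, limit=limit)
```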
4. Docker Networking: LocalAI Network Integration ✅
Network: localai (172.21.0.0/16)
Services:
- Existing:
  - Supabase: 172.21.0.12:5432
  - N8N: (existing container)
- New:
  - Neo4j: 172.21.0.x:7687 (Bolt) + :7474 (Browser)
  - REST API: 172.21.0.x:8080
  - MCP Server: 172.21.0.x:8765
Benefits:
- All services on same network
- Direct container-to-container communication
- No host networking complications
- Persistent IPs via Docker Compose
5. Phase 1 vs Phase 2: Provider Abstraction ✅
Phase 1 (OpenAI):
```python
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"
        }
    }
}
```
Phase 2 (Ollama):
```python
config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.1:8b",
            "ollama_base_url": "http://172.21.0.1:11434"
        }
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text"
        }
    }
}
```
Strategy:
- Configuration-driven provider selection
- Environment variable overrides
- No code changes for provider swap
- Mem0 natively supports both providers (see the selection sketch below)
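A sketch of that configuration-driven selection; the `MEM0_PROVIDER` and `OLLAMA_BASE_URL` variable names are illustrative, not fixed:

```python
# Provider selection sketch: switch between Phase 1 (OpenAI) and Phase 2
# (Ollama) via environment variables, with no code changes elsewhere.
import os

def get_llm_and_embedder() -> dict:
    provider = os.environ.get("MEM0_PROVIDER", "openai")
    if provider == "ollama":
        base_url = os.environ.get("OLLAMA_BASE_URL", "http://172.21.0.1:11434")
        return {
            "llm": {
                "provider": "ollama",
                "config": {"model": "llama3.1:8b", "ollama_base_url": base_url},
            },
            "embedder": {
                "provider": "ollama",
                "config": {"model": "nomic-embed-text"},
            },
        }
    return {
        "llm": {
            "provider": "openai",
            "config": {"model": "gpt-4o-mini", "temperature": 0.1},
        },
        "embedder": {
            "provider": "openai",
            "config": {"model": "text-embedding-3-small"},
        },
    }
```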
Component Details
Mem0 Configuration
```python
from mem0 import Memory

config = {
    # Vector Store
    "vector_store": {
        "provider": "supabase",
        "config": {
            "connection_string": "postgresql://user:pass@172.21.0.12:5432/postgres",
            "collection_name": "t6_memories",
            "embedding_model_dims": 1536,
            "index_method": "hnsw",
            "index_measure": "cosine_distance"
        }
    },
    # Graph Store
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://172.21.0.x:7687",
            "username": "neo4j",
            "password": "${NEO4J_PASSWORD}"
        }
    },
    # LLM Provider
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1,
            "max_tokens": 2000
        }
    },
    # Embedder
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "embedding_dims": 1536
        }
    },
    # Version
    "version": "v1.1"
}

memory = Memory.from_config(config_dict=config)
```
REST API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | `/v1/memories/` | Add new memory |
| GET | `/v1/memories/{id}` | Get specific memory |
| GET | `/v1/memories/search` | Search memories |
| PATCH | `/v1/memories/{id}` | Update memory |
| DELETE | `/v1/memories/{id}` | Delete memory |
| GET | `/v1/memories/user/{id}` | Get user memories |
| GET | `/v1/health` | Health check |
| GET | `/v1/stats` | System statistics |
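For illustration, a client call against these endpoints might look like this (host, port, and token are placeholders):

```python
# Example REST client call sketch (host, port, and token are placeholders).
import requests

BASE = "http://172.21.0.x:8080"  # replace with the actual REST API address
HEADERS = {"Authorization": "Bearer <API_TOKEN>"}

resp = requests.post(
    f"{BASE}/v1/memories/",
    json={"text": "User prefers dark mode", "user_id": "user-123"},
    headers=HEADERS,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```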
MCP Server Tools
- `add_memory` - Add new memory to system
- `search_memories` - Search memories by query
- `get_memory` - Retrieve specific memory
- `update_memory` - Update existing memory
- `delete_memory` - Remove memory
- `list_user_memories` - List all memories for user
- `get_memory_graph` - Visualize memory relationships
Data Flow
Adding a Memory
1. Client → MCP/REST API: POST memory data with `user_id`
2. Interface Layer → Mem0 Core: validate and process the request
3. Mem0 Core → OpenAI: generate embeddings (1536-dim vector)
4. Mem0 Core → Supabase: store vector + metadata in PostgreSQL
5. Mem0 Core → Neo4j: extract entities and relationships, create nodes and edges in the graph
6. Response → Client: return `memory_id` and confirmation
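In code, the entire flow above hangs off a single mem0 call; a sketch using the `memory` instance configured earlier (the returned structure in the comment is indicative, not exact):

```python
# Adding a memory: one call drives embedding, vector storage, and graph
# extraction as described in steps 3-5 above.
result = memory.add(
    "Alice prefers Neo4j Browser for exploring relationships",
    user_id="alice",
    metadata={"source": "chat"},
)
# Indicative shape: {"results": [{"id": "...", "memory": "...", "event": "ADD"}], ...}
print(result)
```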
Searching Memories
1. Client → MCP/REST API: search query + filters (`user_id`, etc.)
2. Interface Layer → Mem0 Core: process the search request
3. Mem0 Core → OpenAI: generate the query embedding
4. Mem0 Core → Supabase: vector similarity search (cosine), retrieve top-k matches
5. Mem0 Core → Neo4j: fetch related graph context, enrich results with relationships
6. Response → Client: ranked results with relevance scores
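The corresponding search call, again a sketch against the shared `memory` instance (result fields shown are indicative):

```python
# Searching memories: embedding the query, vector search in Supabase, and
# graph enrichment from Neo4j happen behind this single call.
hits = memory.search(
    "What database does Alice like for graphs?",
    user_id="alice",
    limit=5,
)
for hit in hits.get("results", []):
    # Indicative fields: relevance score and the stored memory text
    print(hit.get("score"), hit.get("memory"))
```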
Performance Characteristics
Based on mem0.ai research findings:
- Accuracy: 26% improvement over baseline OpenAI
- Latency: 91% lower p95 than full-context approaches
- Token Efficiency: 90% reduction via selective memory retrieval
- Storage: Hybrid approach optimal for different query patterns
Security Considerations
Authentication
- Bearer token authentication for REST API
- MCP server uses client-specific SSE endpoints
- Tokens stored in environment variables
Data Privacy
- All data stored locally (Supabase + Neo4j)
- No cloud sync or external storage
- User isolation via user_id filtering
Network Security
- Services on private Docker network
- No public exposure (use reverse proxy if needed)
- Internal communication only
Scalability Considerations
Horizontal Scaling
- REST API: Multiple containers behind load balancer
- MCP Server: Dedicated instances per client group
- Mem0 Core: Stateless, scales with API containers
Vertical Scaling
- Supabase: PostgreSQL connection pooling
- Neo4j: Memory configuration tuning
- Vector indexing: HNSW for performance
Monitoring & Observability
Metrics
- Memory operations (add/search/delete) per second
- Average response time
- Vector store query latency
- Graph query complexity
- Token usage (OpenAI API)
Logging
- Structured logging (JSON)
- Request/response tracking
- Error aggregation
- Performance profiling
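A minimal, stdlib-only sketch of the structured JSON logging approach (logger name and fields are illustrative):

```python
# Structured JSON logging sketch using only the standard library.
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "time": self.formatTime(record),
        }
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("t6_mem0")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("memory added: user_id=alice")
```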
Migration Path to Phase 2 (Ollama)
Changes Required
- Update configuration to use Ollama provider
- Deploy Ollama container on localai network
- Pull required models (llama3.1, nomic-embed-text)
- Update embedding dimensions if needed
- Test and validate performance
No Changes Required
- Storage layer (Supabase + Neo4j)
- API interfaces (REST + MCP)
- Docker networking
- Client integrations
Technology Stack Summary
| Layer | Technology | Version | Purpose |
|---|---|---|---|
| Core | mem0ai | latest | Memory management |
| Vector DB | Supabase (pgvector) | existing | Semantic search |
| Graph DB | Neo4j | 5.x | Relationships |
| LLM | OpenAI API | latest | Embeddings + reasoning |
| REST API | FastAPI | 0.115+ | HTTP interface |
| MCP Server | Python MCP SDK | latest | MCP protocol |
| Containerization | Docker Compose | latest | Orchestration |
| Documentation | Mintlify | latest | Docs site |
Next Steps
- Initialize Git repository
- Set up Docker Compose configuration
- Configure Supabase migrations (pgvector + tables)
- Deploy Neo4j container
- Implement REST API with FastAPI
- Build MCP server
- Create Mintlify documentation site
- Testing and validation
- Push to git repository
Last Updated: 2025-10-13
Status: Architecture Design Complete
Next Phase: Implementation