# T6 Mem0 v2 - System Architecture

## Executive Summary

Comprehensive memory system for LLM applications based on mem0.ai, featuring MCP server integration, REST API, hybrid storage (Supabase + Neo4j), and OpenAI embeddings.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
├──────────────────┬──────────────────┬───────────────────────┤
│ Claude Code (MCP)│  N8N Workflows   │     External Apps     │
└──────────────────┴──────────────────┴───────────────────────┘
         │                  │                     │
         │                  │                     │
         ▼                  ▼                     ▼
┌─────────────────────────────────────────────────────────────┐
│                       Interface Layer                       │
├──────────────────────────────┬──────────────────────────────┤
│  MCP Server (Port 8765)      │  REST API (Port 8080)        │
│  - SSE Connections           │  - FastAPI                   │
│  - MCP Protocol              │  - OpenAPI Spec              │
│  - Tool Registration         │  - Auth Middleware           │
└──────────────────────────────┴──────────────────────────────┘
               │                              │
               └───────────────┬──────────────┘
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                         Core Layer                          │
│                      Mem0 Core Library                      │
│  - Memory Management         - Embedding Generation         │
│  - Semantic Search           - Relationship Extraction      │
│  - Multi-Agent Support       - Deduplication                │
└─────────────────────────────────────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          ▼                    ▼                    ▼
 ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
 │  Vector Store   │  │  Graph Store    │  │  External LLM   │
 │  Supabase       │  │  Neo4j          │  │  OpenAI         │
 │  (pgvector)     │  │  (Cypher)       │  │  (Embeddings)   │
 │  172.21.0.12    │  │  172.21.0.x     │  │  API Cloud      │
 └─────────────────┘  └─────────────────┘  └─────────────────┘
```

## Design Decisions

### 1. MCP Server Approach: Custom Implementation ✅

**Decision**: Build custom MCP server using mem0 core library with Supabase + Neo4j

**Rationale**:
- Official OpenMemory MCP uses Qdrant (requirement is Supabase)
- Community implementations provide good templates but need customization
- Custom build ensures exact stack matching and full control

**Implementation**:
- Python-based MCP server using `mcp` library
- SSE (Server-Sent Events) for MCP protocol communication
- Shares mem0 configuration with REST API

### 2. Storage Architecture: Hybrid Multi-Store ✅

**Vector Storage** (Supabase + pgvector):
- Semantic search via cosine similarity
- 1536-dimensional embeddings (OpenAI text-embedding-3-small)
- PostgreSQL with pgvector extension
- Connection: `172.21.0.12:5432`

**Graph Storage** (Neo4j):
- Relationship modeling between memory nodes
- Entity extraction and connection
- Visual exploration via Neo4j Browser
- New container on localai network

**Key-Value Storage** (PostgreSQL JSONB):
- Metadata storage in Supabase
- Eliminates need for separate Redis
- Simplifies infrastructure

**Why This Works**:
- Mem0's hybrid architecture expects multiple stores
- Each store optimized for specific query patterns
- Supabase handles both vector and structured data
- Neo4j specializes in relationship queries

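To make the cosine-similarity bullet concrete, here is a minimal pure-Python sketch of the measure, together with the matching cosine distance (1 minus similarity) that pgvector's cosine operator orders by. The function names and the tiny test vectors are illustrative assumptions; in this design the vectors are 1536-dimensional and the comparison runs inside PostgreSQL, not in application code.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance form of the same measure: 0 means same direction."""
    return 1.0 - cosine_similarity(a, b)
```

Parallel vectors score a similarity of 1 (distance 0) regardless of length, which is why the measure suits embeddings whose magnitude carries little meaning.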
### 3. API Layer Design: Dual Interface Pattern ✅

**REST API**:
- FastAPI framework
- Port: 8080
- Authentication: Bearer token
- OpenAPI documentation
- CRUD operations on memories

**MCP Server**:
- Port: 8765
- MCP protocol (SSE transport)
- Tool-based interface
- Compatible with Claude, Cursor, etc.

**Shared Core**:
- Both use same mem0 configuration
- Single source of truth for storage
- Consistent behavior across interfaces

### 4. Docker Networking: LocalAI Network Integration ✅

**Network**: `localai` (172.21.0.0/16)

**Services**:
```yaml
Existing:
  - Supabase: 172.21.0.12:5432
  - N8N: (existing container)

New:
  - Neo4j: 172.21.0.x:7687 (Bolt) + :7474 (Browser)
  - REST API: 172.21.0.x:8080
  - MCP Server: 172.21.0.x:8765
```

**Benefits**:
- All services on same network
- Direct container-to-container communication
- No host networking complications
- Persistent IPs via Docker Compose

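When the `172.21.0.x` placeholders are eventually replaced with real addresses, a small check like the following could confirm that each one falls inside the `localai` subnet. This is only a sketch using Python's standard `ipaddress` module; the function name is an assumption, not part of the project.

```python
import ipaddress

# Subnet of the `localai` Docker network, as stated above.
LOCALAI_SUBNET = ipaddress.ip_network("172.21.0.0/16")


def in_localai_network(address: str) -> bool:
    """Return True if `address` is a valid IP inside the localai subnet."""
    return ipaddress.ip_address(address) in LOCALAI_SUBNET
```

Note that a literal placeholder such as `172.21.0.x` is not a valid address and raises `ValueError`, which makes unfinished configuration fail early.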
### 5. Phase 1 vs Phase 2: Provider Abstraction ✅

**Phase 1** (OpenAI):
```python
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"
        }
    }
}
```

**Phase 2** (Ollama):
```python
config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.1:8b",
            "ollama_base_url": "http://172.21.0.1:11434"
        }
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text"
        }
    }
}
```

**Strategy**:
- Configuration-driven provider selection
- Environment variable overrides
- No code changes for provider swap
- Mem0 natively supports both providers

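One way the configuration-driven strategy could look in practice is sketched below: a single function picks the Phase 1 or Phase 2 block based on an environment variable. The variable name `MEM0_PROVIDER` and the function `build_provider_config` are hypothetical; the two config blocks are copied from the examples above.

```python
import copy
import os
from typing import Mapping, Optional

# Provider blocks taken from the Phase 1 and Phase 2 examples.
PROVIDER_CONFIGS = {
    "openai": {
        "llm": {
            "provider": "openai",
            "config": {"model": "gpt-4o-mini", "temperature": 0.1},
        },
        "embedder": {
            "provider": "openai",
            "config": {"model": "text-embedding-3-small"},
        },
    },
    "ollama": {
        "llm": {
            "provider": "ollama",
            "config": {
                "model": "llama3.1:8b",
                "ollama_base_url": "http://172.21.0.1:11434",
            },
        },
        "embedder": {
            "provider": "ollama",
            "config": {"model": "nomic-embed-text"},
        },
    },
}


def build_provider_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Select the LLM/embedder config from MEM0_PROVIDER (hypothetical name)."""
    env = os.environ if env is None else env
    provider = env.get("MEM0_PROVIDER", "openai").lower()
    if provider not in PROVIDER_CONFIGS:
        raise ValueError(f"unsupported provider: {provider!r}")
    # Deep copy so callers can mutate the result without touching the template.
    return copy.deepcopy(PROVIDER_CONFIGS[provider])
```

With this shape, switching phases is a deployment change (`MEM0_PROVIDER=ollama`) and the calling code stays the same.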
## Component Details

### Mem0 Configuration

```python
import os

from mem0 import Memory

config = {
    # Vector Store
    "vector_store": {
        "provider": "supabase",
        "config": {
            "connection_string": "postgresql://user:pass@172.21.0.12:5432/postgres",
            "collection_name": "t6_memories",
            "embedding_model_dims": 1536,
            "index_method": "hnsw",
            "index_measure": "cosine_distance"
        }
    },

    # Graph Store
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://172.21.0.x:7687",
            "username": "neo4j",
            # Read the secret from the environment; a literal "${NEO4J_PASSWORD}"
            # string is not expanded inside a Python dict.
            "password": os.environ["NEO4J_PASSWORD"]
        }
    },

    # LLM Provider
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1,
            "max_tokens": 2000
        }
    },

    # Embedder
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "embedding_dims": 1536
        }
    },

    # Version
    "version": "v1.1"
}

memory = Memory.from_config(config_dict=config)
```

### REST API Endpoints

```
POST   /v1/memories/           - Add new memory
GET    /v1/memories/{id}       - Get specific memory
GET    /v1/memories/search     - Search memories
PATCH  /v1/memories/{id}       - Update memory
DELETE /v1/memories/{id}       - Delete memory
GET    /v1/memories/user/{id}  - Get user memories
GET    /v1/health              - Health check
GET    /v1/stats               - System statistics
```

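From a client's point of view, a call to the add-memory endpoint might be assembled as in the sketch below, using only the standard library. The base URL, the payload field names (`user_id`, `text`), and the function name are assumptions for illustration; the path and the Bearer scheme come from this document.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed host; port 8080 is from the design


def build_add_memory_request(token: str, user_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/memories/ request."""
    # Payload field names are hypothetical; the real schema is not defined yet.
    payload = json.dumps({"user_id": user_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/memories/",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Once the API is running, passing the returned object to `urllib.request.urlopen` would send it.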
### MCP Server Tools

```
add_memory           - Add new memory to system
search_memories      - Search memories by query
get_memory           - Retrieve specific memory
update_memory        - Update existing memory
delete_memory        - Remove memory
list_user_memories   - List all memories for user
get_memory_graph     - Visualize memory relationships
```

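On the wire, an MCP client invokes one of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message with the standard library; the helper name and the argument keys (`query`, `user_id`) are assumptions, since the tool schemas are not specified here.

```python
import json
from typing import Any, Dict


def make_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 request that calls an MCP tool by name."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Example: ask the server to search one user's memories (argument names assumed).
example = make_tool_call(1, "search_memories", {"query": "coffee", "user_id": "alice"})
```

In normal use the MCP SDK produces and parses these messages, so application code only registers the tool functions.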
## Data Flow

### Adding a Memory

```
1. Client → MCP/REST API
   POST memory data with user_id

2. Interface Layer → Mem0 Core
   Validate and process request

3. Mem0 Core → OpenAI
   Generate embeddings (1536-dim vector)

4. Mem0 Core → Supabase
   Store vector + metadata in PostgreSQL

5. Mem0 Core → Neo4j
   Extract entities and relationships
   Create nodes and edges in graph

6. Response → Client
   Return memory_id and confirmation
```

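The add flow can be traced end to end with in-memory stand-ins, as in this sketch. Everything here is a toy: `toy_embed` replaces the OpenAI embedding call with a hash, the dictionaries replace Supabase and Neo4j, and the capitalized-word rule replaces LLM-based entity extraction. None of the names come from mem0.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


def toy_embed(text: str, dims: int = 8) -> List[float]:
    """Deterministic stand-in for the embedding call (step 3)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dims]]


@dataclass
class InMemoryStores:
    vectors: Dict[str, dict] = field(default_factory=dict)         # stands in for Supabase
    edges: Set[Tuple[str, str, str]] = field(default_factory=set)  # stands in for Neo4j


def add_memory(stores: InMemoryStores, user_id: str, text: str) -> str:
    # Step 2: validate the request.
    if not user_id or not text.strip():
        raise ValueError("user_id and non-empty text are required")
    memory_id = str(uuid.uuid4())
    # Step 3: generate the embedding.
    embedding = toy_embed(text)
    # Step 4: store vector + metadata.
    stores.vectors[memory_id] = {"user_id": user_id, "text": text, "embedding": embedding}
    # Step 5: naive entity extraction (capitalized words), then graph edges.
    entities = {word.strip(".,") for word in text.split() if word[:1].isupper()}
    for entity in entities:
        stores.edges.add((user_id, "MENTIONS", entity))
    # Step 6: return the new id to the caller.
    return memory_id
```

The useful property to notice is the ordering: validation happens before any external call, and the graph write comes after the vector write.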
### Searching Memories

```
1. Client → MCP/REST API
   Search query + filters (user_id, etc.)

2. Interface Layer → Mem0 Core
   Process search request

3. Mem0 Core → OpenAI
   Generate query embedding

4. Mem0 Core → Supabase
   Vector similarity search (cosine)
   Retrieve top-k matches

5. Mem0 Core → Neo4j
   Fetch related graph context
   Enrich results with relationships

6. Response → Client
   Ranked results with relevance scores
```

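Step 4 of the search flow, filtered by user and ranked by cosine similarity, can be sketched in plain Python as follows. The record layout and function names are assumptions for illustration; in the real system this ranking would be done by pgvector inside Supabase.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records: List[dict], query_vec: Sequence[float],
           user_id: str, top_k: int = 3) -> List[Tuple[float, str]]:
    """Return up to top_k (score, text) pairs for one user, best match first."""
    scored = [
        (cosine(query_vec, record["embedding"]), record["text"])
        for record in records
        if record["user_id"] == user_id  # user isolation via user_id filtering
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Applying the `user_id` filter before ranking means another user's memories never enter the candidate set, even if they would have scored higher.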
## Performance Characteristics

Figures reported by mem0.ai's research, not measured on this deployment:

- **Accuracy**: 26% improvement over the OpenAI baseline
- **Latency**: 91% lower p95 than full-context approaches
- **Token Efficiency**: 90% reduction via selective memory retrieval
- **Storage**: Hybrid approach lets each query pattern use the store suited to it

## Security Considerations

### Authentication
- Bearer token authentication for REST API
- MCP server uses client-specific SSE endpoints
- Tokens stored in environment variables

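A Bearer-token check of the kind the REST API's auth middleware would need can be sketched with the standard library alone. The function name is hypothetical, and how the expected token is loaded (for example from an environment variable) is left to the caller.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Validate an `Authorization: Bearer <token>` header value."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

Rejecting an empty `expected_token` guards against a missing environment variable silently accepting every request.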
### Data Privacy
- All memory data stored locally (Supabase + Neo4j)
- No cloud sync or external storage; in Phase 1, memory text is still sent to the OpenAI API for embeddings and LLM processing
- User isolation via user_id filtering

### Network Security
- Services on private Docker network
- No public exposure (use reverse proxy if needed)
- Internal communication only

## Scalability Considerations

### Horizontal Scaling
- REST API: Multiple containers behind load balancer
- MCP Server: Dedicated instances per client group
- Mem0 Core: Stateless, scales with API containers

### Vertical Scaling
- Supabase: PostgreSQL connection pooling
- Neo4j: Memory configuration tuning
- Vector indexing: HNSW for performance

## Monitoring & Observability

### Metrics
- Memory operations (add/search/delete) per second
- Average response time
- Vector store query latency
- Graph query complexity
- Token usage (OpenAI API)

### Logging
- Structured logging (JSON)
- Request/response tracking
- Error aggregation
- Performance profiling

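Structured JSON logging needs nothing beyond Python's `logging` module. The sketch below is one possible shape; the logger name `t6_mem0` and the extra fields (`operation`, `user_id`, `duration_ms`) are assumptions chosen to match the metrics listed above.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for key in ("operation", "user_id", "duration_ms"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def make_logger(name: str = "t6_mem0") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `make_logger().info("search done", extra={"operation": "search", "duration_ms": 12})` then emits one JSON line carrying those fields.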
## Migration Path to Phase 2 (Ollama)

### Changes Required
1. Update configuration to use Ollama provider
2. Deploy Ollama container on localai network
3. Pull required models (llama3.1, nomic-embed-text)
4. Update embedding dimensions: nomic-embed-text produces 768-dimensional vectors, versus 1536 for text-embedding-3-small, so existing memories must be re-embedded
5. Test and validate performance

### No Changes Required
- Storage layer (Supabase + Neo4j), apart from the vector collection if the embedding dimension changes
- API interfaces (REST + MCP)
- Docker networking
- Client integrations

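Because the embedder and the vector store each declare a dimension, a mismatch after switching providers is easy to miss. The following sketch checks the two against each other at startup; the function name and lookup table are assumptions, and the 768 figure for nomic-embed-text is that model's published output size.

```python
# Output sizes for the two embedding models named in this document.
KNOWN_EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "nomic-embed-text": 768,
}


def check_embedding_dims(config: dict) -> None:
    """Raise ValueError if the embedder and vector store disagree on size."""
    model = config["embedder"]["config"]["model"]
    store_dims = config["vector_store"]["config"]["embedding_model_dims"]
    expected = KNOWN_EMBEDDING_DIMS.get(model)
    if expected is None:
        raise ValueError(f"unknown embedding model: {model!r}")
    if expected != store_dims:
        raise ValueError(
            f"{model} produces {expected}-dim vectors, "
            f"but the vector store expects {store_dims}"
        )
```

Running this before `Memory.from_config` would turn a silent insert failure into an explicit configuration error.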
## Technology Stack Summary

| Layer | Technology | Version | Purpose |
|-------|-----------|---------|---------|
| Core | mem0ai | latest | Memory management |
| Vector DB | Supabase (pgvector) | existing | Semantic search |
| Graph DB | Neo4j | 5.x | Relationships |
| LLM | OpenAI API | latest | Embeddings + reasoning |
| REST API | FastAPI | 0.115+ | HTTP interface |
| MCP Server | Python MCP SDK | latest | MCP protocol |
| Containerization | Docker Compose | latest | Orchestration |
| Documentation | Mintlify | latest | Docs site |

## Next Steps

1. Initialize Git repository
2. Set up Docker Compose configuration
3. Configure Supabase migrations (pgvector + tables)
4. Deploy Neo4j container
5. Implement REST API with FastAPI
6. Build MCP server
7. Create Mintlify documentation site
8. Testing and validation
9. Push to git repository

---

**Last Updated**: 2025-10-13
**Status**: Architecture Design Complete
**Next Phase**: Implementation