t6_mem0_v2/ARCHITECTURE.md
Claude Code cfa7abd23d Initial commit: Project foundation and architecture
- Add project requirements document
- Add comprehensive architecture design
- Add README with quick start guide
- Add .gitignore for Python/Docker/Node

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-13 15:01:50 +02:00


# T6 Mem0 v2 - System Architecture
## Executive Summary
T6 Mem0 v2 is a memory system for LLM applications built on mem0.ai. It exposes an MCP server and a REST API, uses hybrid storage (Supabase with pgvector plus Neo4j), and generates embeddings with OpenAI.
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
├──────────────────┬──────────────────┬───────────────────────┤
│ Claude Code (MCP)│  N8N Workflows   │  External Apps        │
└──────────────────┴──────────────────┴───────────────────────┘
         │                  │                     │
         │                  │                     │
         ▼                  ▼                     ▼
┌─────────────────────────────────────────────────────────────┐
│                       Interface Layer                       │
├──────────────────────────────┬──────────────────────────────┤
│  MCP Server (Port 8765)      │  REST API (Port 8080)        │
│  - SSE Connections           │  - FastAPI                   │
│  - MCP Protocol              │  - OpenAPI Spec              │
│  - Tool Registration         │  - Auth Middleware           │
└──────────────────────────────┴──────────────────────────────┘
               │                              │
               └───────────────┬──────────────┘
┌─────────────────────────────────────────────────────────────┐
│                         Core Layer                          │
│                      Mem0 Core Library                      │
│  - Memory Management         - Embedding Generation         │
│  - Semantic Search           - Relationship Extraction      │
│  - Multi-Agent Support       - Deduplication                │
└─────────────────────────────────────────────────────────────┘
           ┌───────────────────┼───────────────────┐
           ▼                   ▼                   ▼
  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
  │  Vector Store   │ │  Graph Store    │ │  External LLM   │
  │  Supabase       │ │  Neo4j          │ │  OpenAI         │
  │  (pgvector)     │ │  (Cypher)       │ │  (Embeddings)   │
  │  172.21.0.12    │ │  172.21.0.x     │ │  API Cloud      │
  └─────────────────┘ └─────────────────┘ └─────────────────┘
```
## Design Decisions
### 1. MCP Server Approach: Custom Implementation ✅
**Decision**: Build custom MCP server using mem0 core library with Supabase + Neo4j
**Rationale**:
- Official OpenMemory MCP uses Qdrant (requirement is Supabase)
- Community implementations provide good templates but need customization
- Custom build ensures exact stack matching and full control
**Implementation**:
- Python-based MCP server using `mcp` library
- SSE (Server-Sent Events) for MCP protocol communication
- Shares mem0 configuration with REST API
### 2. Storage Architecture: Hybrid Multi-Store ✅
**Vector Storage** (Supabase + pgvector):
- Semantic search via cosine similarity
- 1536-dimensional embeddings (OpenAI text-embedding-3-small)
- PostgreSQL with pgvector extension
- Connection: `172.21.0.12:5432`
**Graph Storage** (Neo4j):
- Relationship modeling between memory nodes
- Entity extraction and connection
- Visual exploration via Neo4j Browser
- New container on localai network
**Key-Value Storage** (PostgreSQL JSONB):
- Metadata storage in Supabase
- Eliminates need for separate Redis
- Simplifies infrastructure
**Why This Works**:
- Mem0's hybrid architecture expects multiple stores
- Each store optimized for specific query patterns
- Supabase handles both vector and structured data
- Neo4j specializes in relationship queries
### 3. API Layer Design: Dual Interface Pattern ✅
**REST API**:
- FastAPI framework
- Port: 8080
- Authentication: Bearer token
- OpenAPI documentation
- CRUD operations on memories
**MCP Server**:
- Port: 8765
- MCP protocol (SSE transport)
- Tool-based interface
- Compatible with Claude, Cursor, etc.
**Shared Core**:
- Both use same mem0 configuration
- Single source of truth for storage
- Consistent behavior across interfaces
### 4. Docker Networking: LocalAI Network Integration ✅
**Network**: `localai` (172.21.0.0/16)
**Services**:
```yaml
Existing:
  - Supabase: 172.21.0.12:5432
  - N8N: (existing container)
New:
  - Neo4j: 172.21.0.x:7687 (Bolt) + :7474 (Browser)
  - REST API: 172.21.0.x:8080
  - MCP Server: 172.21.0.x:8765
```
**Benefits**:
- All services on same network
- Direct container-to-container communication
- No host networking complications
- Static IPs assigned in Docker Compose (`ipv4_address`)
### 5. Phase 1 vs Phase 2: Provider Abstraction ✅
**Phase 1** (OpenAI):
```python
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"
        }
    }
}
```
**Phase 2** (Ollama):
```python
config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.1:8b",
            "ollama_base_url": "http://172.21.0.1:11434"
        }
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text"
        }
    }
}
```
**Strategy**:
- Configuration-driven provider selection
- Environment variable overrides
- No code changes for provider swap
- Mem0 natively supports both providers
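
The configuration-driven selection described above could look roughly like the following sketch. The environment variable names `MEM0_PROVIDER` and `MEM0_LLM_MODEL` and the helper `select_provider_config` are hypothetical; only the two presets are taken from the Phase 1 and Phase 2 configs.

```python
import os

# Provider presets copied from the Phase 1 and Phase 2 configs above.
PROVIDER_PRESETS = {
    "openai": {
        "llm": {
            "provider": "openai",
            "config": {"model": "gpt-4o-mini", "temperature": 0.1},
        },
        "embedder": {
            "provider": "openai",
            "config": {"model": "text-embedding-3-small"},
        },
    },
    "ollama": {
        "llm": {
            "provider": "ollama",
            "config": {
                "model": "llama3.1:8b",
                "ollama_base_url": "http://172.21.0.1:11434",
            },
        },
        "embedder": {
            "provider": "ollama",
            "config": {"model": "nomic-embed-text"},
        },
    },
}


def select_provider_config(env=None):
    """Return the llm/embedder config for the provider named in the environment.

    MEM0_PROVIDER and MEM0_LLM_MODEL are hypothetical variable names.
    """
    env = os.environ if env is None else env
    provider = env.get("MEM0_PROVIDER", "openai").lower()
    if provider not in PROVIDER_PRESETS:
        raise ValueError(f"Unknown provider: {provider!r}")
    # Copy the preset so that overrides never mutate the shared defaults.
    config = {
        section: {"provider": body["provider"], "config": dict(body["config"])}
        for section, body in PROVIDER_PRESETS[provider].items()
    }
    # Optional environment override of the LLM model name.
    if "MEM0_LLM_MODEL" in env:
        config["llm"]["config"]["model"] = env["MEM0_LLM_MODEL"]
    return config
```

The returned dictionary would be merged with the `vector_store` and `graph_store` sections before being handed to mem0, so a provider swap touches only the environment.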
## Component Details
### Mem0 Configuration
```python
import os

from mem0 import Memory

config = {
    # Vector store: Supabase (PostgreSQL + pgvector)
    "vector_store": {
        "provider": "supabase",
        "config": {
            "connection_string": "postgresql://user:pass@172.21.0.12:5432/postgres",
            "collection_name": "t6_memories",
            "embedding_model_dims": 1536,
            "index_method": "hnsw",
            "index_measure": "cosine_distance"
        }
    },
    # Graph store: Neo4j over Bolt
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://172.21.0.x:7687",
            "username": "neo4j",
            # Python does not expand "${...}" inside strings, so read the
            # password from the environment explicitly.
            "password": os.environ["NEO4J_PASSWORD"]
        }
    },
    # LLM provider
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1,
            "max_tokens": 2000
        }
    },
    # Embedder (dimensions must match embedding_model_dims above)
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "embedding_dims": 1536
        }
    },
    # Version
    "version": "v1.1"
}

memory = Memory.from_config(config_dict=config)
```
### REST API Endpoints
```
POST   /v1/memories/            - Add new memory
GET    /v1/memories/{id}        - Get specific memory
GET    /v1/memories/search      - Search memories
PATCH  /v1/memories/{id}        - Update memory
DELETE /v1/memories/{id}        - Delete memory
GET    /v1/memories/user/{id}   - Get user memories
GET    /v1/health               - Health check
GET    /v1/stats                - System statistics
```
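
As an illustration of how a client might call the first endpoint, the sketch below builds a `POST /v1/memories/` request with the standard library. The base URL, the helper name `build_add_memory_request`, and the JSON field name `text` are assumptions; the request schema is not specified in this document.

```python
import json
import urllib.request

# Assumption: the REST API is reachable on the documented port 8080.
BASE_URL = "http://localhost:8080"


def build_add_memory_request(text, user_id, token, base_url=BASE_URL):
    """Build (but do not send) a POST /v1/memories/ request.

    The JSON field names "text" and "user_id" are assumed for illustration.
    """
    body = json.dumps({"text": text, "user_id": user_id}).encode("utf-8")
    request = urllib.request.Request(
        url=f"{base_url}/v1/memories/",
        data=body,
        method="POST",
    )
    # Bearer token authentication, as described for the REST API.
    request.add_header("Authorization", f"Bearer {token}")
    request.add_header("Content-Type", "application/json")
    return request
```

Passing the result to `urllib.request.urlopen(request)` would send it to a running API instance.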
### MCP Server Tools
```
add_memory          - Add new memory to system
search_memories     - Search memories by query
get_memory          - Retrieve specific memory
update_memory       - Update existing memory
delete_memory       - Remove memory
list_user_memories  - List all memories for user
get_memory_graph    - Visualize memory relationships
```
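
Conceptually, a tool-based interface maps tool names to handler functions. The sketch below shows that dispatch pattern in plain Python with an in-memory dictionary standing in for the real stores. It does not use the MCP SDK, covers only four of the tools, and the argument names are assumptions.

```python
import uuid

# In-memory stand-in for the real vector and graph stores.
_memories = {}


def add_memory(text, user_id):
    memory_id = str(uuid.uuid4())
    _memories[memory_id] = {"id": memory_id, "text": text, "user_id": user_id}
    return _memories[memory_id]


def get_memory(memory_id):
    return _memories.get(memory_id)


def delete_memory(memory_id):
    return {"deleted": _memories.pop(memory_id, None) is not None}


def list_user_memories(user_id):
    return [m for m in _memories.values() if m["user_id"] == user_id]


# Tool registry: tool name -> handler.
TOOLS = {
    "add_memory": add_memory,
    "get_memory": get_memory,
    "delete_memory": delete_memory,
    "list_user_memories": list_user_memories,
}


def call_tool(name, arguments):
    """Dispatch a tool call by name with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**arguments)
```

In the real server the handlers would delegate to the shared mem0 instance, so the REST API and the MCP tools operate on the same data.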
## Data Flow
### Adding a Memory
```
1. Client → MCP/REST API
   POST memory data with user_id
2. Interface Layer → Mem0 Core
   Validate and process request
3. Mem0 Core → OpenAI
   Generate embeddings (1536-dim vector)
4. Mem0 Core → Supabase
   Store vector + metadata in PostgreSQL
5. Mem0 Core → Neo4j
   Extract entities and relationships
   Create nodes and edges in graph
6. Response → Client
   Return memory_id and confirmation
```
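
The sequence above can be sketched with injected dependencies. This only illustrates the order of the steps; in practice mem0 performs them internally. The function `add_memory_flow` and its parameters are hypothetical stand-ins: `embed` returns a vector, `extract_relations` returns `(source, relation, target)` tuples, and the two stores are plain Python containers.

```python
import uuid


def add_memory_flow(text, user_id, embed, vector_store, graph_store, extract_relations):
    """Walk through steps 2-6 of the add flow using injected stand-ins."""
    # Step 2: validate the request.
    if not text or not user_id:
        raise ValueError("text and user_id are required")
    # Step 3: generate the embedding.
    vector = embed(text)
    # Step 4: store vector and metadata.
    memory_id = str(uuid.uuid4())
    vector_store[memory_id] = {
        "vector": vector,
        "metadata": {"user_id": user_id, "text": text},
    }
    # Step 5: record extracted relationships as graph edges.
    for triple in extract_relations(text):
        graph_store.append((memory_id, *triple))
    # Step 6: return the identifier and a confirmation.
    return {"memory_id": memory_id, "status": "created"}
```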
### Searching Memories
```
1. Client → MCP/REST API
   Search query + filters (user_id, etc.)
2. Interface Layer → Mem0 Core
   Process search request
3. Mem0 Core → OpenAI
   Generate query embedding
4. Mem0 Core → Supabase
   Vector similarity search (cosine)
   Retrieve top-k matches
5. Mem0 Core → Neo4j
   Fetch related graph context
   Enrich results with relationships
6. Response → Client
   Ranked results with relevance scores
```
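
Step 4 (cosine similarity search with a `user_id` filter and top-k cut-off) can be illustrated in pure Python. In the deployed system pgvector performs this inside PostgreSQL; the record layout and the function names here are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector, records, user_id, top_k=3):
    """Rank one user's records by cosine similarity to the query vector."""
    scored = [
        {"id": r["id"], "score": cosine_similarity(query_vector, r["vector"])}
        for r in records
        if r["user_id"] == user_id  # user isolation filter
    ]
    scored.sort(key=lambda item: item["score"], reverse=True)
    return scored[:top_k]
```

Filtering by `user_id` before ranking keeps one user's memories out of another user's results.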
## Performance Characteristics
Based on figures published by mem0.ai (not yet measured on this deployment):
- **Accuracy**: 26% higher than OpenAI's built-in memory
- **Latency**: 91% lower p95 latency than full-context approaches
- **Token Efficiency**: about 90% fewer tokens than full-context approaches, via selective memory retrieval
- **Storage**: Hybrid approach lets each query pattern use the store best suited to it
## Security Considerations
### Authentication
- Bearer token authentication for REST API
- MCP server uses client-specific SSE endpoints
- Tokens stored in environment variables
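
A minimal sketch of the bearer token check, assuming the expected token is read from a hypothetical `API_TOKEN` environment variable. The helper name `is_authorized` is also an assumption; in the REST API this logic would sit in the auth middleware.

```python
import hmac
import os


def is_authorized(authorization_header, expected_token=None):
    """Check the value of an `Authorization: Bearer <token>` header.

    API_TOKEN is a hypothetical environment variable name.
    """
    if expected_token is None:
        expected_token = os.environ.get("API_TOKEN", "")
    if not expected_token or not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking token contents via timing.
    return hmac.compare_digest(token.encode(), expected_token.encode())
```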
### Data Privacy
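- All memories are persisted locally (Supabase + Neo4j)
- No cloud sync or external storage; in Phase 1, memory text is still sent to the OpenAI API for embedding generation and LLM processing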
- User isolation via user_id filtering
### Network Security
- Services on private Docker network
- No public exposure (use reverse proxy if needed)
- Internal communication only
## Scalability Considerations
### Horizontal Scaling
- REST API: Multiple containers behind load balancer
- MCP Server: Dedicated instances per client group
- Mem0 Core: Stateless, scales with API containers
### Vertical Scaling
- Supabase: PostgreSQL connection pooling
- Neo4j: Memory configuration tuning
- Vector indexing: HNSW for performance
## Monitoring & Observability
### Metrics
- Memory operations (add/search/delete) per second
- Average response time
- Vector store query latency
- Graph query complexity
- Token usage (OpenAI API)
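
One lightweight way to collect operation counts and average response times is a context manager around each operation. The class `OperationMetrics` below is a hypothetical sketch, not part of mem0 or any metrics library.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class OperationMetrics:
    """Counts operations and accumulates their wall-clock duration."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.total_seconds = defaultdict(float)

    @contextmanager
    def track(self, operation):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the call even if the wrapped operation raised.
            self.counts[operation] += 1
            self.total_seconds[operation] += time.perf_counter() - start

    def average_seconds(self, operation):
        count = self.counts[operation]
        return self.total_seconds[operation] / count if count else 0.0
```

Wrapping a call as `with metrics.track("search"): ...` records both the count and the elapsed time for that operation.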
### Logging
- Structured logging (JSON)
- Request/response tracking
- Error aggregation
- Performance profiling
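
Structured JSON logging can be done with the standard `logging` module alone. The sketch below is one possible formatter; the `request_id` field is a hypothetical attribute that callers would pass via `extra=`.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # request_id is a hypothetical field supplied via `extra=`.
        if hasattr(record, "request_id"):
            payload["request_id"] = record.request_id
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level=logging.INFO):
    """Send all root-logger output through the JSON formatter."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)
```

After `configure_logging()`, a call such as `logging.getLogger("t6.api").info("memory added", extra={"request_id": "req-1"})` emits one JSON line per event, which supports request tracking and error aggregation.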
## Migration Path to Phase 2 (Ollama)
### Changes Required
1. Update configuration to use Ollama provider
2. Deploy Ollama container on localai network
3. Pull required models (llama3.1, nomic-embed-text)
4. Update embedding dimensions (nomic-embed-text produces 768-dimensional vectors, not 1536) and re-embed existing memories
5. Test and validate performance
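
Because the two embedding models named in this document have different output sizes, a small pre-migration check can make the impact explicit. The helper `plan_embedder_switch` is a hypothetical sketch.

```python
# Output dimensions of the two embedding models named in this document.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "nomic-embed-text": 768,
}


def plan_embedder_switch(current_model, new_model):
    """Report whether switching embedders requires a vector-store migration."""
    current_dims = EMBEDDING_DIMS[current_model]
    new_dims = EMBEDDING_DIMS[new_model]
    return {
        "embedding_model_dims": new_dims,
        "dims_changed": current_dims != new_dims,
        # Vectors from different models are not comparable, so any model
        # change requires re-embedding, even if the dimensions match.
        "reembed_required": current_model != new_model,
    }
```

For the Phase 1 to Phase 2 switch this reports a change from 1536 to 768 dimensions, which means the vector column and index need to be rebuilt and existing memories re-embedded.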
### No Changes Required
- Storage layer services (Supabase + Neo4j), apart from the embedding dimension update listed above
- API interfaces (REST + MCP)
- Docker networking
- Client integrations
## Technology Stack Summary
| Layer | Technology | Version | Purpose |
|-------|-----------|---------|---------|
| Core | mem0ai | latest | Memory management |
| Vector DB | Supabase (pgvector) | existing | Semantic search |
| Graph DB | Neo4j | 5.x | Relationships |
| LLM | OpenAI API | latest | Embeddings + reasoning |
| REST API | FastAPI | 0.115+ | HTTP interface |
| MCP Server | Python MCP SDK | latest | MCP protocol |
| Containerization | Docker Compose | latest | Orchestration |
| Documentation | Mintlify | latest | Docs site |
## Next Steps
1. Initialize Git repository
2. Set up Docker Compose configuration
3. Configure Supabase migrations (pgvector + tables)
4. Deploy Neo4j container
5. Implement REST API with FastAPI
6. Build MCP server
7. Create Mintlify documentation site
8. Testing and validation
9. Push to git repository
---
**Last Updated**: 2025-10-13
**Status**: Architecture Design Complete
**Next Phase**: Implementation