Implementation Summary:
- REST API with FastAPI (complete CRUD operations)
- MCP Server with Python MCP SDK (7 tools)
- Supabase migrations (pgvector setup)
- Docker Compose orchestration
- Mintlify documentation site
- Environment configuration
- Shared config module
REST API Features:
- POST /v1/memories/ - Add memory
- GET /v1/memories/search - Semantic search
- GET /v1/memories/{id} - Get memory
- GET /v1/memories/user/{user_id} - User memories
- PATCH /v1/memories/{id} - Update memory
- DELETE /v1/memories/{id} - Delete memory
- GET /v1/health - Health check
- GET /v1/stats - Statistics
- Bearer token authentication
- OpenAPI documentation
MCP Server Tools:
- add_memory - Add from messages
- search_memories - Semantic search
- get_memory - Retrieve by ID
- get_all_memories - List all
- update_memory - Update content
- delete_memory - Delete by ID
- delete_all_memories - Bulk delete
Infrastructure:
- Neo4j 5.26 with APOC/GDS
- Supabase pgvector integration
- Docker network: localai
- Health checks and monitoring
- Structured logging
Documentation:
- Introduction page
- Quickstart guide
- Architecture deep dive
- Mintlify configuration
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

---
title: 'System Architecture'
description: 'Technical architecture and design decisions for T6 Mem0 v2'
---

## Architecture Overview

T6 Mem0 v2 implements a **hybrid storage architecture** combining vector search, graph relationships, and structured data storage for optimal memory management.

```
┌─────────────────────────────────────────────────────────────┐
│                        Client Layer                         │
├──────────────────┬──────────────────┬───────────────────────┤
│ Claude Code (MCP)│  N8N Workflows   │    External Apps      │
└──────────────────┴──────────────────┴───────────────────────┘
         │                  │                     │
         ▼                  ▼                     ▼
┌─────────────────────────────────────────────────────────────┐
│                      Interface Layer                        │
├──────────────────────────────┬──────────────────────────────┤
│   MCP Server (Port 8765)     │     REST API (Port 8080)     │
│   - SSE Connections          │     - FastAPI                │
│   - MCP Protocol             │     - OpenAPI Spec           │
│   - Tool Registration        │     - Auth Middleware        │
└──────────────────────────────┴──────────────────────────────┘
               │                              │
               └──────────────┬───────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                         Core Layer                          │
│                      Mem0 Core Library                      │
│   - Memory Management      - Embedding Generation           │
│   - Semantic Search        - Relationship Extraction        │
│   - Multi-Agent Support    - Deduplication                  │
└─────────────────────────────────────────────────────────────┘
                              │
           ┌──────────────────┼──────────────────┐
           ▼                  ▼                  ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│  Vector Store   │ │   Graph Store   │ │  External LLM   │
│    Supabase     │ │      Neo4j      │ │     OpenAI      │
│   (pgvector)    │ │    (Cypher)     │ │  (Embeddings)   │
│  172.21.0.12    │ │   172.21.0.x    │ │   API Cloud     │
└─────────────────┘ └─────────────────┘ └─────────────────┘
```

## Design Decisions

### 1. Hybrid Storage Architecture ✅

**Why Multiple Storage Systems?**

Each store is optimized for specific query patterns (a configuration sketch combining all three follows the accordions):

<AccordionGroup>
  <Accordion title="Vector Store (Supabase + pgvector)">
    **Purpose**: Semantic similarity search

    - Stores 1536-dimensional OpenAI embeddings
    - HNSW indexing for fast approximate nearest neighbor search
    - O(log n) query performance
    - Cosine distance for similarity measurement
  </Accordion>

  <Accordion title="Graph Store (Neo4j)">
    **Purpose**: Relationship modeling

    - Entity extraction and connection mapping
    - Relationship traversal and pathfinding
    - Visual exploration in Neo4j Browser
    - Dynamic knowledge graph evolution
  </Accordion>

  <Accordion title="Key-Value Store (PostgreSQL JSONB)">
    **Purpose**: Flexible metadata

    - Schema-less metadata storage
    - Fast JSON queries with GIN indexes
    - Eliminates the need for a separate Redis
    - Simplifies infrastructure
  </Accordion>
</AccordionGroup>
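
To make the hybrid design concrete, here is a minimal sketch of wiring all three stores through a single mem0 config. The connection values are placeholders, and the exact provider option names should be checked against the mem0ai documentation:

```python
from mem0 import Memory

# Sketch: one config wiring vector, graph, and LLM providers (placeholder values)
config = {
    "vector_store": {
        "provider": "supabase",
        "config": {
            "connection_string": "postgresql://postgres:...@172.21.0.12:5432/postgres",
            "collection_name": "memories",
        },
    },
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://neo4j:7687",  # Docker DNS name on the localai network
            "username": "neo4j",
            "password": "...",
        },
    },
    "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini"}},
}

memory = Memory.from_config(config)
```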

### 2. MCP Server Implementation

**Custom vs. Pre-built**

<Info>
We built a custom MCP server instead of using OpenMemory MCP because:

- OpenMemory is built around Qdrant, while our stack requires Supabase
- A custom server gives full control over the Supabase + Neo4j integration
- The result matches our storage stack exactly
</Info>
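
As illustration, a hedged sketch of registering one of the seven tools with the official Python MCP SDK's `FastMCP` helper; the server name and handler body are illustrative, not the project's actual code:

```python
from mcp.server.fastmcp import FastMCP
from mem0 import Memory

mcp = FastMCP("t6-mem0")  # illustrative server name
memory = Memory()         # stands in for the hybrid config shown earlier

@mcp.tool()
def search_memories(query: str, user_id: str, limit: int = 10):
    """Semantic search over stored memories (illustrative body)."""
    return memory.search(query, user_id=user_id, limit=limit)

if __name__ == "__main__":
    mcp.run(transport="sse")  # SSE transport, matching the port-8765 endpoint
```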

### 3. Docker Networking Strategy

**localai Network Integration**

All services run on the `localai` Docker network (172.21.0.0/16):

```yaml
services:
  neo4j:      { networks: [localai] }  # 172.21.0.x:7687
  api:        { networks: [localai] }  # 172.21.0.x:8080
  mcp-server: { networks: [localai] }  # 172.21.0.x:8765
  # supabase: 172.21.0.12:5432 (existing container on the same network)

networks:
  localai:
    external: true
```

**Benefits:**

- Container-to-container communication
- Service discovery via Docker DNS
- No host networking complications
- Persistent IPs via Docker Compose

## Data Flow

### Adding a Memory

<Steps>
  <Step title="Client Request">
    Client sends conversation messages via MCP or REST API
  </Step>
  <Step title="Mem0 Processing">
    - LLM extracts key facts from messages
    - Generates embedding vector (1536-dim)
    - Identifies entities and relationships
  </Step>
  <Step title="Vector Storage">
    Stores embedding + metadata in Supabase (pgvector)
  </Step>
  <Step title="Graph Storage">
    Creates nodes and relationships in Neo4j
  </Step>
  <Step title="Response">
    Returns memory ID and confirmation to client
  </Step>
</Steps>
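
In client code, the whole pipeline is a single call. A short sketch against the mem0 Python API, reusing the `memory` instance from the configuration sketch above:

```python
messages = [
    {"role": "user", "content": "I'm vegetarian and allergic to nuts."},
    {"role": "assistant", "content": "Noted: no meat, no nuts."},
]

# Fact extraction, embedding, vector write, and graph write all happen in add()
result = memory.add(messages, user_id="alice")
print(result)  # contains the ID(s) of the stored memory
```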

### Searching Memories

<Steps>
  <Step title="Query Embedding">
    Convert search query to vector using OpenAI
  </Step>
  <Step title="Vector Search">
    Find similar memories in Supabase (cosine similarity)
  </Step>
  <Step title="Graph Enrichment">
    Fetch related context from Neo4j graph
  </Step>
  <Step title="Ranked Results">
    Return memories sorted by relevance score
  </Step>
</Steps>
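
Again a single call from the client's point of view; the result shape shown here is indicative and may differ between mem0 versions:

```python
# Covers query embedding, cosine-similarity search, and graph enrichment
hits = memory.search("What dietary restrictions do I have?", user_id="alice")

for hit in hits["results"]:
    print(hit["score"], hit["memory"])  # relevance score and memory text
```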

## Performance Characteristics

Based on mem0.ai research:

<CardGroup cols={3}>
  <Card title="26% Accuracy Boost" icon="chart-line">
    Higher accuracy vs baseline OpenAI
  </Card>
  <Card title="91% Lower Latency" icon="bolt">
    Compared to full-context approaches
  </Card>
  <Card title="90% Token Savings" icon="dollar-sign">
    Through selective memory retrieval
  </Card>
</CardGroup>

## Security Architecture

### Authentication

- **REST API**: Bearer token authentication (a sketch follows this list)
- **MCP Server**: Client-specific SSE endpoints
- **Tokens**: Stored securely in environment variables
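
A minimal sketch of what the bearer check could look like in FastAPI; the environment variable name and endpoint wiring are assumptions, not the project's actual middleware:

```python
import os

from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
bearer = HTTPBearer()

def verify_token(creds: HTTPAuthorizationCredentials = Depends(bearer)) -> None:
    # MEM0_API_TOKEN is a hypothetical env var holding the shared secret
    if creds.credentials != os.environ["MEM0_API_TOKEN"]:
        raise HTTPException(status_code=401, detail="Invalid token")

@app.get("/v1/stats", dependencies=[Depends(verify_token)])
def stats() -> dict:
    return {"status": "ok"}  # placeholder payload
```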

### Data Privacy

<Check>
All data is stored locally - no cloud sync or external storage.
</Check>

- Supabase instance is local (172.21.0.12)
- Neo4j runs in a Docker container
- User isolation via `user_id` filtering

### Network Security

- Services on a private Docker network
- No public exposure (use a reverse proxy if needed)
- Internal communication only

## Scalability

### Horizontal Scaling

<Tabs>
  <Tab title="REST API">
    Deploy multiple API containers behind a load balancer
  </Tab>
  <Tab title="MCP Server">
    Dedicated instances per client group
  </Tab>
  <Tab title="Mem0 Core">
    Stateless design scales with containers
  </Tab>
</Tabs>

### Vertical Scaling

- **Supabase**: PostgreSQL connection pooling
- **Neo4j**: Memory configuration tuning
- **Vector Indexing**: HNSW for performance

## Technology Choices

| Component | Technology | Why? |
|-----------|-----------|------|
| Core Library | mem0ai | Production-ready, 26% accuracy boost |
| Vector DB | Supabase (pgvector) | Existing infrastructure, PostgreSQL |
| Graph DB | Neo4j | Best-in-class graph database |
| LLM | OpenAI | High-quality embeddings, GPT-4o |
| REST API | FastAPI | Fast, modern, auto-docs |
| MCP Protocol | Python MCP SDK | Official MCP implementation |
| Containers | Docker Compose | Simple orchestration |

## Phase 2: Ollama Integration

**Configuration-driven provider switching:**

```python
# Phase 1 (OpenAI)
"llm": {
    "provider": "openai",
    "config": {"model": "gpt-4o-mini"}
}

# Phase 2 (Ollama)
"llm": {
    "provider": "ollama",
    "config": {
        "model": "llama3.1:8b",
        "ollama_base_url": "http://172.21.0.1:11434"
    }
}
```

**No code changes required** - just environment variables! One way that switch could look is sketched below.
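
The variable names here (`LLM_PROVIDER`, `LLM_MODEL`, `OLLAMA_BASE_URL`) are hypothetical, not the project's documented settings:

```python
import os

# Hypothetical env vars selecting the provider at startup
provider = os.getenv("LLM_PROVIDER", "openai")

llm_config = {
    "provider": provider,
    "config": {"model": os.getenv("LLM_MODEL", "gpt-4o-mini")},
}
if provider == "ollama":
    llm_config["config"]["ollama_base_url"] = os.getenv(
        "OLLAMA_BASE_URL", "http://172.21.0.1:11434"
    )
```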

## Monitoring & Observability

### Metrics to Track

- Memory operations per second
- Average response time
- Vector search latency
- Graph query complexity
- OpenAI token usage

### Logging

- Structured JSON logs (see the sketch after this list)
- Request/response tracking
- Error aggregation
- Performance profiling
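
A stdlib-only sketch of structured JSON logging, assuming nothing about the project's actual logger; field names are illustrative:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Emit one JSON object per log line
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.getLogger("mem0.api").info("memory added")  # illustrative logger name
```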

<Tip>
Use Prometheus + Grafana for production monitoring; a minimal instrumentation sketch follows.
</Tip>
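
If you go that route, a small sketch with the `prometheus_client` library; the metric names are made up for illustration:

```python
from prometheus_client import Counter, Histogram, start_http_server

# Hypothetical metric names covering the list above
MEMORY_OPS = Counter("mem0_memory_ops_total", "Memory operations", ["op"])
SEARCH_LATENCY = Histogram("mem0_search_latency_seconds", "Vector search latency")

start_http_server(9090)  # expose /metrics for Prometheus to scrape

MEMORY_OPS.labels(op="add").inc()
with SEARCH_LATENCY.time():
    pass  # run the vector search here
```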

## Deep Dive Resources

For complete architectural details, see:

- [ARCHITECTURE.md](https://git.colsys.tech/klas/t6_mem0_v2/blob/main/ARCHITECTURE.md)
- [PROJECT_REQUIREMENTS.md](https://git.colsys.tech/klas/t6_mem0_v2/blob/main/PROJECT_REQUIREMENTS.md)

## Next Steps

<CardGroup cols={2}>
  <Card title="Setup Supabase" icon="database" href="/setup/supabase">
    Configure vector store
  </Card>
  <Card title="Setup Neo4j" icon="diagram-project" href="/setup/neo4j">
    Configure graph database
  </Card>
  <Card title="API Reference" icon="code" href="/api-reference/introduction">
    Explore endpoints
  </Card>
  <Card title="MCP Integration" icon="plug" href="/mcp/introduction">
    Connect with Claude Code
  </Card>
</CardGroup>