# T6 Mem0 v2 - System Architecture

## Executive Summary

T6 Mem0 v2 is a memory system for LLM applications built on mem0.ai. It exposes an MCP server and a REST API, stores data in a hybrid backend (Supabase + Neo4j), and uses OpenAI embeddings.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                      Client Layer                            │
├──────────────────┬──────────────────┬──────────────────────┤
│ Claude Code (MCP)│  N8N Workflows   │  External Apps       │
└──────────────────┴──────────────────┴──────────────────────┘
         │                    │                    │
         │                    │                    │
         ▼                    ▼                    ▼
┌─────────────────────────────────────────────────────────────┐
│                   Interface Layer                            │
├──────────────────────────────┬──────────────────────────────┤
│   MCP Server (Port 8765)     │   REST API (Port 8080)       │
│   - SSE Connections          │   - FastAPI                  │
│   - MCP Protocol             │   - OpenAPI Spec             │
│   - Tool Registration        │   - Auth Middleware          │
└──────────────────────────────┴──────────────────────────────┘
                    │                    │
                    └────────┬───────────┘
                             ▼
┌─────────────────────────────────────────────────────────────┐
│                    Core Layer                                │
│                 Mem0 Core Library                            │
│  - Memory Management   - Embedding Generation                │
│  - Semantic Search     - Relationship Extraction             │
│  - Multi-Agent Support - Deduplication                       │
└─────────────────────────────────────────────────────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│  Vector Store   │ │   Graph Store   │ │  External LLM   │
│   Supabase      │ │     Neo4j       │ │    OpenAI       │
│   (pgvector)    │ │   (Cypher)      │ │  (Embeddings)   │
│ 172.21.0.12     │ │  172.21.0.x     │ │   API Cloud     │
└─────────────────┘ └─────────────────┘ └─────────────────┘
```

## Design Decisions

### 1. MCP Server Approach: Custom Implementation ✅

**Decision**: Build custom MCP server using mem0 core library with Supabase + Neo4j

**Rationale**:
- Official OpenMemory MCP uses Qdrant (requirement is Supabase)
- Community implementations provide good templates but need customization
- Custom build ensures exact stack matching and full control

**Implementation**:
- Python-based MCP server using `mcp` library
- SSE (Server-Sent Events) for MCP protocol communication
- Shares mem0 configuration with REST API

### 2. Storage Architecture: Hybrid Multi-Store ✅

**Vector Storage** (Supabase + pgvector):
- Semantic search via cosine similarity
- 1536-dimensional embeddings (OpenAI text-embedding-3-small)
- PostgreSQL with pgvector extension
- Connection: `172.21.0.12:5432`

**Graph Storage** (Neo4j):
- Relationship modeling between memory nodes
- Entity extraction and connection
- Visual exploration via Neo4j Browser
- New container on localai network

**Key-Value Storage** (PostgreSQL JSONB):
- Metadata storage in Supabase
- Eliminates need for separate Redis
- Simplifies infrastructure

**Why This Works**:
- Mem0's hybrid architecture expects multiple stores
- Each store optimized for specific query patterns
- Supabase handles both vector and structured data
- Neo4j specializes in relationship queries
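
To make the vector-store query pattern concrete, here is a minimal pure-Python sketch of cosine-similarity ranking. In the deployed system pgvector performs this inside PostgreSQL (its `<=>` operator returns cosine distance, i.e. 1 minus the similarity). The function names `cosine_similarity` and `top_k` and the toy 2-dimensional vectors are illustrative assumptions, not part of mem0 or Supabase.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors; avoids division by zero
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank (memory_id, vector) rows by similarity to the query, best first."""
    scored = [(memory_id, cosine_similarity(query, vec)) for memory_id, vec in rows]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

Real embeddings have 1536 dimensions rather than 2, and an HNSW index returns approximate rather than exhaustive results, but the ranking idea is the same.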

### 3. API Layer Design: Dual Interface Pattern ✅

**REST API**:
- FastAPI framework
- Port: 8080
- Authentication: Bearer token
- OpenAPI documentation
- CRUD operations on memories

**MCP Server**:
- Port: 8765
- MCP protocol (SSE transport)
- Tool-based interface
- Compatible with Claude, Cursor, etc.

**Shared Core**:
- Both use same mem0 configuration
- Single source of truth for storage
- Consistent behavior across interfaces
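
One way to keep the two interfaces on a single configuration is a small shared module that both entry points import. The sketch below is an assumption about how that could look; `get_memory_config`, `rest_handler_config`, and `mcp_tool_config` are hypothetical names, and the dictionary is trimmed to two keys for brevity.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_memory_config() -> dict:
    """Build the shared configuration once per process (hypothetical helper)."""
    return {
        "vector_store": {"provider": "supabase"},
        "graph_store": {"provider": "neo4j"},
    }


def rest_handler_config() -> dict:
    """What a REST route would read before calling the memory layer."""
    return get_memory_config()


def mcp_tool_config() -> dict:
    """What an MCP tool would read before calling the memory layer."""
    return get_memory_config()
```

Because the REST API and MCP server run on separate ports, they would typically be separate processes, so the cache is per process. The consistency comes from both importing the same module and reading the same environment.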

### 4. Docker Networking: LocalAI Network Integration ✅

**Network**: `localai` (172.21.0.0/16)

**Services**:
```yaml
Existing:
- Supabase: 172.21.0.12:5432
- N8N: (existing container)

New:
- Neo4j: 172.21.0.x:7687 (Bolt) + :7474 (Browser)
- REST API: 172.21.0.x:8080
- MCP Server: 172.21.0.x:8765
```

**Benefits**:
- All services on same network
- Direct container-to-container communication
- No host networking complications
- Persistent IPs via Docker Compose
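
When assigning addresses to the new containers, a quick sanity check that each one falls inside the `localai` subnet can catch typos early. This is a small sketch using Python's standard `ipaddress` module; the helper name `on_localai_network` is an assumption for illustration.

```python
import ipaddress

# Subnet of the `localai` Docker network, as stated above.
LOCALAI_SUBNET = ipaddress.ip_network("172.21.0.0/16")


def on_localai_network(address: str) -> bool:
    """Return True if the IPv4 address belongs to the localai subnet."""
    return ipaddress.ip_address(address) in LOCALAI_SUBNET
```

For example, `on_localai_network("172.21.0.12")` is true for the existing Supabase container, while an address such as `172.22.0.5` would be rejected.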

### 5. Phase 1 vs Phase 2: Provider Abstraction ✅

**Phase 1** (OpenAI):
```python
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1
        }
    },
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"
        }
    }
}
```

**Phase 2** (Ollama):
```python
config = {
    "llm": {
        "provider": "ollama",
        "config": {
            "model": "llama3.1:8b",
            "ollama_base_url": "http://172.21.0.1:11434"
        }
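    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text",
            # nomic-embed-text outputs 768-dimensional vectors, not 1536
            "embedding_dims": 768,
            # Point the embedder at the same Ollama host as the LLM
            "ollama_base_url": "http://172.21.0.1:11434"
        }
    }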
}
```

**Strategy**:
- Configuration-driven provider selection
- Environment variable overrides
- No code changes for provider swap
- Mem0 natively supports both providers
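
The strategy above could be implemented roughly as follows. This is a sketch under stated assumptions: the environment variable names (`MEM0_PROVIDER`, `MEM0_LLM_MODEL`, `MEM0_EMBEDDER_MODEL`, `OLLAMA_BASE_URL`) and the helper `build_provider_config` are hypothetical; only the model names and the Ollama URL come from the Phase 1 and Phase 2 examples.

```python
import os
from typing import Mapping, Optional

# Defaults mirror the Phase 1 (OpenAI) and Phase 2 (Ollama) examples.
PROVIDER_DEFAULTS = {
    "openai": {"llm": "gpt-4o-mini", "embedder": "text-embedding-3-small"},
    "ollama": {"llm": "llama3.1:8b", "embedder": "nomic-embed-text"},
}


def build_provider_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the llm/embedder sections from environment variables."""
    env = os.environ if env is None else env
    provider = env.get("MEM0_PROVIDER", "openai")  # hypothetical variable name
    if provider not in PROVIDER_DEFAULTS:
        raise ValueError(f"unsupported provider: {provider!r}")

    defaults = PROVIDER_DEFAULTS[provider]
    llm_config = {"model": env.get("MEM0_LLM_MODEL", defaults["llm"])}
    embedder_config = {"model": env.get("MEM0_EMBEDDER_MODEL", defaults["embedder"])}

    if provider == "ollama":
        # Both sections need the Ollama host; otherwise each uses its own default.
        base_url = env.get("OLLAMA_BASE_URL", "http://172.21.0.1:11434")
        llm_config["ollama_base_url"] = base_url
        embedder_config["ollama_base_url"] = base_url

    return {
        "llm": {"provider": provider, "config": llm_config},
        "embedder": {"provider": provider, "config": embedder_config},
    }
```

Passing an explicit mapping (for example `build_provider_config({"MEM0_PROVIDER": "ollama"})`) keeps the function easy to test without touching the real environment.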

## Component Details

### Mem0 Configuration

```python
import os

from mem0 import Memory

config = {
    # Vector Store
    "vector_store": {
        "provider": "supabase",
        "config": {
            # Placeholder credentials; supply the real connection string from the environment
            "connection_string": "postgresql://user:pass@172.21.0.12:5432/postgres",
            "collection_name": "t6_memories",
            "embedding_model_dims": 1536,  # must match the embedder's output size
            "index_method": "hnsw",
            "index_measure": "cosine_distance"
        }
    },

    # Graph Store
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": "neo4j://172.21.0.x:7687",
            "username": "neo4j",
            # Python does not expand "${...}" in string literals; read the variable explicitly
            "password": os.environ["NEO4J_PASSWORD"]
        }
    },

    # LLM Provider
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.1,
            "max_tokens": 2000
        }
    },

    # Embedder
    "embedder": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small",
            "embedding_dims": 1536
        }
    },

    # Version
    "version": "v1.1"
}

memory = Memory.from_config(config_dict=config)
```

### REST API Endpoints

```
POST   /v1/memories/          - Add new memory
GET    /v1/memories/{id}      - Get specific memory
GET    /v1/memories/search    - Search memories
PATCH  /v1/memories/{id}      - Update memory
DELETE /v1/memories/{id}      - Delete memory
GET    /v1/memories/user/{id} - Get user memories
GET    /v1/health             - Health check
GET    /v1/stats              - System statistics
```

### MCP Server Tools

```
add_memory              - Add new memory to system
search_memories         - Search memories by query
get_memory              - Retrieve specific memory
update_memory           - Update existing memory
delete_memory           - Remove memory
list_user_memories      - List all memories for user
get_memory_graph        - Visualize memory relationships
```
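
The tool-based interface boils down to a mapping from tool names to handler functions. The sketch below shows that dispatch pattern with the standard library only; it is not the Python MCP SDK's API, and `TOOLS`, `tool`, `call_tool`, and the in-memory `_STORE` are hypothetical stand-ins for illustration.

```python
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., dict]] = {}
_STORE: Dict[str, dict] = {}  # stand-in for the mem0 backend


def tool(name: str):
    """Register a function under an MCP-style tool name."""
    def decorator(func: Callable[..., dict]) -> Callable[..., dict]:
        TOOLS[name] = func
        return func
    return decorator


@tool("add_memory")
def add_memory(user_id: str, text: str) -> dict:
    memory_id = f"mem-{len(_STORE) + 1}"
    _STORE[memory_id] = {"user_id": user_id, "text": text}
    return {"id": memory_id}


@tool("get_memory")
def get_memory(memory_id: str) -> dict:
    if memory_id not in _STORE:
        raise KeyError(f"unknown memory: {memory_id}")
    return {"id": memory_id, **_STORE[memory_id]}


def call_tool(name: str, arguments: dict) -> dict:
    """Look up a registered tool and invoke it with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)
```

The remaining tools in the list would register the same way, each delegating to the shared memory layer instead of the dictionary used here.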

## Data Flow

### Adding a Memory

```
1. Client → MCP/REST API
   POST memory data with user_id

2. Interface Layer → Mem0 Core
   Validate and process request

3. Mem0 Core → OpenAI
   Generate embeddings (1536-dim vector)

4. Mem0 Core → Supabase
   Store vector + metadata in PostgreSQL

5. Mem0 Core → Neo4j
   Extract entities and relationships
   Create nodes and edges in graph

6. Response → Client
   Return memory_id and confirmation
```
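
The six steps can be traced in a small in-memory sketch. Everything here is a stand-in: `fake_embed` replaces the OpenAI embedding call with a deterministic hash (it is not semantic), `InMemoryStores` replaces Supabase and Neo4j, and entities are passed in by the caller rather than extracted by an LLM. The names are hypothetical and chosen only to mirror the flow.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


def fake_embed(text: str, dims: int = 8) -> List[float]:
    """Stand-in for the embedding call: deterministic, NOT semantic."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dims]]


@dataclass
class InMemoryStores:
    vectors: Dict[str, dict] = field(default_factory=dict)                 # Supabase stand-in
    edges: List[Tuple[str, str, str, str]] = field(default_factory=list)   # Neo4j stand-in


def add_memory_flow(
    stores: InMemoryStores, user_id: str, text: str, entities: Iterable[str] = ()
) -> dict:
    # Step 2: validate the request
    if not user_id or not text.strip():
        raise ValueError("user_id and non-empty text are required")
    memory_id = str(uuid.uuid4())
    # Step 3: generate the embedding
    vector = fake_embed(text)
    # Step 4: store vector + metadata
    stores.vectors[memory_id] = {"user_id": user_id, "text": text, "vector": vector}
    # Step 5: record relationships in the graph
    for entity in entities:
        stores.edges.append((user_id, "MENTIONS", entity, memory_id))
    # Step 6: return the id and confirmation
    return {"id": memory_id, "status": "created"}
```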

### Searching Memories

```
1. Client → MCP/REST API
   Search query + filters (user_id, etc.)

2. Interface Layer → Mem0 Core
   Process search request

3. Mem0 Core → OpenAI
   Generate query embedding

4. Mem0 Core → Supabase
   Vector similarity search (cosine)
   Retrieve top-k matches

5. Mem0 Core → Neo4j
   Fetch related graph context
   Enrich results with relationships

6. Response → Client
   Ranked results with relevance scores
```
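
Steps 4 to 6 of the search flow (filter, rank, enrich) can be sketched as below. The input `hits` represents what the vector store would return after scoring, and `edges` represents relationships read from the graph; both shapes, and the function name `search_flow`, are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]   # (memory_id, user_id, similarity score)
Edge = Tuple[str, str, str]    # (memory_id, relation, entity)


def search_flow(
    hits: Iterable[Hit], edges: Iterable[Edge], user_id: str, k: int = 5
) -> List[Dict]:
    """Keep the user's own hits, take the top k, attach graph context."""
    owned = [hit for hit in hits if hit[1] == user_id]
    ranked = sorted(owned, key=lambda hit: hit[2], reverse=True)[:k]

    # Group graph edges by the memory they belong to.
    by_memory: Dict[str, List[dict]] = {}
    for memory_id, relation, entity in edges:
        by_memory.setdefault(memory_id, []).append(
            {"relation": relation, "entity": entity}
        )

    return [
        {"id": memory_id, "score": score, "relations": by_memory.get(memory_id, [])}
        for memory_id, _owner, score in ranked
    ]
```

In the real system the `user_id` filter and the top-k limit would be pushed down into the vector query rather than applied in Python.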

## Performance Characteristics

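Figures reported by mem0.ai's published research, not measured on this deployment:

- **Accuracy**: 26% improvement over OpenAI's memory baseline (LLM-as-a-Judge score on the LOCOMO benchmark)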
- **Latency**: 91% lower p95 than full-context approaches
- **Token Efficiency**: 90% reduction via selective memory retrieval
- **Storage**: Hybrid approach optimal for different query patterns

## Security Considerations

### Authentication
- Bearer token authentication for REST API
- MCP server uses client-specific SSE endpoints
- Tokens stored in environment variables
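
A minimal sketch of the Bearer token check follows, using only the standard library. The helper names `extract_bearer_token` and `is_authorized` are hypothetical; in the REST API this logic would sit in the auth middleware, with the expected token read from an environment variable.

```python
import hmac
from typing import Optional


def extract_bearer_token(authorization_header: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header value."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Compare the presented token with the expected one in constant time."""
    token = extract_bearer_token(authorization_header)
    if token is None or not expected_token:
        return False  # reject missing headers and an unset server-side token
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

`hmac.compare_digest` is used instead of `==` so that comparison time does not leak how many leading characters matched.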

### Data Privacy
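- All data at rest is stored locally (Supabase + Neo4j)
- No cloud sync or external storage; in Phase 1, however, memory text is sent to the OpenAI API for embedding and LLM processing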
- User isolation via user_id filtering

### Network Security
- Services on private Docker network
- No public exposure (use reverse proxy if needed)
- Internal communication only

## Scalability Considerations

### Horizontal Scaling
- REST API: Multiple containers behind load balancer
- MCP Server: Dedicated instances per client group
- Mem0 Core: Stateless, scales with API containers

### Vertical Scaling
- Supabase: PostgreSQL connection pooling
- Neo4j: Memory configuration tuning
- Vector indexing: HNSW for performance

## Monitoring & Observability

### Metrics
- Memory operations (add/search/delete) per second
- Average response time
- Vector store query latency
- Graph query complexity
- Token usage (OpenAI API)

### Logging
- Structured logging (JSON)
- Request/response tracking
- Error aggregation
- Performance profiling
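
Structured JSON logging can be done with the standard `logging` module and a custom formatter. This is one possible sketch; the class name `JsonFormatter` and the extra field names (`request_id`, `user_id`, `duration_ms`) are assumptions, not an established schema for this project.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    EXTRA_FIELDS = ("request_id", "user_id", "duration_ms")

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for key in self.EXTRA_FIELDS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

To use it, attach the formatter to a handler (`handler.setFormatter(JsonFormatter())`) and log with `logger.info("memory added", extra={"request_id": "req-1", "duration_ms": 12})`.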

## Migration Path to Phase 2 (Ollama)

### Changes Required
1. Update configuration to use Ollama provider
2. Deploy Ollama container on localai network
3. Pull required models (llama3.1, nomic-embed-text)
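4. Update embedding dimensions: nomic-embed-text produces 768-dimensional vectors rather than 1536, so the vector collection must be recreated and existing memories re-embedded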
5. Test and validate performance
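
A dimension mismatch between the embedder and the vector collection is the most likely failure when swapping providers, so a startup check is worth having. The sketch below assumes the configuration layout used in this document; `KNOWN_EMBEDDING_DIMS` and `check_embedding_dims` are hypothetical names.

```python
# Output sizes of the two embedding models named in this document.
KNOWN_EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "nomic-embed-text": 768,
}


def check_embedding_dims(config: dict) -> int:
    """Raise ValueError if the vector store size does not match the embedder."""
    model = config["embedder"]["config"]["model"]
    store_dims = config["vector_store"]["config"]["embedding_model_dims"]
    expected = KNOWN_EMBEDDING_DIMS.get(model)
    if expected is None:
        raise ValueError(f"unknown embedding model: {model!r}")
    if expected != store_dims:
        raise ValueError(
            f"{model} produces {expected}-dim vectors, "
            f"but the vector store is configured for {store_dims}"
        )
    return expected
```

Running this before constructing the memory object turns a confusing insert-time database error into an explicit configuration error.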

### No Changes Required
- Storage services (Supabase + Neo4j); only the vector collection changes if the embedding dimension changes
- API interfaces (REST + MCP)
- Docker networking
- Client integrations

## Technology Stack Summary

| Layer | Technology | Version | Purpose |
|-------|-----------|---------|---------|
| Core | mem0ai | latest | Memory management |
| Vector DB | Supabase (pgvector) | existing | Semantic search |
| Graph DB | Neo4j | 5.x | Relationships |
| LLM | OpenAI API | latest | Embeddings + reasoning |
| REST API | FastAPI | 0.115+ | HTTP interface |
| MCP Server | Python MCP SDK | latest | MCP protocol |
| Containerization | Docker Compose | latest | Orchestration |
| Documentation | Mintlify | latest | Docs site |

## Next Steps

1. Initialize Git repository
2. Set up Docker Compose configuration
3. Configure Supabase migrations (pgvector + tables)
4. Deploy Neo4j container
5. Implement REST API with FastAPI
6. Build MCP server
7. Create Mintlify documentation site
8. Testing and validation
9. Push to git repository

---

**Last Updated**: 2025-10-13
**Status**: Architecture Design Complete
**Next Phase**: Implementation