{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Neptune as Graph Memory\n", "\n", "In this notebook, we will be connecting using a Amazon Neptune Analytics instance as our memory graph storage for Mem0.\n", "\n", "The Graph Memory storage persists memories in a graph or relationship form when performing `m.add` memory operations. It then uses vector distance algorithms to find related memories during a `m.search` operation. Relationships are returned in the result, and add context to the memories.\n", "\n", "Reference: [Vector Similarity using Neptune Analytics](https://docs.aws.amazon.com/neptune-analytics/latest/userguide/vector-similarity.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "\n", "### 1. Install Mem0 with Graph Memory support \n", "\n", "To use Mem0 with Graph Memory support, install it using pip:\n", "\n", "```bash\n", "pip install \"mem0ai[graph]\"\n", "```\n", "\n", "This command installs Mem0 along with the necessary dependencies for graph functionality.\n", "\n", "### 2. Connect to Neptune\n", "\n", "To connect to Amazon Neptune Analytics, you need to configure Neptune with your Amazon profile credentials. The best way to do this is to declare environment variables with IAM permission to your Neptune Analytics instance. The `graph-identifier` for the instance to persist memories needs to be defined in the Mem0 configuration under `\"graph_store\"`, with the `\"neptune\"` provider. Note that the Neptune Analytics instance needs to have `vector-search-configuration` defined to meet the needs of the llm model's vector dimensions, see: https://docs.aws.amazon.com/neptune-analytics/latest/userguide/vector-index.html.\n", "\n", "```python\n", "embedding_dimensions = 1536\n", "graph_identifier = \"\" # graph with 1536 dimensions for vector search\n", "config = {\n", " \"embedder\": {\n", " \"provider\": \"openai\",\n", " \"config\": {\n", " \"model\": \"text-embedding-3-large\",\n", " \"embedding_dims\": embedding_dimensions\n", " },\n", " },\n", " \"graph_store\": {\n", " \"provider\": \"neptune\",\n", " \"config\": {\n", " \"endpoint\": f\"neptune-graph://{graph_identifier}\",\n", " },\n", " },\n", "}\n", "```\n", "\n", "### 3. Configure OpenSearch\n", "\n", "We're going to use OpenSearch as our vector store. You can run [OpenSearch from docker image](https://docs.opensearch.org/docs/latest/install-and-configure/install-opensearch/docker/):\n", "\n", "```bash\n", "docker pull opensearchproject/opensearch:2\n", "```\n", "\n", "And verify that it's running with a ``:\n", "\n", "```bash\n", " docker run -d -p 9200:9200 -p 9600:9600 -e \"discovery.type=single-node\" -e \"OPENSEARCH_INITIAL_ADMIN_PASSWORD=\" opensearchproject/opensearch:latest\n", "\n", " curl https://localhost:9200 -ku admin:\n", "```\n", "\n", "We're going to connect [OpenSearch using the python client](https://github.com/opensearch-project/opensearch-py):\n", "\n", "```bash\n", "pip install \"opensearch-py\"\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configuration\n", "\n", "Do all the imports and configure OpenAI (enter your OpenAI API key):" ] }, { "cell_type": "code", "metadata": { "ExecuteTime": { "end_time": "2025-07-03T20:52:48.330121Z", "start_time": "2025-07-03T20:52:47.092369Z" } }, "source": [ "from mem0 import Memory\n", "import os\n", "import logging\n", "import sys\n", "\n", "logging.getLogger(\"mem0.graphs.neptune.main\").setLevel(logging.DEBUG)\n", "logging.getLogger(\"mem0.graphs.neptune.base\").setLevel(logging.DEBUG)\n", "logger = logging.getLogger(__name__)\n", "logger.setLevel(logging.DEBUG)\n", "\n", "logging.basicConfig(\n", " format=\"%(levelname)s - %(message)s\",\n", " datefmt=\"%Y-%m-%d %H:%M:%S\",\n", " stream=sys.stdout, # Explicitly set output to stdout\n", ")" ], "outputs": [], "execution_count": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Setup the Mem0 configuration using:\n", "- openai as the embedder\n", "- Amazon Neptune Analytics instance as a graph store\n", "- OpenSearch as the vector store" ] }, { "cell_type": "code", "metadata": { "ExecuteTime": { "end_time": "2025-07-03T20:52:50.958741Z", "start_time": "2025-07-03T20:52:50.955127Z" } }, "source": [ "graph_identifier = os.environ.get(\"GRAPH_ID\")\n", "opensearch_username = os.environ.get(\"OS_USERNAME\")\n", "opensearch_password = os.environ.get(\"OS_PASSWORD\")\n", "config = {\n", " \"embedder\": {\n", " \"provider\": \"openai\",\n", " \"config\": {\"model\": \"text-embedding-3-large\", \"embedding_dims\": 1536},\n", " },\n", " \"graph_store\": {\n", " \"provider\": \"neptune\",\n", " \"config\": {\n", " \"endpoint\": f\"neptune-graph://{graph_identifier}\",\n", " },\n", " },\n", " \"vector_store\": {\n", " \"provider\": \"opensearch\",\n", " \"config\": {\n", " \"collection_name\": \"vector_store\",\n", " \"host\": \"localhost\",\n", " \"port\": 9200,\n", " \"user\": opensearch_username,\n", " \"password\": opensearch_password,\n", " \"use_ssl\": False,\n", " \"verify_certs\": False,\n", " },\n", " },\n", "}" ], "outputs": [], "execution_count": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graph Memory initializiation\n", "\n", "Initialize Memgraph as a Graph Memory store:" ] }, { "cell_type": "code", "metadata": { "ExecuteTime": { "end_time": "2025-07-03T20:52:55.655673Z", "start_time": "2025-07-03T20:52:54.141041Z" } }, "source": [ "m = Memory.from_config(config_dict=config)\n", "\n", "app_id = \"movies\"\n", "user_id = \"alice\"\n", "\n", "m.delete_all(user_id=user_id)" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "WARNING - Creating index vector_store, it might take 1-2 minutes...\n", "WARNING - Creating index mem0migrations, it might take 1-2 minutes...\n", "DEBUG - delete_all query=\n", " MATCH (n {user_id: $user_id})\n", " DETACH DELETE n\n", " \n" ] }, { "data": { "text/plain": [ "{'message': 'Memories deleted successfully!'}" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Store memories\n", "\n", "Create memories and store one at a time:" ] }, { "cell_type": "code", "metadata": { "ExecuteTime": { "end_time": "2025-07-03T20:53:05.338249Z", "start_time": "2025-07-03T20:52:57.528210Z" } }, "source": [ "messages = [\n", " {\n", " \"role\": \"user\",\n", " \"content\": \"I'm planning to watch a movie tonight. Any recommendations?\",\n", " },\n", "]\n", "\n", "# Store inferred memories (default behavior)\n", "result = m.add(messages, user_id=user_id, metadata={\"category\": \"movie_recommendations\"})\n", "\n", "all_results = m.get_all(user_id=user_id)\n", "for n in all_results[\"results\"]:\n", " print(f\"node \\\"{n['memory']}\\\": [hash: {n['hash']}]\")\n", "\n", "for e in all_results[\"relations\"]:\n", " print(f\"edge \\\"{e['source']}\\\" --{e['relationship']}--> \\\"{e['target']}\\\"\")" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DEBUG - Extracted entities: [{'source': 'alice', 'relationship': 'plans_to_watch', 'destination': 'movie'}]\n", "DEBUG - _search_graph_db\n", " query=\n", " MATCH (n )\n", " WHERE n.user_id = $user_id\n", " WITH n, $n_embedding as n_embedding\n", " CALL neptune.algo.vectors.distanceByEmbedding(\n", " n_embedding,\n", " n,\n", " {metric:\"CosineSimilarity\"}\n", " ) YIELD distance\n", " WITH n, distance as similarity\n", " WHERE similarity >= $threshold\n", " CALL {\n", " WITH n\n", " MATCH (n)-[r]->(m) \n", " RETURN n.name AS source, id(n) AS source_id, type(r) AS relationship, id(r) AS relation_id, m.name AS destination, id(m) AS destination_id\n", " UNION ALL\n", " WITH n\n", " MATCH (m)-[r]->(n) \n", " RETURN m.name AS source, id(m) AS source_id, type(r) AS relationship, id(r) AS relation_id, n.name AS destination, id(n) AS destination_id\n", " }\n", " WITH distinct source, source_id, relationship, relation_id, destination, destination_id, similarity\n", " RETURN source, source_id, relationship, relation_id, destination, destination_id, similarity\n", " ORDER BY similarity DESC\n", " LIMIT $limit\n", " \n", "DEBUG - Deleted relationships: []\n", "DEBUG - _search_source_node\n", " query=\n", " MATCH (source_candidate )\n", " WHERE source_candidate.user_id = $user_id \n", "\n", " WITH source_candidate, $source_embedding as v_embedding\n", " CALL neptune.algo.vectors.distanceByEmbedding(\n", " v_embedding,\n", " source_candidate,\n", " {metric:\"CosineSimilarity\"}\n", " ) YIELD distance\n", " WITH source_candidate, distance AS cosine_similarity\n", " WHERE cosine_similarity >= $threshold\n", "\n", " WITH source_candidate, cosine_similarity\n", " ORDER BY cosine_similarity DESC\n", " LIMIT 1\n", "\n", " RETURN id(source_candidate), cosine_similarity\n", " \n", "DEBUG - _search_destination_node\n", " query=\n", " MATCH (destination_candidate )\n", " WHERE destination_candidate.user_id = $user_id\n", " \n", " WITH destination_candidate, $destination_embedding as v_embedding\n", " CALL neptune.algo.vectors.distanceByEmbedding(\n", " v_embedding,\n", " destination_candidate, \n", " {metric:\"CosineSimilarity\"}\n", " ) YIELD distance\n", " WITH destination_candidate, distance AS cosine_similarity\n", " WHERE cosine_similarity >= $threshold\n", "\n", " WITH destination_candidate, cosine_similarity\n", " ORDER BY cosine_similarity DESC\n", " LIMIT 1\n", " \n", " RETURN id(destination_candidate), cosine_similarity\n", " \n", "DEBUG - _add_entities:\n", " destination_node_search_result=[]\n", " source_node_search_result=[]\n", " query=\n", " MERGE (n :`__User__` {name: $source_name, user_id: $user_id})\n", " ON CREATE SET n.created = timestamp(),\n", " n.mentions = 1\n", " \n", " ON MATCH SET n.mentions = coalesce(n.mentions, 0) + 1\n", " WITH n, $source_embedding as source_embedding\n", " CALL neptune.algo.vectors.upsert(n, source_embedding)\n", " WITH n\n", " MERGE (m :`entertainment` {name: $dest_name, user_id: $user_id})\n", " ON CREATE SET m.created = timestamp(),\n", " m.mentions = 1\n", " \n", " ON MATCH SET m.mentions = coalesce(m.mentions, 0) + 1\n", " WITH n, m, $dest_embedding as dest_embedding\n", " CALL neptune.algo.vectors.upsert(m, dest_embedding)\n", " WITH n, m\n", " MERGE (n)-[rel:plans_to_watch]->(m)\n", " ON CREATE SET rel.created = timestamp(), rel.mentions = 1\n", " ON MATCH SET rel.mentions = coalesce(rel.mentions, 0) + 1\n", " RETURN n.name AS source, type(rel) AS relationship, m.name AS target\n", " \n", "DEBUG - Retrieved 1 relationships\n", "node \"Planning to watch a movie tonight\": [hash: bf55418607cfdca4afa311b5fd8496bd]\n", "edge \"alice\" --plans_to_watch--> \"movie\"\n" ] } ], "execution_count": 4 }, { "cell_type": "code", "metadata": { "ExecuteTime": { "end_time": "2025-07-03T20:53:17.755933Z", "start_time": "2025-07-03T20:53:11.568772Z" } }, "source": [ "messages = [\n", " {\n", " \"role\": \"assistant\",\n", " \"content\": \"How about a thriller movies? They can be quite engaging.\",\n", " },\n", "]\n", "\n", "# Store inferred memories (default behavior)\n", "result = m.add(messages, user_id=user_id, metadata={\"category\": \"movie_recommendations\"})\n", "\n", "all_results = m.get_all(user_id=user_id)\n", "for n in all_results[\"results\"]:\n", " print(f\"node \\\"{n['memory']}\\\": [hash: {n['hash']}]\")\n", "\n", "for e in all_results[\"relations\"]:\n", " print(f\"edge \\\"{e['source']}\\\" --{e['relationship']}--> \\\"{e['target']}\\\"\")" ], "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DEBUG - Extracted entities: [{'source': 'thriller_movies', 'relationship': 'is_engaging', 'destination': 'thriller_movies'}]\n", "DEBUG - _search_graph_db\n", " query=\n", " MATCH (n )\n", " WHERE n.user_id = $user_id\n", " WITH n, $n_embedding as n_embedding\n", " CALL neptune.algo.vectors.distanceByEmbedding(\n", " n_embedding,\n", " n,\n", " {metric:\"CosineSimilarity\"}\n", " ) YIELD distance\n", " WITH n, distance as similarity\n", " WHERE similarity >= $threshold\n", " CALL {\n", " WITH n\n", " MATCH (n)-[r]->(m) \n", " RETURN n.name AS source, id(n) AS source_id, type(r) AS relationship, id(r) AS relation_id, m.name AS destination, id(m) AS destination_id\n", " UNION ALL\n", " WITH n\n", " MATCH (m)-[r]->(n) \n", " RETURN m.name AS source, id(m) AS source_id, type(r) AS relationship, id(r) AS relation_id, n.name AS destination, id(n) AS destination_id\n", " }\n", " WITH distinct source, source_id, relationship, relation_id, destination, destination_id, similarity\n", " RETURN source, source_id, relationship, relation_id, destination, destination_id, similarity\n", " ORDER BY similarity DESC\n", " LIMIT $limit\n", " \n", "DEBUG - Deleted relationships: []\n", "DEBUG - _search_source_node\n", " query=\n", " MATCH (source_candidate )\n", " WHERE source_candidate.user_id = $user_id \n", "\n", " WITH source_candidate, $source_embedding as v_embedding\n", " CALL neptune.algo.vectors.distanceByEmbedding(\n", " v_embedding,\n", " source_candidate,\n", " {metric:\"CosineSimilarity\"}\n", " ) YIELD distance\n", " WITH source_candidate, distance AS cosine_similarity\n", " WHERE cosine_similarity >= $threshold\n", "\n", " WITH source_candidate, cosine_similarity\n", " ORDER BY cosine_similarity DESC\n", " LIMIT 1\n", "\n", " RETURN id(source_candidate), cosine_similarity\n", " \n", "DEBUG - _search_destination_node\n", " query=\n", " MATCH (destination_candidate )\n", " WHERE destination_candidate.user_id = $user_id\n", " \n", " WITH destination_candidate, $destination_embedding as v_embedding\n", " CALL neptune.algo.vectors.distanceByEmbedding(\n", " v_embedding,\n", " destination_candidate, \n", " {metric:\"CosineSimilarity\"}\n", " ) YIELD distance\n", " WITH destination_candidate, distance AS cosine_similarity\n", " WHERE cosine_similarity >= $threshold\n", "\n", " WITH destination_candidate, cosine_similarity\n", " ORDER BY cosine_similarity DESC\n", " LIMIT 1\n", " \n", " RETURN id(destination_candidate), cosine_similarity\n", " \n", "DEBUG - _add_entities:\n", " destination_node_search_result=[{'id(destination_candidate)': '67c49d52-e305-47fe-9fce-2cd5adc5d83c0', 'cosine_similarity': 0.999999}]\n", " source_node_search_result=[{'id(source_candidate)': '67c49d52-e305-47fe-9fce-2cd5adc5d83c0', 'cosine_similarity': 0.999999}]\n", " query=\n", " MATCH (source)\n", " WHERE id(source) = $source_id\n", " SET source.mentions = coalesce(source.mentions, 0) + 1\n", " WITH source\n", " MATCH (destination)\n", " WHERE id(destination) = $destination_id\n", " SET destination.mentions = coalesce(destination.mentions) + 1\n", " MERGE (source)-[r:is_engaging]->(destination)\n", " ON CREATE SET \n", " r.created_at = timestamp(),\n", " r.updated_at = timestamp(),\n", " r.mentions = 1\n", " ON MATCH SET r.mentions = coalesce(r.mentions, 0) + 1\n", " RETURN source.name AS source, type(r) AS relationship, destination.name AS target\n", " \n", "DEBUG - Retrieved 3 relationships\n", "node \"Planning to watch a movie tonight\": [hash: bf55418607cfdca4afa311b5fd8496bd]\n", "edge \"thriller_movies\" --is_a_type_of--> \"movie\"\n", "edge \"alice\" --plans_to_watch--> \"movie\"\n", "edge \"thriller_movies\" --is_engaging--> \"thriller_movies\"\n" ] } ], "execution_count": 6 }, { "cell_type": "code", "metadata": { "jupyter": { "is_executing": true }, "ExecuteTime": { "start_time": "2025-07-03T20:53:17.775656Z" } }, "source": [ "messages = [\n", " {\n", " \"role\": \"user\",\n", " \"content\": \"I'm not a big fan of thriller movies but I love sci-fi movies.\",\n", " },\n", "]\n", "\n", "# Store inferred memories (default behavior)\n", "result = m.add(messages, user_id=user_id, metadata={\"category\": \"movie_recommendations\"})\n", "\n", "all_results = m.get_all(user_id=user_id)\n", "for n in all_results[\"results\"]:\n", " print(f\"node \\\"{n['memory']}\\\": [hash: {n['hash']}]\")\n", "\n", "for e in all_results[\"relations\"]:\n", " print(f\"edge \\\"{e['source']}\\\" --{e['relationship']}--> \\\"{e['target']}\\\"\")" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "metadata": {}, "source": [ "messages = [\n", " {\n", " \"role\": \"assistant\",\n", " \"content\": \"Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future.\",\n", " },\n", "]\n", "\n", "# Store inferred memories (default behavior)\n", "result = m.add(messages, user_id=user_id, metadata={\"category\": \"movie_recommendations\"})\n", "\n", "all_results = m.get_all(user_id=user_id)\n", "for n in all_results[\"results\"]:\n", " print(f\"node \\\"{n['memory']}\\\": [hash: {n['hash']}]\")\n", "\n", "for e in all_results[\"relations\"]:\n", " print(f\"edge \\\"{e['source']}\\\" --{e['relationship']}--> \\\"{e['target']}\\\"\")" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Search memories" ] }, { "cell_type": "code", "metadata": {}, "source": [ "search_results = m.search(\"what does alice love?\", user_id=user_id)\n", "for result in search_results[\"results\"]:\n", " print(f\"\\\"{result['memory']}\\\" [score: {result['score']}]\")\n", "for relation in search_results[\"relations\"]:\n", " print(f\"{relation}\")" ], "outputs": [], "execution_count": null }, { "cell_type": "code", "metadata": {}, "source": [ "m.delete_all(\"user_id\")\n", "m.reset()" ], "outputs": [], "execution_count": null } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.2" } }, "nbformat": 4, "nbformat_minor": 2 }