diff --git a/docs/components/vectordb.mdx b/docs/components/vectordb.mdx
deleted file mode 100644
index 288887c6..00000000
--- a/docs/components/vectordb.mdx
+++ /dev/null
@@ -1,106 +0,0 @@
----
-title: Supported Vector Databases
----
-
-## Overview
-
-Mem0 includes built-in support for various popular databases. Memory can utilize the database provided by the user, ensuring efficient use for specific needs.
-
-
-
-
-
-
-
-
-## Qdrant
-
-[Qdrant](https://qdrant.tech/) is an open-source vector search engine. It is designed to work with large-scale datasets and provides a high-performance search engine for vector data.
-
-To use Qdrant you can do like this:
-
-```python
-import os
-from mem0 import Memory
-
-os.environ["OPENAI_API_KEY"] = "sk-xx"
-
-config = {
-    "vector_store": {
-        "provider": "qdrant",
-        "config": {
-            "collection_name": "test",
-            "host": "localhost",
-            "port": 6333,
-        }
-    }
-}
-
-m = Memory.from_config(config)
-m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
-```
-
-## Chroma
-
-[Chroma](https://www.trychroma.com/) is an AI-native open-source vector database that simplifies building LLM apps by providing tools for storing, embedding, and searching embeddings with a focus on simplicity and speed.
-
-To use ChromaDB you can do like this:
-
-```python
-import os
-from mem0 import Memory
-
-os.environ["OPENAI_API_KEY"] = "sk-xx"
-
-config = {
-    "vector_store": {
-        "provider": "chroma",
-        "config": {
-            "collection_name": "test",
-            "path": "db",
-        }
-    }
-}
-
-m = Memory.from_config(config)
-m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
-```
-
-## pgvector
-
-[pgvector](https://github.com/pgvector/pgvector) is open-source vector similarity search for Postgres. After connecting with postgres run `CREATE EXTENSION IF NOT EXISTS vector;` to create the vector extension.
-
-Here's how to use it:
-
-```python
-import os
-from mem0 import Memory
-
-os.environ["OPENAI_API_KEY"] = "sk-xx"
-
-config = {
-    "vector_store": {
-        "provider": "pgvector",
-        "config": {
-            "user": "test",
-            "password": "123",
-            "host": "127.0.0.1",
-            "port": "5432",
-        }
-    }
-}
-
-m = Memory.from_config(config)
-m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
-```
-
-## Common issues
-
-### Using model with different dimensions
-
-If you are using customized model, which is having different dimensions other than 1536
-for example 768, you may encounter below error:
-
-`ValueError: shapes (0,1536) and (768,) not aligned: 1536 (dim 1) != 768 (dim 0)`
-
-you could add `"embedding_model_dims": 768,` to the config of the vector_store to overcome this issue.
diff --git a/docs/components/vectordbs/config.mdx b/docs/components/vectordbs/config.mdx
new file mode 100644
index 00000000..fe0f1fcd
--- /dev/null
+++ b/docs/components/vectordbs/config.mdx
@@ -0,0 +1,72 @@
+## What is Config?
+
+Config in mem0 is a dictionary that specifies the settings for your vector database. It allows you to customize the behavior and connection details of your chosen vector store.
+
+## How to Define Config
+
+The config is defined as a Python dictionary with two main keys:
+- `vector_store`: Specifies the vector database provider and its configuration
+  - `provider`: The name of the vector database (e.g., "chroma", "pgvector", "qdrant")
+  - `config`: A nested dictionary containing provider-specific settings
+
+## How to Use Config
+
+Here's a general example of how to use the config with mem0:
+
+```python
+import os
+from mem0 import Memory
+
+os.environ["OPENAI_API_KEY"] = "sk-xx"
+
+config = {
+    "vector_store": {
+        "provider": "your_chosen_provider",
+        "config": {
+            # Provider-specific settings go here
+        }
+    }
+}
+
+m = Memory.from_config(config)
+m.add("Your text here", user_id="user", metadata={"category": "example"})
+```
+
+## Why is Config Needed?
+
+Config is essential for:
+1. Specifying which vector database to use.
+2. Providing necessary connection details (e.g., host, port, credentials).
+3. Customizing database-specific settings (e.g., collection name, path).
+4. Ensuring proper initialization and connection to your chosen vector store.
+
+## Master List of All Params in Config
+
+Here's a comprehensive list of all parameters that can be used across different vector databases:
+
+| Parameter | Description |
+|-----------|-------------|
+| `collection_name` | Name of the collection |
+| `embedding_model_dims` | Dimensions of the embedding model |
+| `client` | Custom client for the database |
+| `path` | Path for the database |
+| `host` | Host where the server is running |
+| `port` | Port where the server is running |
+| `user` | Username for database connection |
+| `password` | Password for database connection |
+| `dbname` | Name of the database |
+| `url` | Full URL for the server |
+| `api_key` | API key for the server |
+| `on_disk` | Enable persistent storage |
+
+## Customizing Config
+
+Each vector database has its own specific configuration requirements. To customize the config for your chosen vector store:
+
+1. Identify the vector database you want to use from [supported vector databases](./dbs).
+2. Refer to the `Config` section in the respective vector database's documentation.
+3. Include only the relevant parameters for your chosen database in the `config` dictionary.
+
+## Supported Vector Databases
+
+For detailed information on configuring specific vector databases, please visit the [Supported Vector Databases](./dbs) section. There you'll find individual pages for each supported vector store with provider-specific usage examples and configuration details.
diff --git a/docs/components/vectordbs/dbs/chroma.mdx b/docs/components/vectordbs/dbs/chroma.mdx
new file mode 100644
index 00000000..a5fd527e
--- /dev/null
+++ b/docs/components/vectordbs/dbs/chroma.mdx
@@ -0,0 +1,35 @@
+[Chroma](https://www.trychroma.com/) is an AI-native open-source vector database that simplifies building LLM apps by providing tools for storing, embedding, and searching embeddings with a focus on simplicity and speed.
+
+### Usage
+
+```python
+import os
+from mem0 import Memory
+
+os.environ["OPENAI_API_KEY"] = "sk-xx"
+
+config = {
+    "vector_store": {
+        "provider": "chroma",
+        "config": {
+            "collection_name": "test",
+            "path": "db",
+        }
+    }
+}
+
+m = Memory.from_config(config)
+m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
+```
+
+### Config
+
+Here are the parameters available for configuring Chroma:
+
+| Parameter | Description | Default Value |
+| --- | --- | --- |
+| `collection_name` | The name of the collection | `mem0` |
+| `client` | Custom client for Chroma | `None` |
+| `path` | Path for the Chroma database | `db` |
+| `host` | The host where the Chroma server is running | `None` |
+| `port` | The port where the Chroma server is running | `None` |
\ No newline at end of file
diff --git a/docs/components/vectordbs/dbs/pgvector.mdx b/docs/components/vectordbs/dbs/pgvector.mdx
new file mode 100644
index 00000000..6aa8de05
--- /dev/null
+++ b/docs/components/vectordbs/dbs/pgvector.mdx
@@ -0,0 +1,39 @@
+[pgvector](https://github.com/pgvector/pgvector) is open-source vector similarity search for Postgres. After connecting with postgres run `CREATE EXTENSION IF NOT EXISTS vector;` to create the vector extension.
+
+### Usage
+
+```python
+import os
+from mem0 import Memory
+
+os.environ["OPENAI_API_KEY"] = "sk-xx"
+
+config = {
+    "vector_store": {
+        "provider": "pgvector",
+        "config": {
+            "user": "test",
+            "password": "123",
+            "host": "127.0.0.1",
+            "port": "5432",
+        }
+    }
+}
+
+m = Memory.from_config(config)
+m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
+```
+
+### Config
+
+Here's the parameters available for configuring pgvector:
+
+| Parameter | Description | Default Value |
+| --- | --- | --- |
+| `dbname` | The name of the database | `postgres` |
+| `collection_name` | The name of the collection | `mem0` |
+| `embedding_model_dims` | Dimensions of the embedding model | `1536` |
+| `user` | User name to connect to the database | `None` |
+| `password` | Password to connect to the database | `None` |
+| `host` | The host where the Postgres server is running | `None` |
+| `port` | The port where the Postgres server is running | `None` |
\ No newline at end of file
diff --git a/docs/components/vectordbs/dbs/qdrant.mdx b/docs/components/vectordbs/dbs/qdrant.mdx
new file mode 100644
index 00000000..6097588b
--- /dev/null
+++ b/docs/components/vectordbs/dbs/qdrant.mdx
@@ -0,0 +1,40 @@
+[Qdrant](https://qdrant.tech/) is an open-source vector search engine. It is designed to work with large-scale datasets and provides a high-performance search engine for vector data.
+
+### Usage
+
+```python
+import os
+from mem0 import Memory
+
+os.environ["OPENAI_API_KEY"] = "sk-xx"
+
+config = {
+    "vector_store": {
+        "provider": "qdrant",
+        "config": {
+            "collection_name": "test",
+            "host": "localhost",
+            "port": 6333,
+        }
+    }
+}
+
+m = Memory.from_config(config)
+m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
+```
+
+### Config
+
+Let's see the available parameters for the `qdrant` config:
+
+| Parameter | Description | Default Value |
+| --- | --- | --- |
+| `collection_name` | The name of the collection to store the vectors | `mem0` |
+| `embedding_model_dims` | Dimensions of the embedding model | `1536` |
+| `client` | Custom client for qdrant | `None` |
+| `host` | The host where the qdrant server is running | `None` |
+| `port` | The port where the qdrant server is running | `None` |
+| `path` | Path for the qdrant database | `/tmp/qdrant` |
+| `url` | Full URL for the qdrant server | `None` |
+| `api_key` | API key for the qdrant server | `None` |
+| `on_disk` | For enabling persistent storage | `False` |
\ No newline at end of file
diff --git a/docs/components/vectordbs/overview.mdx b/docs/components/vectordbs/overview.mdx
new file mode 100644
index 00000000..afe143f4
--- /dev/null
+++ b/docs/components/vectordbs/overview.mdx
@@ -0,0 +1,24 @@
+---
+title: Overview
+---
+
+Mem0 includes built-in support for various popular databases. Memory can utilize the database provided by the user, ensuring efficient use for specific needs.
+
+## Usage
+
+To utilize a vector database, you must provide a configuration to customize its usage. If no configuration is supplied, a default configuration will be applied, and `Qdrant` will be used as the vector database.
+
+For a comprehensive list of available parameters for vector database configuration, please refer to [Config](./config).
+
+To view all supported vector databases, visit the [Supported Vector Databases](./dbs).
+
+## Common issues
+
+### Using model with different dimensions
+
+If you are using customized model, which is having different dimensions other than 1536
+for example 768, you may encounter below error:
+
+`ValueError: shapes (0,1536) and (768,) not aligned: 1536 (dim 1) != 768 (dim 0)`
+
+you could add `"embedding_model_dims": 768,` to the config of the vector_store to overcome this issue.
diff --git a/docs/mint.json b/docs/mint.json
index 742e855e..f2effbfd 100644
--- a/docs/mint.json
+++ b/docs/mint.json
@@ -66,8 +66,19 @@
           "pages": ["components/llms"]
         },
         {
-          "group": "Vector Database",
-          "pages": ["components/vectordb"]
+          "group": "Vector Databases",
+          "pages": [
+            "components/vectordbs/overview",
+            "components/vectordbs/config",
+            {
+              "group": "Supported Vector Databases",
+              "pages": [
+                "components/vectordbs/dbs/chroma",
+                "components/vectordbs/dbs/pgvector",
+                "components/vectordbs/dbs/qdrant"
+              ]
+            }
+          ]
         },
         {
           "group": "Embedding Models",