Add configs to VectorDB docs (#1699)

Dev Khant
2024-08-14 00:27:04 +05:30
committed by GitHub
parent 2180b83a8b
commit 64218db7bd
7 changed files with 223 additions and 108 deletions


@@ -1,106 +0,0 @@
---
title: Supported Vector Databases
---
## Overview
Mem0 includes built-in support for several popular vector databases. `Memory` can use the database you provide, so you can choose the store that best fits your needs.
<CardGroup>
<Card title="Qdrant" href="#qdrant"></Card>
<Card title="Chroma" href="#chroma"></Card>
<Card title="pgvector" href="#pgvector"></Card>
</CardGroup>
## Qdrant
[Qdrant](https://qdrant.tech/) is an open-source vector search engine. It is designed to work with large-scale datasets and provides a high-performance search engine for vector data.
To use Qdrant, configure it as follows:
```python
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "test",
            "host": "localhost",
            "port": 6333,
        }
    }
}

m = Memory.from_config(config)
m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
```
## Chroma
[Chroma](https://www.trychroma.com/) is an AI-native open-source vector database that simplifies building LLM apps by providing tools for storing, embedding, and searching embeddings with a focus on simplicity and speed.
To use Chroma, configure it as follows:
```python
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "test",
            "path": "db",
        }
    }
}

m = Memory.from_config(config)
m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
```
## pgvector
[pgvector](https://github.com/pgvector/pgvector) is an open-source vector similarity search extension for Postgres. After connecting to Postgres, run `CREATE EXTENSION IF NOT EXISTS vector;` to create the vector extension.
Here's how to use it:
```python
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "pgvector",
        "config": {
            "user": "test",
            "password": "123",
            "host": "127.0.0.1",
            "port": "5432",
        }
    }
}

m = Memory.from_config(config)
m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
```
## Common issues
### Using model with different dimensions
If you use a custom embedding model whose output dimension differs from the default of 1536 (for example, 768), you may encounter the following error:
`ValueError: shapes (0,1536) and (768,) not aligned: 1536 (dim 1) != 768 (dim 0)`
To fix this, add `"embedding_model_dims": 768,` to the `config` of the `vector_store`.


@@ -0,0 +1,72 @@
## What is Config?
Config in mem0 is a dictionary that specifies the settings for your vector database. It allows you to customize the behavior and connection details of your chosen vector store.
## How to Define Config
The config is defined as a Python dictionary with one top-level key, `vector_store`, which specifies the vector database provider and its configuration. It contains two nested keys:
- `provider`: The name of the vector database (e.g., "chroma", "pgvector", "qdrant")
- `config`: A nested dictionary containing provider-specific settings
## How to Use Config
Here's a general example of how to use the config with mem0:
```python
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "your_chosen_provider",
        "config": {
            # Provider-specific settings go here
        }
    }
}

m = Memory.from_config(config)
m.add("Your text here", user_id="user", metadata={"category": "example"})
```
## Why is Config Needed?
Config is essential for:
1. Specifying which vector database to use.
2. Providing necessary connection details (e.g., host, port, credentials).
3. Customizing database-specific settings (e.g., collection name, path).
4. Ensuring proper initialization and connection to your chosen vector store.
## Master List of All Params in Config
Here's a comprehensive list of all parameters that can be used across different vector databases:
| Parameter | Description |
|-----------|-------------|
| `collection_name` | Name of the collection |
| `embedding_model_dims` | Dimensions of the embedding model |
| `client` | Custom client for the database |
| `path` | Path for the database |
| `host` | Host where the server is running |
| `port` | Port where the server is running |
| `user` | Username for database connection |
| `password` | Password for database connection |
| `dbname` | Name of the database |
| `url` | Full URL for the server |
| `api_key` | API key for the server |
| `on_disk` | Enable persistent storage |
## Customizing Config
Each vector database has its own specific configuration requirements. To customize the config for your chosen vector store:
1. Identify the vector database you want to use from [supported vector databases](./dbs).
2. Refer to the `Config` section in the respective vector database's documentation.
3. Include only the relevant parameters for your chosen database in the `config` dictionary.
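
The steps above can be sketched as a small helper that builds the dictionary and rejects parameters a provider does not list. This is a minimal sketch, not part of mem0: `PROVIDER_PARAMS` and `build_vector_store_config` are hypothetical names, and the parameter sets are copied from the `Config` tables on the Chroma, pgvector, and Qdrant pages.

```python
# Parameters per provider, copied from the Config tables in these docs.
PROVIDER_PARAMS = {
    "chroma": {"collection_name", "client", "path", "host", "port"},
    "pgvector": {
        "dbname", "collection_name", "embedding_model_dims",
        "user", "password", "host", "port",
    },
    "qdrant": {
        "collection_name", "embedding_model_dims", "client", "host",
        "port", "path", "url", "api_key", "on_disk",
    },
}


def build_vector_store_config(provider: str, **settings) -> dict:
    """Hypothetical helper: build a config dict, rejecting unlisted parameters."""
    if provider not in PROVIDER_PARAMS:
        raise ValueError(f"Unknown provider: {provider!r}")
    unexpected = sorted(set(settings) - PROVIDER_PARAMS[provider])
    if unexpected:
        raise ValueError(f"{provider} does not list these parameters: {unexpected}")
    return {"vector_store": {"provider": provider, "config": dict(settings)}}


# Only the parameters relevant to Chroma are included.
config = build_vector_store_config("chroma", collection_name="test", path="db")
```

The resulting `config` has the same shape as the dictionaries passed to `Memory.from_config` elsewhere in these docs.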
## Supported Vector Databases
For detailed information on configuring specific vector databases, please visit the [Supported Vector Databases](./dbs) section. There you'll find individual pages for each supported vector store with provider-specific usage examples and configuration details.


@@ -0,0 +1,35 @@
[Chroma](https://www.trychroma.com/) is an AI-native open-source vector database that simplifies building LLM apps by providing tools for storing, embedding, and searching embeddings with a focus on simplicity and speed.
### Usage
```python
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "test",
            "path": "db",
        }
    }
}

m = Memory.from_config(config)
m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
```
### Config
Here are the parameters available for configuring Chroma:
| Parameter | Description | Default Value |
| --- | --- | --- |
| `collection_name` | The name of the collection | `mem0` |
| `client` | Custom client for Chroma | `None` |
| `path` | Path for the Chroma database | `db` |
| `host` | The host where the Chroma server is running | `None` |
| `port` | The port where the Chroma server is running | `None` |
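
The `host` and `port` parameters suggest that Chroma can also be reached as a running server rather than through a local `path`. The following is a minimal sketch of such a config, assuming a Chroma server is listening on `localhost:8000`; both values are placeholders and are not taken from these docs.

```python
# Sketch: point mem0 at a running Chroma server instead of a local path.
# "localhost" and 8000 are assumed placeholder values; adjust to your deployment.
config = {
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "test",
            "host": "localhost",
            "port": 8000,
        },
    }
}

# With mem0 installed, the dict would be passed on as in the usage example:
# m = Memory.from_config(config)
```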


@@ -0,0 +1,39 @@
[pgvector](https://github.com/pgvector/pgvector) is an open-source vector similarity search extension for Postgres. After connecting to Postgres, run `CREATE EXTENSION IF NOT EXISTS vector;` to create the vector extension.
### Usage
```python
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "pgvector",
        "config": {
            "user": "test",
            "password": "123",
            "host": "127.0.0.1",
            "port": "5432",
        }
    }
}

m = Memory.from_config(config)
m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
```
### Config
Here are the parameters available for configuring pgvector:
| Parameter | Description | Default Value |
| --- | --- | --- |
| `dbname` | The name of the database | `postgres` |
| `collection_name` | The name of the collection | `mem0` |
| `embedding_model_dims` | Dimensions of the embedding model | `1536` |
| `user` | User name to connect to the database | `None` |
| `password` | Password to connect to the database | `None` |
| `host` | The host where the Postgres server is running | `None` |
| `port` | The port where the Postgres server is running | `None` |
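
To illustrate how the parameters in the table fit together, here is a minimal sketch of a fuller pgvector config that reads connection details from the environment instead of hardcoding them. The environment variable names (`PGVECTOR_USER` and so on) are hypothetical; the fallback values mirror the usage example and the defaults listed in the table.

```python
import os

# Hypothetical environment variable names; fallbacks mirror the usage example.
config = {
    "vector_store": {
        "provider": "pgvector",
        "config": {
            "dbname": "postgres",            # default from the table
            "collection_name": "mem0",       # default from the table
            "embedding_model_dims": 1536,    # default from the table
            "user": os.environ.get("PGVECTOR_USER", "test"),
            "password": os.environ.get("PGVECTOR_PASSWORD", "123"),
            "host": os.environ.get("PGVECTOR_HOST", "127.0.0.1"),
            "port": os.environ.get("PGVECTOR_PORT", "5432"),
        },
    }
}

# With mem0 installed, the dict would be passed on as in the usage example:
# m = Memory.from_config(config)
```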


@@ -0,0 +1,40 @@
[Qdrant](https://qdrant.tech/) is an open-source vector search engine. It is designed to work with large-scale datasets and provides a high-performance search engine for vector data.
### Usage
```python
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "test",
            "host": "localhost",
            "port": 6333,
        }
    }
}

m = Memory.from_config(config)
m.add("Likes to play cricket on weekends", user_id="alice", metadata={"category": "hobbies"})
```
### Config
Here are the parameters available for configuring Qdrant:
| Parameter | Description | Default Value |
| --- | --- | --- |
| `collection_name` | The name of the collection to store the vectors | `mem0` |
| `embedding_model_dims` | Dimensions of the embedding model | `1536` |
| `client` | Custom client for Qdrant | `None` |
| `host` | The host where the Qdrant server is running | `None` |
| `port` | The port where the Qdrant server is running | `None` |
| `path` | Path for the Qdrant database | `/tmp/qdrant` |
| `url` | Full URL for the Qdrant server | `None` |
| `api_key` | API key for the Qdrant server | `None` |
| `on_disk` | Enable persistent storage | `False` |
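
The usage example connects through `host` and `port`; the table also lists `url` and `api_key`. The following is a minimal sketch of a config for a remote Qdrant server using those two parameters. The URL and the `QDRANT_API_KEY` variable name are placeholders chosen for illustration, not values from these docs.

```python
import os

# Sketch: connect to a remote Qdrant server via `url` and `api_key`.
# The URL and the QDRANT_API_KEY variable name are assumed placeholders.
config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "test",
            "url": "https://your-qdrant-host:6333",
            "api_key": os.environ.get("QDRANT_API_KEY"),
        },
    }
}

# With mem0 installed, the dict would be passed on as in the usage example:
# m = Memory.from_config(config)
```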


@@ -0,0 +1,24 @@
---
title: Overview
---
Mem0 includes built-in support for several popular vector databases. `Memory` can use the database you provide, so you can choose the store that best fits your needs.
## Usage
To use a specific vector database, provide a configuration that selects and customizes it. If no configuration is supplied, a default configuration is applied and `Qdrant` is used as the vector database.
For a comprehensive list of available parameters for vector database configuration, please refer to [Config](./config).
To view all supported vector databases, visit [Supported Vector Databases](./dbs).
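
As an illustration of the default behavior, the sketch below spells out a Qdrant configuration that might correspond to it. The values come from the default column of the Qdrant parameter table; that the default configuration applies exactly these values is an assumption, not something this page confirms.

```python
# Assumed explicit equivalent of the default setup. Values are taken from the
# Qdrant parameter table; that these are the applied defaults is an assumption.
default_like_config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "mem0",
            "embedding_model_dims": 1536,
            "path": "/tmp/qdrant",
            "on_disk": False,
        },
    }
}

# With mem0 installed, the two styles would look like this:
# m = Memory()                                 # no config: the default applies
# m = Memory.from_config(default_like_config)  # explicit configuration
```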
## Common issues
### Using model with different dimensions
If you use a custom embedding model whose output dimension differs from the default of 1536 (for example, 768), you may encounter the following error:
`ValueError: shapes (0,1536) and (768,) not aligned: 1536 (dim 1) != 768 (dim 0)`
To fix this, add `"embedding_model_dims": 768,` to the `config` of the `vector_store`.
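
A minimal sketch of where that setting goes, using the Qdrant values from the usage examples and assuming an embedding model that produces 768-dimensional vectors:

```python
# Sketch: set the vector dimension to match a 768-dimensional embedding model.
config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "test",
            "host": "localhost",
            "port": 6333,
            # Must equal the output size of your embedding model (768 here).
            "embedding_model_dims": 768,
        },
    }
}

# With mem0 installed, the dict would be passed on as in the usage examples:
# m = Memory.from_config(config)
```

According to the provider pages, `embedding_model_dims` is listed for Qdrant and pgvector, so the same key applies when `provider` is `pgvector`.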