feat: enhance Azure AI Search Integration with Binary Quantization, Pre/Post Filter Options, and user agent header (#2354)
@@ -1,12 +1,14 @@
-[Azure AI Search](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search/) (formerly known as "Azure Cognitive Search") provides secure information retrieval at scale over user-owned content in traditional and generative AI search applications.
+# Azure AI Search
 
-### Usage
+[Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search/) (formerly known as "Azure Cognitive Search") provides secure information retrieval at scale over user-owned content in traditional and generative AI search applications.
 
+## Usage
+
 ```python
 import os
 from mem0 import Memory
 
-os.environ["OPENAI_API_KEY"] = "sk-xx" #this key is used for embedding purpose
+os.environ["OPENAI_API_KEY"] = "sk-xx" # This key is used for embedding purpose
 
 config = {
     "vector_store": {
@@ -15,8 +17,8 @@ config = {
             "service_name": "ai-search-test",
             "api_key": "*****",
             "collection_name": "mem0",
-            "embedding_model_dims": 1536 ,
-            "use_compression": False
+            "embedding_model_dims": 1536,
+            "compression_type": "none"
         }
     }
 }
@@ -25,20 +27,61 @@ m = Memory.from_config(config)
 messages = [
     {"role": "user", "content": "I'm planning to watch a movie tonight. Any recommendations?"},
     {"role": "assistant", "content": "How about thriller movies? They can be quite engaging."},
-    {"role": "user", "content": "I’m not a big fan of thriller movies but I love sci-fi movies."},
+    {"role": "user", "content": "I'm not a big fan of thriller movies but I love sci-fi movies."},
     {"role": "assistant", "content": "Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future."}
 ]
 m.add(messages, user_id="alice", metadata={"category": "movies"})
 ```
 
-### Config
+## Advanced Usage
 
-Let's see the available parameters for the `qdrant` config:
-service_name (str): Azure Cognitive Search service name.
-| Parameter | Description | Default Value |
-| --- | --- | --- |
-| `service_name` | Azure AI Search service name | `None` |
-| `api_key` | API key of the Azure AI Search service | `None` |
-| `collection_name` | The name of the collection/index to store the vectors, it will be created automatically if not exist | `mem0` |
-| `embedding_model_dims` | Dimensions of the embedding model | `1536` |
-| `use_compression` | Use scalar quantization vector compression | False |
+```python
+# Search with specific filter mode
+result = m.search(
+    "sci-fi movies",
+    filters={"user_id": "alice"},
+    limit=5,
+    vector_filter_mode="preFilter" # Apply filters before vector search
+)
+
+# Using binary compression for large vector collections
+config = {
+    "vector_store": {
+        "provider": "azure_ai_search",
+        "config": {
+            "service_name": "ai-search-test",
+            "api_key": "*****",
+            "collection_name": "mem0",
+            "embedding_model_dims": 1536,
+            "compression_type": "binary",
+            "use_float16": True # Use half precision for storage efficiency
+        }
+    }
+}
+```
+
+## Configuration Parameters
+
+| Parameter | Description | Default Value | Options |
+| --- | --- | --- | --- |
+| `service_name` | Azure AI Search service name | Required | - |
+| `api_key` | API key of the Azure AI Search service | Required | - |
+| `collection_name` | The name of the collection/index to store vectors | `mem0` | Any valid index name |
+| `embedding_model_dims` | Dimensions of the embedding model | `1536` | Any integer value |
+| `compression_type` | Type of vector compression to use | `none` | `none`, `scalar`, `binary` |
+| `use_float16` | Store vectors in half precision (Edm.Half) | `False` | `True`, `False` |
+
+## Notes on Configuration Options
+
+- **compression_type**:
+  - `none`: No compression, uses full vector precision
+  - `scalar`: Scalar quantization with a reasonable balance of speed and accuracy
+  - `binary`: Binary quantization for maximum compression with some accuracy trade-off
+
+- **vector_filter_mode**:
+  - `preFilter`: Applies filters before vector search (returns up to `limit` matching results when they exist, but can be slower on large indexes)
+  - `postFilter`: Applies filters after vector search (typically faster, but may return fewer results than requested)
+
+- **use_float16**: Using half precision (float16) reduces storage requirements but may slightly impact accuracy. Useful for very large vector collections.
+
+- **Filterable Fields**: The implementation automatically extracts `user_id`, `run_id`, and `agent_id` fields from payloads for filtering.
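
The two `vector_filter_mode` values differ in when the filter runs relative to the nearest-neighbour ranking, and a toy in-memory model can show the effect on result counts. The sketch below is only an illustration under stated assumptions: it does not call mem0 or Azure AI Search, and `DOCS`, `search`, `top_k`, and the sample vectors are hypothetical names and data invented for this example.

```python
import math

# Toy corpus of (id, payload, embedding). All values are made up for illustration.
DOCS = [
    ("m1", {"user_id": "alice"}, [1.0, 0.0]),
    ("m2", {"user_id": "bob"}, [0.9, 0.1]),
    ("m3", {"user_id": "bob"}, [0.8, 0.2]),
    ("m4", {"user_id": "alice"}, [0.0, 1.0]),
]


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def matches(payload, filters):
    """True when every filter key/value pair is present in the payload."""
    return all(payload.get(key) == value for key, value in filters.items())


def top_k(docs, query, k):
    """Return the k documents most similar to the query (exhaustive scan)."""
    return sorted(docs, key=lambda d: cosine(d[2], query), reverse=True)[:k]


def search(query, filters, k, mode):
    """Hypothetical stand-in for a filtered vector query."""
    if mode == "preFilter":
        # Filter first, then rank only the surviving documents.
        candidates = [d for d in DOCS if matches(d[1], filters)]
        hits = top_k(candidates, query, k)
    elif mode == "postFilter":
        # Rank everything first, then drop hits that fail the filter.
        hits = [d for d in top_k(DOCS, query, k) if matches(d[1], filters)]
    else:
        raise ValueError(f"unknown filter mode: {mode}")
    return [d[0] for d in hits]


query = [1.0, 0.0]
pre = search(query, {"user_id": "alice"}, k=2, mode="preFilter")
post = search(query, {"user_id": "alice"}, k=2, mode="postFilter")
print(pre)   # ['m1', 'm4']
print(post)  # ['m1']
```

In this toy data, filtering first still yields two results for `alice`, while filtering after the top-2 ranking leaves only one, because the second-ranked document belongs to another user.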