[docs]: Revamp embedchain docs (#799)
135
docs/components/embedding-models.mdx
Normal file
@@ -0,0 +1,135 @@
---
title: 🧩 Embedding models
---

## Overview

Embedchain supports several embedding models from the following providers:

<CardGroup cols={4}>
  <Card title="OpenAI" href="#openai"></Card>
  <Card title="GPT4All" href="#gpt4all"></Card>
  <Card title="Hugging Face" href="#hugging-face"></Card>
  <Card title="Vertex AI" href="#vertex-ai"></Card>
</CardGroup>

## OpenAI

To use the OpenAI embedding function, set the `OPENAI_API_KEY` environment variable. You can obtain an API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).

Once you have the key, use it like this:

<CodeGroup>

```python main.py
import os
from embedchain import App

os.environ['OPENAI_API_KEY'] = 'xxx'  # replace 'xxx' with your own key

# load embedding model configuration from openai.yaml file
app = App.from_config(yaml_path="openai.yaml")

app.add("https://en.wikipedia.org/wiki/OpenAI")
app.query("What is OpenAI?")
```

```yaml openai.yaml
embedder:
  provider: openai
  config:
    model: 'text-embedding-ada-002'
```

</CodeGroup>
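
Hardcoding the key as in `main.py` is convenient for a quick test, but the key is often exported in the shell instead. The sketch below assumes a hypothetical helper, `require_env` (an illustrative name, not part of Embedchain), that fails early with a clear message when the variable is missing:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; it is not part of Embedchain.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating the app")
    return value
```

Calling `require_env("OPENAI_API_KEY")` before `App.from_config(yaml_path="openai.yaml")` surfaces a missing key at startup rather than on the first embedding request.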

## GPT4All

GPT4All supports generating high-quality embeddings for text documents of arbitrary length using a CPU-optimized, contrastively trained Sentence Transformer.

<CodeGroup>

```python main.py
from embedchain import App

# load embedding model configuration from gpt4all.yaml file
app = App.from_config(yaml_path="gpt4all.yaml")
```

```yaml gpt4all.yaml
llm:
  provider: gpt4all
  model: 'orca-mini-3b.ggmlv3.q4_0.bin'
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 1
    stream: false

embedder:
  provider: gpt4all
  config:
    model: 'all-MiniLM-L6-v2'
```

</CodeGroup>

## Hugging Face

Hugging Face supports generating embeddings for text documents of arbitrary length using the Sentence Transformers library. The example below shows how to generate embeddings with Hugging Face:

<CodeGroup>

```python main.py
from embedchain import App

# load embedding model configuration from huggingface.yaml file
app = App.from_config(yaml_path="huggingface.yaml")
```

```yaml huggingface.yaml
llm:
  provider: huggingface
  model: 'google/flan-t5-xxl'
  config:
    temperature: 0.5
    max_tokens: 1000
    top_p: 0.5
    stream: false

embedder:
  provider: huggingface
  config:
    model: 'sentence-transformers/all-mpnet-base-v2'
```

</CodeGroup>

## Vertex AI

Embedchain supports Google's Vertex AI embedding models through a simple interface. Set the `model` under the embedder `config` in the YAML file and it works out of the box.

<CodeGroup>

```python main.py
from embedchain import App

# load embedding model configuration from vertexai.yaml file
app = App.from_config(yaml_path="vertexai.yaml")
```

```yaml vertexai.yaml
llm:
  provider: vertexai
  model: 'chat-bison'
  config:
    temperature: 0.5
    top_p: 0.5

embedder:
  provider: vertexai
  config:
    model: 'textembedding-gecko'
```

</CodeGroup>
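
The `embedder` blocks on this page all share one shape and differ only in `provider` and `model`. As a rough sketch, the hypothetical helper below renders that block as YAML text. `render_embedder_yaml` and `EXAMPLE_MODELS` are illustrative names, not Embedchain APIs, and the provider/model pairs are only the ones shown in the examples on this page:

```python
from typing import Optional

# Provider/model pairs taken from the YAML examples on this page.
EXAMPLE_MODELS = {
    "openai": "text-embedding-ada-002",
    "gpt4all": "all-MiniLM-L6-v2",
    "huggingface": "sentence-transformers/all-mpnet-base-v2",
    "vertexai": "textembedding-gecko",
}


def render_embedder_yaml(provider: str, model: Optional[str] = None) -> str:
    """Render the `embedder` section for one provider as YAML text.

    Hypothetical helper for illustration; it is not part of Embedchain.
    """
    if provider not in EXAMPLE_MODELS:
        raise ValueError(f"unknown provider: {provider!r}")
    # Fall back to the model used in this page's example for that provider.
    model = model or EXAMPLE_MODELS[provider]
    return (
        "embedder:\n"
        f"  provider: {provider}\n"
        "  config:\n"
        f"    model: '{model}'\n"
    )
```

Writing the returned string to a file such as `vertexai.yaml` and passing that path to `App.from_config(yaml_path=...)` mirrors the examples above. Any `llm` section still has to be added by hand.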