[docs]: Revamp embedchain docs (#799)

2023-10-13 15:38:15 -07:00
parent a86d7f52e9
commit 4a8c50f886
68 changed files with 1175 additions and 673 deletions
--- a/docs/components/embedding-models.mdx
+++ b/docs/components/embedding-models.mdx
@@ -0,0 +1,135 @@
+---
+title: 🧩 Embedding models
+---
+
+## Overview
+
+Embedchain supports several embedding models from the following providers:
+
+<CardGroup cols={4}>
+  <Card title="OpenAI" href="#openai"></Card>
+  <Card title="GPT4All" href="#gpt4all"></Card>
+  <Card title="Hugging Face" href="#hugging-face"></Card>
+  <Card title="Vertex AI" href="#vertex-ai"></Card>
+</CardGroup>
+
+## OpenAI
+
+To use the OpenAI embedding function, set the `OPENAI_API_KEY` environment variable. You can obtain an API key from the [OpenAI Platform](https://platform.openai.com/account/api-keys).
+
+Once you have obtained the key, you can use it like this:
+
+<CodeGroup>
+
+```python main.py
+import os
+from embedchain import App
+
+os.environ['OPENAI_API_KEY'] = 'xxx'
+
+# load embedding model configuration from openai.yaml file
+app = App.from_config(yaml_path="openai.yaml")
+
+app.add("https://en.wikipedia.org/wiki/OpenAI")
+app.query("What is OpenAI?")
+```
+
+```yaml openai.yaml
+embedder:
+  provider: openai
+  config:
+    model: 'text-embedding-ada-002'
+```
+
+</CodeGroup>
+
+## GPT4All
+
+GPT4All generates high-quality embeddings for text documents of arbitrary length using a CPU-optimized, contrastively trained Sentence Transformer.
+
+<CodeGroup>
+
+```python main.py
+from embedchain import App
+
+# load embedding model configuration from gpt4all.yaml file
+app = App.from_config(yaml_path="gpt4all.yaml")
+```
+
+```yaml gpt4all.yaml
+llm:
+  provider: gpt4all
+  model: 'orca-mini-3b.ggmlv3.q4_0.bin'
+  config:
+    temperature: 0.5
+    max_tokens: 1000
+    top_p: 1
+    stream: false
+
+embedder:
+  provider: gpt4all
+  config:
+    model: 'all-MiniLM-L6-v2'
+```
+
+</CodeGroup>
+
+## Hugging Face
+
+Hugging Face supports generating embeddings for text documents of arbitrary length using the Sentence Transformers library. The example below shows how to generate embeddings with Hugging Face:
+
+<CodeGroup>
+
+```python main.py
+from embedchain import App
+
+# load embedding model configuration from huggingface.yaml file
+app = App.from_config(yaml_path="huggingface.yaml")
+```
+
+```yaml huggingface.yaml
+llm:
+  provider: huggingface
+  model: 'google/flan-t5-xxl'
+  config:
+    temperature: 0.5
+    max_tokens: 1000
+    top_p: 0.5
+    stream: false
+
+embedder:
+  provider: huggingface
+  config:
+    model: 'sentence-transformers/all-mpnet-base-v2'
+```
+
+</CodeGroup>
+
+## Vertex AI
+
+Embedchain supports Google's Vertex AI embedding models through a simple interface. Set the `model` key under `embedder.config` in the config YAML and it works out of the box.
+
+<CodeGroup>
+
+```python main.py
+from embedchain import App
+
+# load embedding model configuration from vertexai.yaml file
+app = App.from_config(yaml_path="vertexai.yaml")
+```
+
+```yaml vertexai.yaml
+llm:
+  provider: vertexai
+  model: 'chat-bison'
+  config:
+    temperature: 0.5
+    top_p: 0.5
+
+embedder:
+  provider: vertexai
+  config:
+    model: 'textembedding-gecko'
+```
+
+</CodeGroup>
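The four `embedder` blocks added in this file share one shape: a `provider` key plus a `config.model` key. As a minimal sketch of that shared shape, the standard-library Python below renders the block for a given provider and writes it to `<provider>.yaml`, matching the file names used in the `main.py` examples. The helper names `embedder_yaml` and `write_config` are hypothetical and not part of Embedchain; the provider and model strings are copied from the YAML in this diff.

```python
from pathlib import Path
import tempfile

# Provider/model pairs copied from the embedder blocks in this diff.
EMBEDDERS = {
    "openai": "text-embedding-ada-002",
    "gpt4all": "all-MiniLM-L6-v2",
    "huggingface": "sentence-transformers/all-mpnet-base-v2",
    "vertexai": "textembedding-gecko",
}


def embedder_yaml(provider: str) -> str:
    """Render the `embedder` block for one provider (hypothetical helper)."""
    model = EMBEDDERS[provider]  # raises KeyError for an unknown provider
    return (
        "embedder:\n"
        f"  provider: {provider}\n"
        "  config:\n"
        f"    model: '{model}'\n"
    )


def write_config(provider: str, directory: Path) -> Path:
    """Write `<provider>.yaml` into `directory` and return its path."""
    path = directory / f"{provider}.yaml"
    path.write_text(embedder_yaml(provider), encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_path = write_config("huggingface", Path(tmp))
        print(config_path.read_text(encoding="utf-8"))
```

The resulting path could then be handed to `App.from_config(yaml_path=...)` as in the examples above. This sketch writes only the `embedder` section and omits the `llm` section shown in the GPT4All, Hugging Face, and Vertex AI files.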