[Docs] Revamp documentation (#1010)

This commit is contained in:
Deshraj Yadav
2023-12-15 05:14:17 +05:30
committed by GitHub
parent b7a44ef472
commit d54cdc5b00
81 changed files with 1223 additions and 378 deletions

View File

@@ -98,8 +98,8 @@ Comprehensive guides and API documentation are available to help you get the mos
- [Getting Started](https://docs.embedchain.ai/get-started/quickstart)
- [Introduction](https://docs.embedchain.ai/get-started/introduction#what-is-embedchain)
- [Examples](https://docs.embedchain.ai/get-started/examples)
- [Supported data types](https://docs.embedchain.ai/data-sources/)
- [Examples](https://docs.embedchain.ai/examples)
- [Supported data types](https://docs.embedchain.ai/components/data-sources/overview)
## 🔗 Join the Community

View File

@@ -1,11 +1,11 @@
<CardGroup cols={3}>
<Card title="Talk to founders" icon="calendar" href="https://cal.com/taranjeetio/ec">
Schedule a call
</Card>
<Card title="Slack" icon="slack" href="https://join.slack.com/t/embedchain/shared_invite/zt-22uwz3c46-Zg7cIh5rOBteT_xe1jwLDw" color="#4A154B">
Join our Slack community
</Card>
<Card title="Discord" icon="discord" href="https://discord.gg/6PzXDgEjG5" color="#7289DA">
Join our Discord community
</Card>
<Card title="Schedule a call" icon="calendar" href="https://cal.com/taranjeetio/ec">
Schedule a call with the Embedchain founder
</Card>
</CardGroup>

View File

@@ -1,14 +1,14 @@
---
title: '⚙️ Custom configurations'
title: 'Custom configurations'
---
Embedchain is made to work out of the box. However, for advanced users we're also offering configuration options. All of these configuration options are optional and have sane defaults.
Embedchain offers several configuration options for your LLM, vector database, and embedding model. All of these configuration options are optional and have sane defaults.
You can configure different components of your app (`llm`, `embedding model`, or `vector database`) through a simple YAML configuration that Embedchain offers. Here is a generic full-stack example of the YAML config:
<Tip>
Embedchain applications are configurable using YAML file, JSON file or by directly passing the config dictionary.
Embedchain applications are configurable using a YAML file, a JSON file, or by directly passing the config dictionary. Check out the [docs here](/api-reference/pipeline/overview#usage) to learn how to use the other formats.
</Tip>
<CodeGroup>

View File

@@ -0,0 +1,44 @@
---
title: '📊 add'
---
The `add()` method loads data from various data sources into a RAG pipeline. You can find the signature below:
### Parameters
<ParamField path="source" type="str">
The data to embed; can be a URL, a local file path, or raw content, depending on the data type. You can find the full list of supported data sources [here](/components/data-sources/overview).
</ParamField>
<ParamField path="data_type" type="str" optional>
Type of the data source. It is detected automatically, but the user can override it to force a specific data type.
</ParamField>
<ParamField path="metadata" type="dict" optional>
Any metadata that you want to store with the data source. Metadata is useful for filtering on top of semantic search, yielding faster and better results.
</ParamField>
## Usage
### Load data from webpage
```python Code example
from embedchain import Pipeline as App
app = App()
app.add("https://www.forbes.com/profile/elon-musk")
# Inserting batches in chromadb: 100%|███████████████| 1/1 [00:00<00:00, 1.19it/s]
# Successfully saved https://www.forbes.com/profile/elon-musk (DataType.WEB_PAGE). New chunks count: 4
```
### Load data from sitemap
```python Code example
from embedchain import Pipeline as App
app = App()
app.add("https://python.langchain.com/sitemap.xml", data_type="sitemap")
# Loading pages: 100%|█████████████| 1108/1108 [00:47<00:00, 23.17it/s]
# Inserting batches in chromadb: 100%|█████████| 111/111 [04:41<00:00, 2.54s/it]
# Successfully saved https://python.langchain.com/sitemap.xml (DataType.SITEMAP). New chunks count: 11024
```
You can find the complete list of supported data sources [here](/components/data-sources/overview).
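You can also combine the parameters above, forcing a `data_type` and attaching `metadata` for later filtering. A sketch (the metadata key is hypothetical; this assumes `embedchain` is installed and an LLM key is configured):

```python
def add_with_metadata():
    # Requires `pip install embedchain`; sketch only, not executed here
    from embedchain import Pipeline as App

    app = App()
    app.add(
        "https://www.forbes.com/profile/elon-musk",
        data_type="web_page",               # skip automatic detection
        metadata={"category": "profiles"},  # hypothetical key, usable for filtering later
    )
    return app
```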

View File

@@ -0,0 +1,97 @@
---
title: '💬 chat'
---
The `chat()` method allows you to chat over your data sources using a user-friendly chat API. You can find the signature below:
### Parameters
<ParamField path="input_query" type="str">
Question to ask
</ParamField>
<ParamField path="config" type="BaseLlmConfig" optional>
Configure different LLM settings such as prompt, temperature, number_documents, etc.
</ParamField>
<ParamField path="dry_run" type="bool" optional>
If `True`, tests the prompt structure without actually running LLM inference. Defaults to `False`
</ParamField>
<ParamField path="where" type="dict" optional>
A dictionary of key-value pairs to filter the chunks from the vector database. Defaults to `None`
</ParamField>
<ParamField path="citations" type="bool" optional>
Return citations along with the LLM answer. Defaults to `False`
</ParamField>
### Returns
<ResponseField name="answer" type="str | tuple">
If `citations=False`, returns a string answer to the question asked. <br />
If `citations=True`, returns a tuple of the answer and the citations.
</ResponseField>
## Usage
### With citations
If you want to get the answer to a question along with its citations, use the following code snippet:
```python With Citations
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Get relevant answer for your query
answer, sources = app.chat("What is the net worth of Elon?", citations=True)
print(answer)
# Answer: The net worth of Elon Musk is $221.9 billion.
print(sources)
# [
# (
# 'Elon Musk PROFILEElon MuskCEO, Tesla$247.1B$2.3B (0.96%)Real Time Net Worthas of 12/7/23 ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# ),
# (
# '74% of the company, which is now called X.Wealth HistoryHOVER TO REVEAL NET WORTH BY YEARForbes ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# ),
# (
# 'founded in 2002, is worth nearly $150 billion after a $750 million tender offer in June 2023 ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# )
# ]
```
<Note>
When `citations=True`, the returned `sources` are a list of tuples, where each tuple has three elements (in the following order):
1. source chunk
2. link of the source document
3. document id (used for bookkeeping purposes)
</Note>
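Since `sources` is a plain list of tuples, you can unpack it directly; a small sketch with hypothetical values:

```python
# Hypothetical `sources` list mirroring the (chunk, url, doc_id) structure above
sources = [
    ("Elon Musk PROFILE...", "https://www.forbes.com/profile/elon-musk", "4651b266"),
    ("74% of the company...", "https://www.forbes.com/profile/elon-musk", "4651b266"),
]

# Collect the unique source links to display alongside the answer
links = sorted({url for _, url, _ in sources})
print(links)
# ['https://www.forbes.com/profile/elon-musk']
```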
### Without citations
If you just want the answer without citations, use the following example:
```python Without Citations
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Chat on your data using `.chat()`
answer = app.chat("What is the net worth of Elon?")
print(answer)
# Answer: The net worth of Elon Musk is $221.9 billion.
```
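Unlike `query()`, `chat()` maintains the history of the ongoing session, so follow-up questions can refer back to earlier ones. A sketch (assumes `embedchain` is installed and an LLM key is configured):

```python
def chat_session():
    # Requires `pip install embedchain`; sketch only, not executed here
    from embedchain import Pipeline as App

    app = App()
    app.add("https://www.forbes.com/profile/elon-musk")
    app.chat("What is the net worth of Elon?")
    # The session history lets the follow-up resolve "he" to Elon Musk
    return app.chat("Which companies does he lead?")
```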

View File

@@ -0,0 +1,31 @@
---
title: 🚀 deploy
---
Using the `deploy()` method, Embedchain allows developers to easily launch their LLM-powered applications on the [Embedchain Platform](https://app.embedchain.ai). This platform facilitates seamless access to your data's context via a free and user-friendly REST API. Once your pipeline is deployed, you can update your data sources at any time.
The `deploy()` method not only deploys your pipeline but also efficiently manages LLMs, vector databases, embedding models, and data syncing, enabling you to focus on querying, chatting, or searching without the hassle of infrastructure management.
## Usage
```python
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Deploy your pipeline to Embedchain Platform
app.deploy()
# 🔑 Enter your Embedchain API key. You can find the API key at https://app.embedchain.ai/settings/keys/
# ec-xxxxxx
# 🛠️ Creating pipeline on the platform...
# 🎉🎉🎉 Pipeline created successfully! View your pipeline: https://app.embedchain.ai/pipelines/xxxxx
# 🛠️ Adding data to your pipeline...
# ✅ Data of type: web_page, value: https://www.forbes.com/profile/elon-musk added successfully.
```

View File

@@ -0,0 +1,130 @@
---
title: "Pipeline"
---
Create a RAG pipeline object on Embedchain. This is the main entry point for a developer to interact with the Embedchain APIs. A pipeline configures the LLM, vector database, embedding model, and retrieval strategy of your choice.
### Attributes
<ParamField path="local_id" type="str">
Pipeline ID
</ParamField>
<ParamField path="name" type="str" optional>
Name of the pipeline
</ParamField>
<ParamField path="config" type="BaseConfig">
Configuration of the pipeline
</ParamField>
<ParamField path="llm" type="BaseLlm">
Configured LLM for the RAG pipeline
</ParamField>
<ParamField path="db" type="BaseVectorDB">
Configured vector database for the RAG pipeline
</ParamField>
<ParamField path="embedding_model" type="BaseEmbedder">
Configured embedding model for the RAG pipeline
</ParamField>
<ParamField path="chunker" type="ChunkerConfig">
Chunker configuration
</ParamField>
<ParamField path="client" type="Client" optional>
Client object (used to deploy a pipeline to Embedchain platform)
</ParamField>
<ParamField path="logger" type="logging.Logger">
Logger object
</ParamField>
## Usage
You can create an Embedchain pipeline instance using any of the following methods:
### Default setting
```python Code Example
from embedchain import Pipeline as App
app = App()
```
### Python Dict
```python Code Example
from embedchain import Pipeline as App
config_dict = {
'llm': {
'provider': 'gpt4all',
'config': {
'model': 'orca-mini-3b-gguf2-q4_0.gguf',
'temperature': 0.5,
'max_tokens': 1000,
'top_p': 1,
'stream': False
}
},
'embedder': {
'provider': 'gpt4all'
}
}
# load llm configuration from config dict
app = App.from_config(config=config_dict)
```
### YAML Config
<CodeGroup>
```python main.py
from embedchain import Pipeline as App
# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
```
```yaml config.yaml
llm:
provider: gpt4all
config:
model: 'orca-mini-3b-gguf2-q4_0.gguf'
temperature: 0.5
max_tokens: 1000
top_p: 1
stream: false
embedder:
provider: gpt4all
```
</CodeGroup>
### JSON Config
<CodeGroup>
```python main.py
from embedchain import Pipeline as App
# load llm configuration from config.json file
app = App.from_config(config_path="config.json")
```
```json config.json
{
"llm": {
"provider": "gpt4all",
"config": {
"model": "orca-mini-3b-gguf2-q4_0.gguf",
"temperature": 0.5,
"max_tokens": 1000,
"top_p": 1,
"stream": false
}
},
"embedder": {
"provider": "gpt4all"
}
}
```
</CodeGroup>

View File

@@ -0,0 +1,97 @@
---
title: '❓ query'
---
The `.query()` method lets developers ask questions and receive relevant answers through a user-friendly query API. The function signature is given below:
### Parameters
<ParamField path="input_query" type="str">
Question to ask
</ParamField>
<ParamField path="config" type="BaseLlmConfig" optional>
Configure different LLM settings such as prompt, temperature, number_documents, etc.
</ParamField>
<ParamField path="dry_run" type="bool" optional>
If `True`, tests the prompt structure without actually running LLM inference. Defaults to `False`
</ParamField>
<ParamField path="where" type="dict" optional>
A dictionary of key-value pairs to filter the chunks from the vector database. Defaults to `None`
</ParamField>
<ParamField path="citations" type="bool" optional>
Return citations along with the LLM answer. Defaults to `False`
</ParamField>
### Returns
<ResponseField name="answer" type="str | tuple">
If `citations=False`, returns a string answer to the question asked. <br />
If `citations=True`, returns a tuple of the answer and the citations.
</ResponseField>
## Usage
### With citations
If you want to get the answer to a question along with its citations, use the following code snippet:
```python With Citations
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Get relevant answer for your query
answer, sources = app.query("What is the net worth of Elon?", citations=True)
print(answer)
# Answer: The net worth of Elon Musk is $221.9 billion.
print(sources)
# [
# (
# 'Elon Musk PROFILEElon MuskCEO, Tesla$247.1B$2.3B (0.96%)Real Time Net Worthas of 12/7/23 ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# ),
# (
# '74% of the company, which is now called X.Wealth HistoryHOVER TO REVEAL NET WORTH BY YEARForbes ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# ),
# (
# 'founded in 2002, is worth nearly $150 billion after a $750 million tender offer in June 2023 ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# )
# ]
```
<Note>
When `citations=True`, the returned `sources` are a list of tuples, where each tuple has three elements (in the following order):
1. source chunk
2. link of the source document
3. document id (used for bookkeeping purposes)
</Note>
### Without citations
If you just want the answer without citations, use the following example:
```python Without Citations
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Get relevant answer for your query
answer = app.query("What is the net worth of Elon?")
print(answer)
# Answer: The net worth of Elon Musk is $221.9 billion.
```
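The `where` parameter accepts a plain dictionary that is matched against chunk metadata stored at `add()` time. A sketch with a hypothetical metadata key (assumes `embedchain` is installed and an LLM key is configured):

```python
def query_with_filter():
    # Requires `pip install embedchain`; sketch only, not executed here
    from embedchain import Pipeline as App

    app = App()
    app.add(
        "https://www.forbes.com/profile/elon-musk",
        metadata={"category": "profiles"},  # hypothetical key
    )
    # Only chunks whose metadata matches the filter are retrieved
    return app.query("What is the net worth of Elon?", where={"category": "profiles"})
```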

View File

@@ -0,0 +1,17 @@
---
title: 🔄 reset
---
The `reset()` method wipes all data from your RAG application so you can start from scratch.
## Usage
```python
from embedchain import Pipeline as App
app = App()
app.add("https://www.forbes.com/profile/elon-musk")
# Reset the app
app.reset()
```

View File

@@ -0,0 +1,51 @@
---
title: '🔍 search'
---
`.search()` performs a semantic search across your data sources and returns the most relevant context for a given query. Refer to the function signature below:
### Parameters
<ParamField path="query" type="str">
Query to search for
</ParamField>
<ParamField path="num_documents" type="int" optional>
Number of relevant documents to fetch. Defaults to `3`
</ParamField>
### Returns
<ResponseField name="answer" type="dict">
Returns a list of dictionaries, each containing a relevant chunk and its source information.
</ResponseField>
## Usage
Refer to the following example on how to use the search API:
```python Code example
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Get relevant context using semantic search
context = app.search("What is the net worth of Elon?", num_documents=2)
print(context)
# Context:
# [
# {
# 'context': 'Elon Musk PROFILEElon MuskCEO, Tesla$221.9BReal Time Net Worthas of 10/29/23Reflects change since 5 pm ET of prior trading day. 1 in the world todayPhoto by Martin Schoeller for ForbesAbout Elon MuskElon Musk cofounded six companies, including electric car maker Tesla, rocket producer SpaceX and tunneling startup Boring Company.He owns about 21% of Tesla between stock and options, but has pledged more than half his shares as collateral for personal loans of up to $3.5 billion.SpaceX, founded in',
# 'source': 'https://www.forbes.com/profile/elon-musk',
# 'document_id': 'some_document_id'
# },
# {
# 'context': 'company, which is now called X.Wealth HistoryHOVER TO REVEAL NET WORTH BY YEARForbes Lists 1Forbes 400 (2023)The Richest Person In Every State (2023) 2Billionaires (2023) 1Innovative Leaders (2019) 25Powerful People (2018) 12Richest In Tech (2017)Global Game Changers (2016)More ListsPersonal StatsAge52Source of WealthTesla, SpaceX, Self MadeSelf-Made Score8Philanthropy Score1ResidenceAustin, TexasCitizenshipUnited StatesMarital StatusSingleChildren11EducationBachelor of Arts/Science, University',
# 'source': 'https://www.forbes.com/profile/elon-musk',
# 'document_id': 'some_document_id'
# }
# ]
```
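Because the result is a plain list of dictionaries, it is easy to post-process; for example, to join the retrieved chunks into a single prompt-ready string (using a hypothetical result):

```python
# Hypothetical result mirroring the structure returned by app.search()
context = [
    {
        "context": "Elon Musk PROFILE...",
        "source": "https://www.forbes.com/profile/elon-musk",
        "document_id": "some_document_id",
    },
    {
        "context": "company, which is now called X...",
        "source": "https://www.forbes.com/profile/elon-musk",
        "document_id": "some_document_id",
    },
]

# Concatenate the chunks, separated by blank lines, for use in a prompt
combined = "\n\n".join(item["context"] for item in context)
print(len(context))
# 2
```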

View File

@@ -0,0 +1,54 @@
---
title: 'AI Assistant'
---
The `AIAssistant` class, an alternative to the OpenAI Assistant API, is designed for those who prefer using large language models (LLMs) other than those provided by OpenAI. It facilitates the creation of AI Assistants with several key benefits:
- **Visibility into Citations**: It offers transparent access to the sources and citations used by the AI, enhancing the understanding and trustworthiness of its responses.
- **Debugging Capabilities**: Users have the ability to delve into and debug the AI's processes, allowing for a deeper understanding and fine-tuning of its performance.
- **Customizable Prompts**: The class provides the flexibility to modify and tailor prompts according to specific needs, enabling more precise and relevant interactions.
- **Chain of Thought Integration**: It supports the incorporation of a 'chain of thought' approach, which helps in breaking down complex queries into simpler, sequential steps, thereby improving the clarity and accuracy of responses.
It is ideal for those who value customization, transparency, and detailed control over their AI Assistant's functionalities.
### Arguments
<ParamField path="name" type="string" optional>
Name for your AI assistant
</ParamField>
<ParamField path="instructions" type="string" optional>
How the Assistant and model should behave or respond
</ParamField>
<ParamField path="assistant_id" type="string" optional>
Load an existing AI Assistant. If provided, the other arguments are not required.
</ParamField>
<ParamField path="thread_id" type="string" optional>
Existing thread id, if any
</ParamField>
<ParamField path="yaml_path" type="str" optional>
Embedchain pipeline config yaml path to use. This will define the configuration of the AI Assistant (such as configuring the LLM, vector database, and embedding model)
</ParamField>
<ParamField path="data_sources" type="list" default="[]">
Add data sources to your assistant. You can add in the following format: `[{"source": "https://example.com", "data_type": "web_page"}]`
</ParamField>
<ParamField path="collect_metrics" type="boolean" default="True">
Anonymous telemetry (doesn't collect any user information or user's files). Used to improve the Embedchain package utilization. Default is `True`.
</ParamField>
## Usage
For detailed guidance on creating your own AI Assistant, click the link below. It provides step-by-step instructions to help you through the process:
<Card title="Guide to Creating Your AI Assistant" icon="link" href="/examples/opensource-assistant">
Learn how to build a customized AI Assistant using the `AIAssistant` class.
</Card>

View File

@@ -0,0 +1,45 @@
---
title: 'OpenAI Assistant'
---
### Arguments
<ParamField path="name" type="string">
Name for your AI assistant
</ParamField>
<ParamField path="instructions" type="string">
How the Assistant and model should behave or respond
</ParamField>
<ParamField path="assistant_id" type="string">
Load an existing OpenAI Assistant. If provided, the other arguments are not required.
</ParamField>
<ParamField path="thread_id" type="string">
Existing OpenAI thread id, if any
</ParamField>
<ParamField path="model" type="str" default="gpt-4-1106-preview">
OpenAI model to use
</ParamField>
<ParamField path="tools" type="list">
OpenAI tools to use. Defaults to `[{"type": "retrieval"}]`
</ParamField>
<ParamField path="data_sources" type="list" default="[]">
Add data sources to your assistant. You can add in the following format: `[{"source": "https://example.com", "data_type": "web_page"}]`
</ParamField>
<ParamField path="telemetry" type="boolean" default="True">
Anonymous telemetry (doesn't collect any user information or user's files). Used to improve the Embedchain package utilization. Default is `True`.
</ParamField>
## Usage
For detailed guidance on creating your own OpenAI Assistant, click the link below. It provides step-by-step instructions to help you through the process:
<Card title="Guide to Creating Your OpenAI Assistant" icon="link" href="/examples/openai-assistant">
Learn how to build an OpenAI Assistant using the `OpenAIAssistant` class.
</Card>

View File

@@ -0,0 +1,36 @@
---
title: Overview
---
Embedchain comes with built-in support for various data sources. We handle the complexity of loading unstructured data from these data sources, allowing you to easily customize your app through a user-friendly interface.
<CardGroup cols={4}>
<Card title="📰 PDF file" href="/components/data-sources/pdf-file"></Card>
<Card title="📊 CSV file" href="/components/data-sources/csv"></Card>
<Card title="📃 JSON file" href="/components/data-sources/json"></Card>
<Card title="📺 Youtube" href="/components/data-sources/youtube-video"></Card>
<Card title="📝 Text" href="/components/data-sources/text"></Card>
<Card title="📚 Documentation website" href="/components/data-sources/docs-site"></Card>
<Card title="📄 DOCX file" href="/components/data-sources/docx"></Card>
<Card title="📝 MDX file" href="/components/data-sources/mdx"></Card>
<Card title="📓 Notion" href="/components/data-sources/notion"></Card>
<Card title="❓💬 Q&A pair" href="/components/data-sources/qna"></Card>
<Card title="🗺️ Sitemap" href="/components/data-sources/sitemap"></Card>
<Card title="🌐 Web page" href="/components/data-sources/web-page"></Card>
<Card title="🧾 XML file" href="/components/data-sources/xml"></Card>
<Card title="🙌 OpenAPI" href="/components/data-sources/openapi"></Card>
<Card title="📬 Gmail" href="/components/data-sources/gmail"></Card>
<Card title="🐘 Postgres" href="/components/data-sources/postgres"></Card>
<Card title="🐬 MySQL" href="/components/data-sources/mysql"></Card>
<Card title="🤖 Slack" href="/components/data-sources/slack"></Card>
<Card title="🗨️ Discourse" href="/components/data-sources/discourse"></Card>
<Card title="💬 Discord" href="/components/data-sources/discord"></Card>
<Card title="📝 Github" href="/components/data-sources/github"></Card>
<Card title="⚙️ Custom" href="/components/data-sources/custom"></Card>
<Card title="📝 Substack" href="/components/data-sources/substack"></Card>
<Card title="🐝 Beehiiv" href="/components/data-sources/beehiiv"></Card>
</CardGroup>
<br />
<Snippet file="missing-data-source-tip.mdx" />

View File

@@ -22,17 +22,6 @@ make lint format
5. **Create a pull request**: When you are ready to contribute your changes, submit a pull request to the EmbedChain repository. Provide a clear and descriptive title for your pull request, along with a detailed description of the changes you have made.
# Tech Stack
embedchain is built on the following stack:
- [Langchain](https://github.com/hwchase17/langchain) as an LLM framework to load, chunk and index data
- [OpenAI's Ada embedding model](https://platform.openai.com/docs/guides/embeddings) to create embeddings
- [OpenAI's ChatGPT API](https://platform.openai.com/docs/guides/gpt/chat-completions-api) as LLM to get answers given the context
- [Chroma](https://github.com/chroma-core/chroma) as the vector database to store embeddings
- [gpt4all](https://github.com/nomic-ai/gpt4all) as an open source LLM
- [sentence-transformers](https://huggingface.co/sentence-transformers) as open source embedding model
## Team
### Authors

View File

View File

@@ -1,36 +0,0 @@
---
title: Overview
---
Embedchain comes with built-in support for various data sources. We handle the complexity of loading unstructured data from these data sources, allowing you to easily customize your app through a user-friendly interface.
<CardGroup cols={4}>
<Card title="📊 CSV" href="/data-sources/csv"></Card>
<Card title="📃 JSON" href="/data-sources/json"></Card>
<Card title="📚 docs site" href="/data-sources/docs-site"></Card>
<Card title="📄 docx" href="/data-sources/docx"></Card>
<Card title="📝 mdx" href="/data-sources/mdx"></Card>
<Card title="📓 Notion" href="/data-sources/notion"></Card>
<Card title="📰 PDF" href="/data-sources/pdf-file"></Card>
<Card title="❓💬 q&a pair" href="/data-sources/qna"></Card>
<Card title="🗺️ sitemap" href="/data-sources/sitemap"></Card>
<Card title="📝 text" href="/data-sources/text"></Card>
<Card title="🌐 web page" href="/data-sources/web-page"></Card>
<Card title="🧾 xml" href="/data-sources/xml"></Card>
<Card title="🙌 OpenAPI" href="/data-sources/openapi"></Card>
<Card title="📺 Youtube" href="/data-sources/youtube-video"></Card>
<Card title="📬 Gmail" href="/data-sources/gmail"></Card>
<Card title="🐘 Postgres" href="/data-sources/postgres"></Card>
<Card title="🐬 MySQL" href="/data-sources/mysql"></Card>
<Card title="🤖 Slack" href="/data-sources/slack"></Card>
<Card title="🗨️ Discourse" href="/data-sources/discourse"></Card>
<Card title="💬 Discord" href="/data-sources/discord"></Card>
<Card title="📝 Github" href="/data-sources/github"></Card>
<Card title="⚙️ Custom" href="/data-sources/custom"></Card>
<Card title="📝 Substack" href="/data-sources/substack"></Card>
<Card title="🐝 Beehiiv" href="/data-sources/beehiiv"></Card>
</CardGroup>
<br/ >
<Snippet file="missing-data-source-tip.mdx" />

View File

@@ -1,5 +1,5 @@
---
title: '🌐 Full Stack'
title: 'Full Stack'
---
The Full Stack app example can be found [here](https://github.com/embedchain/embedchain/tree/main/examples/full_stack).

View File

@@ -1,5 +1,5 @@
---
title: '🤖 OpenAI Assistant'
title: 'OpenAI Assistant'
---
<img src="https://blogs.swarthmore.edu/its/wp-content/uploads/2022/05/openai.jpg" align="center" width="500" alt="OpenAI Logo"/>
@@ -38,40 +38,6 @@ assistant = OpenAIAssistant(assistant_id="asst_xxx")
assistant = OpenAIAssistant(assistant_id="asst_xxx", thread_id="thread_xxx")
```
### Arguments
<ResponseField name="name" type="string">
Name for your AI assistant
</ResponseField>
<ResponseField name="instructions" type="string">
how the Assistant and model should behave or respond
</ResponseField>
<ResponseField name="assistant_id" type="string">
Load existing OpenAI Assistant. If you pass this, you don't have to pass other arguments.
</ResponseField>
<ResponseField name="thread_id" type="string">
Existing OpenAI thread id if exists
</ResponseField>
<ResponseField name="model" type="str" default="gpt-4-1106-preview">
OpenAI model to use
</ResponseField>
<ResponseField name="tools" type="list">
OpenAI tools to use. Default set to `[{"type": "retrieval"}]`
</ResponseField>
<ResponseField name="data_sources" type="list" default="[]">
Add data sources to your assistant. You can add in the following format: `[{"source": "https://example.com", "data_type": "web_page"}]`
</ResponseField>
<ResponseField name="telemetry" type="boolean" default="True">
Anonymous telemetry (doesn't collect any user information or user's files). Used to improve the Embedchain package utilization. Default is `True`.
</ResponseField>
## Step 2: Add data to thread
You can add any custom data source that is supported by Embedchain. Alternatively, you can pass a file path on your local system directly, and Embedchain propagates it to the OpenAI Assistant.
@@ -92,4 +58,3 @@ You can try it out yourself using the following Google Colab notebook:
<a href="https://colab.research.google.com/drive/1BKlXZYSl6AFRgiHZ5XIzXrXC_24kDYHQ?usp=sharing">
<img src="https://camo.githubusercontent.com/84f0493939e0c4de4e6dbe113251b4bfb5353e57134ffd9fcab6b8714514d4d1/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667" alt="Open in Colab" />
</a>

View File

@@ -0,0 +1,51 @@
---
title: 'Open-Source AI Assistant'
---
Embedchain also supports creating Open-Source AI Assistants (similar to the [OpenAI Assistants API](https://platform.openai.com/docs/assistants/overview)), which let you build AI assistants within your own applications using any LLM (OpenAI or otherwise). An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries.
At a high level, the Open-Source AI Assistants API has the following flow:
1. Create an AI Assistant by picking a model
2. Create a Thread when a user starts a conversation
3. Add Messages to the Thread as the user asks questions
4. Run the Assistant on the Thread to trigger responses. This automatically calls the relevant tools.
Creating an Open-Source AI Assistant is a simple three-step process.
## Step 1: Instantiate AI Assistant
```python Initialize
from embedchain.store.assistants import AIAssistant
assistant = AIAssistant(
name="My Assistant",
data_sources=[{"source": "https://www.youtube.com/watch?v=U9mJuUkhUzk"}])
```
If you want to use the existing assistant, you can do something like this:
```python Initialize
# Load an assistant and create a new thread
assistant = AIAssistant(assistant_id="asst_xxx")
# Load a specific thread for an assistant
assistant = AIAssistant(assistant_id="asst_xxx", thread_id="thread_xxx")
```
## Step 2: Add data to thread
You can add any custom data source that is supported by Embedchain. Alternatively, you can pass a file path on your local system directly, and Embedchain will load it into the assistant.
```python Add data
assistant.add("/path/to/file.pdf")
assistant.add("https://www.youtube.com/watch?v=U9mJuUkhUzk")
assistant.add("https://openai.com/blog/new-models-and-developer-products-announced-at-devday")
```
## Step 3: Chat with your AI Assistant
```python Chat
assistant.chat("How much OpenAI credits were offered to attendees during OpenAI DevDay?")
# Response: 'Every attendee of OpenAI DevDay 2023 was offered $500 in OpenAI credits.'
```

docs/examples/showcase.mdx (new file, 115 lines)
View File

@@ -0,0 +1,115 @@
---
title: '🎪 Community showcase'
---
The Embedchain community has been super active in building demos on top of Embedchain. On this page, we showcase the apps, blogs, videos, and tutorials created by the community. ❤️
## Apps
### Open Source
- [My GSoC23 bot- Streamlit chat](https://github.com/lucifertrj/EmbedChain_GSoC23_BOT) by Tarun Jain
- [Discord Bot for LLM chat](https://github.com/Reidond/discord_bots_playground/tree/c8b0c36541e4b393782ee506804c4b6962426dd6/python/chat-channel-bot) by Reidond
- [EmbedChain-Streamlit-Docker App](https://github.com/amjadraza/embedchain-streamlit-app) by amjadraza
- [Harry Potter Philosopher's Stone Bot](https://github.com/vinayak-kempawad/Harry_Potter_Philosphers_Stone_Bot/) by Vinayak Kempawad, ([LinkedIn post](https://www.linkedin.com/feed/update/urn:li:activity:7080907532155686912/))
- [LLM bot trained on own messages](https://github.com/Harin329/harinBot) by Hao Wu
### Closed Source
- [Taobot.io](https://taobot.io) - chatbot & knowledgebase hybrid by [cachho](https://github.com/cachho)
- [Create Instant ChatBot 🤖 using embedchain](https://databutton.com/v/h3e680h9) by Avra, ([Tweet](https://twitter.com/Avra_b/status/1674704745154641920/))
- [JOBO 🤖 — The AI-driven sidekick to craft your resume](https://try-jobo.com/) by Enrico Willemse, ([LinkedIn Post](https://www.linkedin.com/posts/enrico-willemse_jobai-gptfun-embedchain-activity-7090340080879374336-ueLB/))
- [Explore Your Knowledge Base: Interactive chats over various forms of documents](https://chatdocs.dkedar.com/) by Kedar Dabhadkar, ([LinkedIn Post](https://www.linkedin.com/posts/dkedar7_machinelearning-llmops-activity-7092524836639424513-2O3L/))
- [Chatbot trained on 1000+ videos of Esther Hicks, the co-author behind the famous book The Secret](https://ask-abraham.thoughtseed.repl.co) by Mohan Kumar
## Templates
### Replit
- [Embedchain Chat Bot](https://replit.com/@taranjeet1/Embedchain-Chat-Bot) by taranjeetio
- [Embedchain Memory Chat Bot Template](https://replit.com/@taranjeetio/Embedchain-Memory-Chat-Bot-Template) by taranjeetio
- [Chatbot app to demonstrate question-answering using retrieved information](https://replit.com/@AllisonMorrell/EmbedChainlitPublic) by Allison Morrell, ([LinkedIn Post](https://www.linkedin.com/posts/allison-morrell-2889275a_retrievalbot-screenshots-activity-7080339991754649600-wihZ/))
## Posts
### Blogs
- [Customer Service LINE Bot](https://www.evanlin.com/langchain-embedchain/) by Evan Lin
- [Chatbot in Under 5 mins using Embedchain](https://medium.com/@ayush.wattal/chatbot-in-under-5-mins-using-embedchain-a4f161fcf9c5) by Ayush Wattal
- [Understanding what the LLM framework embedchain does](https://zenn.dev/hijikix/articles/4bc8d60156a436) by Daisuke Hashimoto
- [In bed with GPT and Node.js](https://dev.to/worldlinetech/in-bed-with-gpt-and-nodejs-4kh2) by Raphaël Semeteys, ([LinkedIn Post](https://www.linkedin.com/posts/raphaelsemeteys_in-bed-with-gpt-and-nodejs-activity-7088113552326029313-nn87/))
- [Using Embedchain — A powerful LangChain Python wrapper to build Chat Bots even faster!⚡](https://medium.com/@avra42/using-embedchain-a-powerful-langchain-python-wrapper-to-build-chat-bots-even-faster-35c12994a360) by Avra, ([Tweet](https://twitter.com/Avra_b/status/1686767751560310784/))
- [What is the Embedchain library?](https://jahaniwww.com/%da%a9%d8%aa%d8%a7%d8%a8%d8%ae%d8%a7%d9%86%d9%87-embedchain/) by Ali Jahani, ([LinkedIn Post](https://www.linkedin.com/posts/ajahani_aepaetaeqaexaggahyaeu-aetaexaesabraeaaeqaepaeu-activity-7097605202135904256-ppU-/))
- [LangChain is Nice, But Have You Tried EmbedChain ?](https://medium.com/thoughts-on-machine-learning/langchain-is-nice-but-have-you-tried-embedchain-215a34421cde) by FS Ndzomga, ([Tweet](https://twitter.com/ndzfs/status/1695583640372035951/))
- [Simplest Method to Build a Custom Chatbot with GPT-3.5 (via Embedchain)](https://www.ainewsletter.today/p/simplest-method-to-build-a-custom) by Arjun, ([Tweet](https://twitter.com/aiguy_arjun/status/1696393808467091758/))
### LinkedIn
- [What is embedchain](https://www.linkedin.com/posts/activity-7079393104423698432-wRyi/) by Rithesh Sreenivasan
- [Building a chatbot with EmbedChain](https://www.linkedin.com/posts/activity-7078434598984060928-Zdso/) by Lior Sinclair
- [Making chatbot without vs with embedchain](https://www.linkedin.com/posts/kalyanksnlp_llms-chatbots-langchain-activity-7077453416221863936-7N1L/) by Kalyan KS
- [EmbedChain - very intuitive, first you index your data and then query!](https://www.linkedin.com/posts/shubhamsaboo_embedchain-a-framework-to-easily-create-activity-7079535460699557888-ad1X/) by Shubham Saboo
- [EmbedChain - Harnessing power of LLM](https://www.linkedin.com/posts/uditsaini_chatbotrevolution-llmpoweredbots-embedchainframework-activity-7077520356827181056-FjTK/) by Udit S.
- [AI assistant for ABBYY Vantage](https://www.linkedin.com/posts/maximevermeir_llm-github-abbyy-activity-7081658972071424000-fXfZ/) by Maxime V.
- [About embedchain](https://www.linkedin.com/feed/update/urn:li:activity:7080984218914189312/) by Morris Lee
- [How to use Embedchain](https://www.linkedin.com/posts/nehaabansal_github-embedchainembedchain-framework-activity-7085830340136595456-kbW5/) by Neha Bansal
- [Youtube/Webpage summary for Energy Study](https://www.linkedin.com/posts/bar%C4%B1%C5%9F-sanl%C4%B1-34b82715_enerji-python-activity-7082735341563977730-Js0U/) by Barış Sanlı, ([Tweet](https://twitter.com/barissanli/status/1676968784979193857/))
- [Demo: How to use Embedchain? (Contains Collab Notebook link)](https://www.linkedin.com/posts/liorsinclair_embedchain-is-getting-a-lot-of-traction-because-activity-7103044695995424768-RckT/) by Lior Sinclair
### Twitter
- [What is embedchain](https://twitter.com/AlphaSignalAI/status/1672668574450847745) by Lior
- [Building a chatbot with Embedchain](https://twitter.com/Saboo_Shubham_/status/1673537044419686401) by Shubham Saboo
- [Chatbot docker image behind an API with yaml configs with Embedchain](https://twitter.com/tricalt/status/1678411430192730113/) by Vasilije
- [Build AI powered PDF chatbot with just five lines of Python code with Embedchain!](https://twitter.com/Saboo_Shubham_/status/1676627104866156544/) by Shubham Saboo
- [Chatbot against a youtube video using embedchain](https://twitter.com/smaameri/status/1675201443043704834/) by Sami Maameri
- [Highlights of EmbedChain](https://twitter.com/carl_AIwarts/status/1673542204328120321/) by carl_AIwarts
- [Build Llama-2 chatbot in less than 5 minutes](https://twitter.com/Saboo_Shubham_/status/1682168956918833152/) by Shubham Saboo
- [All cool features of embedchain](https://twitter.com/DhravyaShah/status/1683497882438217728/) by Dhravya Shah, ([LinkedIn Post](https://www.linkedin.com/posts/dhravyashah_what-if-i-tell-you-that-you-can-make-an-ai-activity-7089459599287726080-ZIYm/))
- [Read paid Medium articles for Free using embedchain](https://twitter.com/kumarkaushal_/status/1688952961622585344) by Kaushal Kumar
## Videos
- [Embedchain in one shot](https://www.youtube.com/watch?v=vIhDh7H73Ww&t=82s) by AI with Tarun
- [embedChain Create LLM powered bots over any dataset Python Demo Tesla Neurallink Chatbot Example](https://www.youtube.com/watch?v=bJqAn22a6Gc) by Rithesh Sreenivasan
- [Embedchain - NEW 🔥 Langchain BABY to build LLM Bots](https://www.youtube.com/watch?v=qj_GNQ06I8o) by 1littlecoder
- [EmbedChain -- NEW!: Build LLM-Powered Bots with Any Dataset](https://www.youtube.com/watch?v=XmaBezzGHu4) by DataInsightEdge
- [Chat With Your PDFs in less than 10 lines of code! EMBEDCHAIN tutorial](https://www.youtube.com/watch?v=1ugkcsAcw44) by Phani Reddy
- [How To Create A Custom Knowledge AI Powered Bot | Install + How To Use](https://www.youtube.com/watch?v=VfCrIiAst-c) by The Ai Solopreneur
- [Build Custom Chatbot in 6 min with this Framework [Beginner Friendly]](https://www.youtube.com/watch?v=-8HxOpaFySM) by Maya Akim
- [embedchain-streamlit-app](https://www.youtube.com/watch?v=3-9GVd-3v74) by Amjad Raza
- [🤖CHAT with ANY ONLINE RESOURCES using EMBEDCHAIN - a LangChain wrapper, in few lines of code !](https://www.youtube.com/watch?v=Mp7zJe4TIdM) by Avra
- [Building resource-driven LLM-powered bots with Embedchain](https://www.youtube.com/watch?v=IVfcAgxTO4I) by BugBytes
- [embedchain-streamlit-demo](https://www.youtube.com/watch?v=yJAWB13FhYQ) by Amjad Raza
- [Embedchain - create your own AI chatbots using open source models](https://www.youtube.com/shorts/O3rJWKwSrWE) by Dhravya Shah
- [AI ChatBot in 5 lines Python Code](https://www.youtube.com/watch?v=zjWvLJLksv8) by Data Engineering
- [Interview with Karl Marx](https://www.youtube.com/watch?v=5Y4Tscwj1xk) by Alexander Ray Williams
- [Vlog where we try to build a bot based on our content on the internet](https://www.youtube.com/watch?v=I2w8CWM3bx4) by DV, ([Tweet](https://twitter.com/dvcoolster/status/1688387017544261632))
- [CHAT with ANY ONLINE RESOURCES using EMBEDCHAIN|STREAMLIT with MEMORY |All OPENSOURCE](https://www.youtube.com/watch?v=TqQIHWoWTDQ&pp=ygUKZW1iZWRjaGFpbg%3D%3D) by DataInsightEdge
- [Build POWERFUL LLM Bots EASILY with Your Own Data - Embedchain - Langchain 2.0? (Tutorial)](https://www.youtube.com/watch?v=jE24Y_GasE8) by WorldofAI, ([Tweet](https://twitter.com/intheworldofai/status/1696229166922780737))
- [Embedchain: An AI knowledge base assistant for customizing enterprise private data, which can be connected to discord, whatsapp, slack, tele and other terminals (with gradio to build a request interface) in Chinese](https://www.youtube.com/watch?v=5RZzCJRk-d0) by AIGC LINK
- [Embedchain Introduction](https://www.youtube.com/watch?v=Jet9zAqyggI) by Fahd Mirza
## Mentions
### Github repos
- [Awesome-LLM](https://github.com/Hannibal046/Awesome-LLM)
- [awesome-chatgpt-api](https://github.com/reorx/awesome-chatgpt-api)
- [awesome-langchain](https://github.com/kyrolabs/awesome-langchain)
- [Awesome-Prompt-Engineering](https://github.com/promptslab/Awesome-Prompt-Engineering)
- [awesome-chatgpt](https://github.com/eon01/awesome-chatgpt)
- [Awesome-LLMOps](https://github.com/tensorchord/Awesome-LLMOps)
- [awesome-generative-ai](https://github.com/filipecalegario/awesome-generative-ai)
- [awesome-gpt](https://github.com/formulahendry/awesome-gpt)
- [awesome-ChatGPT-repositories](https://github.com/taishi-i/awesome-ChatGPT-repositories)
- [awesome-gpt-prompt-engineering](https://github.com/snwfdhmp/awesome-gpt-prompt-engineering)
- [awesome-chatgpt](https://github.com/awesome-chatgpt/awesome-chatgpt)
- [awesome-llm-and-aigc](https://github.com/sjinzh/awesome-llm-and-aigc)
- [awesome-compbio-chatgpt](https://github.com/csbl-br/awesome-compbio-chatgpt)
- [Awesome-LLM4Tool](https://github.com/OpenGVLab/Awesome-LLM4Tool)
## Meetups
- [Dash and ChatGPT: Future of AI-enabled apps 30/08/23](https://go.plotly.com/dash-chatgpt)
- [Pie & AI: Bangalore - Build end-to-end LLM app using Embedchain 01/09/23](https://www.eventbrite.com/e/pie-ai-bangalore-build-end-to-end-llm-app-using-embedchain-tickets-698045722547)

View File

@@ -0,0 +1,53 @@
---
title: '🚀 Deployment'
description: 'Deploy your embedchain RAG application to production'
---
After successfully setting up and testing your Embedchain application locally, the next step is to deploy it to a hosting service to make it accessible to a wider audience. This section offers various options for hosting your app on the [Embedchain platform](https://app.embedchain.ai) or through [self-hosting options](#self-hosting).
## Option 1: Deploy on Embedchain Platform
Embedchain enables developers to deploy their LLM-powered apps in production using the [Embedchain platform](https://app.embedchain.ai). The platform offers free access to context on your data through its REST API, and you can update your data sources anytime after deployment.
See the example below on how to deploy your app (for free):
```python
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Deploy your pipeline to Embedchain Platform
app.deploy()
# 🔑 Enter your Embedchain API key. You can find the API key at https://app.embedchain.ai/settings/keys/
# ec-xxxxxx
# 🛠️ Creating pipeline on the platform...
# 🎉🎉🎉 Pipeline created successfully! View your pipeline: https://app.embedchain.ai/pipelines/xxxxx
# 🛠️ Adding data to your pipeline...
# ✅ Data of type: web_page, value: https://www.forbes.com/profile/elon-musk added successfully.
```
## Option 2: Self-hosting
You can also deploy Embedchain as a self-hosted service using the dockerized REST API service that we provide. Please follow the [guide here](/examples/rest-api) on how to use it. Here are some tutorials on how to deploy a containerized application to platforms like AWS, GCP, and Azure:
- [AWS EKS](https://docs.aws.amazon.com/eks/latest/userguide/sample-deployment.html)
- [AWS ECS](https://docs.aws.amazon.com/codecatalyst/latest/userguide/deploy-tut-ecs.html)
- [Google GKE](https://cloud.google.com/kubernetes-engine/docs/tutorials/hello-app)
- [Azure App Service](https://learn.microsoft.com/en-us/training/modules/deploy-run-container-app-service/)
- [Fly.io](https://fly.io/docs/languages-and-frameworks/python/)
- [Render.com](https://render.com/docs/deploy-an-image)
- [Huggingface Spaces](https://huggingface.co/new-space)
## Seeking help?
If you run into issues with deployment, please feel free to reach out to us via any of the following methods:
<Snippet file="get-help.mdx" />

View File

@@ -115,7 +115,7 @@ embedder:
</Accordion>
</AccordionGroup>
#### Need more help?
#### Still have questions?
If docs aren't sufficient, please feel free to reach out to us using one of the following methods:
<Snippet file="get-help.mdx" />

View File

View File

@@ -1,221 +1,66 @@
---
title: 📚 Introduction
description: '📝 Embedchain is a Data Platform for LLMs - load, index, retrieve, and sync any unstructured data'
---
## What is Embedchain?
Embedchain is a production-ready, open-source RAG framework - load, index, retrieve, and sync any unstructured data.
Embedchain simplifies data handling by automatically processing unstructured data, breaking it into chunks, generating embeddings, and storing it in a vector database.
Through various APIs, you can obtain contextual information for queries, find answers to specific questions, and engage in chat conversations using your data.
Embedchain streamlines the creation of RAG applications, offering a seamless process for managing various types of unstructured data. It efficiently segments data into manageable chunks, generates relevant embeddings, and stores them in a vector database for optimized retrieval. With a suite of diverse APIs, it enables users to extract contextual information, find precise answers, or engage in interactive chat conversations, all tailored to their own data.
## 🔍 Search
Embedchain lets you get the most relevant context by doing semantic search over your data sources for a provided query. See the example below:

```python
from embedchain import Pipeline as App

# Initialize app
app = App()

# Add data source
app.add("https://www.forbes.com/profile/elon-musk")

# Get relevant context using semantic search
context = app.search("What is the net worth of Elon?", num_documents=2)
print(context)
# Context:
# [
#     {
#         'context': 'Elon Musk PROFILEElon MuskCEO, Tesla$221.9BReal Time Net Worthas of 10/29/23Reflects change since 5 pm ET of prior trading day. 1 in the world todayPhoto by Martin Schoeller for ForbesAbout Elon MuskElon Musk cofounded six companies, including electric car maker Tesla, rocket producer SpaceX and tunneling startup Boring Company.He owns about 21% of Tesla between stock and options, but has pledged more than half his shares as collateral for personal loans of up to $3.5 billion.SpaceX, founded in',
#         'source': 'https://www.forbes.com/profile/elon-musk',
#         'document_id': 'some_document_id'
#     },
#     {
#         'context': 'company, which is now called X.Wealth HistoryHOVER TO REVEAL NET WORTH BY YEARForbes Lists 1Forbes 400 (2023)The Richest Person In Every State (2023) 2Billionaires (2023) 1Innovative Leaders (2019) 25Powerful People (2018) 12Richest In Tech (2017)Global Game Changers (2016)More ListsPersonal StatsAge52Source of WealthTesla, SpaceX, Self MadeSelf-Made Score8Philanthropy Score1ResidenceAustin, TexasCitizenshipUnited StatesMarital StatusSingleChildren11EducationBachelor of Arts/Science, University',
#         'source': 'https://www.forbes.com/profile/elon-musk',
#         'document_id': 'some_document_id'
#     }
# ]
```

## Who is Embedchain for?

Embedchain is designed for a diverse range of users, from AI professionals like Data Scientists and Machine Learning Engineers to those just starting their AI journey, including college students, independent developers, and hobbyists. Essentially, it's for anyone with an interest in AI, regardless of their expertise level.

Our APIs are user-friendly yet adaptable, enabling beginners to effortlessly create LLM-powered applications with as few as 4 lines of code. At the same time, we offer extensive customization options for every aspect of the RAG pipeline. This includes the choice of LLMs, vector databases, loaders and chunkers, retrieval strategies, re-ranking, and more.

Our platform's clear and well-structured abstraction layers ensure that users can tailor the system to meet their specific needs, whether they're crafting a simple project or a complex, nuanced AI application.
## Why Use Embedchain?

Developing a robust and efficient RAG (Retrieval-Augmented Generation) pipeline for production use presents numerous complexities, such as:

- Integrating and indexing data from diverse sources.
- Determining optimal data chunking methods for each source.
- Synchronizing the RAG pipeline with regularly updated data sources.
- Implementing efficient data storage in a vector store.
- Deciding whether to include metadata with document chunks.
- Handling permission management.
- Configuring Large Language Models (LLMs).
- Selecting effective prompts.
- Choosing suitable retrieval strategies.
- Assessing the performance of your RAG pipeline.
- Deploying the pipeline into a production environment, among other concerns.

Embedchain is designed to simplify these tasks, offering conventional yet customizable APIs. Our solution handles the intricate processes of loading, chunking, indexing, and retrieving data. This enables you to concentrate on aspects that are crucial for your specific use case or business objectives, ensuring a smoother and more focused development process.

## How it works

Embedchain makes it easy to add data to your RAG pipeline with these straightforward steps:

1. **Automatic Data Handling**: It automatically recognizes the data type and loads it.
2. **Efficient Data Processing**: The system creates embeddings for key parts of your data.
3. **Flexible Data Storage**: You get to choose where to store this processed data in a vector database.

When a user asks a question, whether for chatting, searching, or querying, Embedchain simplifies the response process:

1. **Query Processing**: It turns the user's question into embeddings.
2. **Document Retrieval**: These embeddings are then used to find related documents in the database.
3. **Answer Generation**: The related documents are used by the LLM to craft a precise answer.

With Embedchain, you don't have to worry about the complexities of building a RAG pipeline. It offers an easy-to-use interface for developing applications with any kind of data.

## ❓Query

Embedchain empowers developers to ask questions and receive relevant answers through a user-friendly query API. Refer to the following example to learn how to utilize the query API:

<CodeGroup>

```python With Citations
from embedchain import Pipeline as App

# Initialize app
app = App()

# Add data source
app.add("https://www.forbes.com/profile/elon-musk")

# Get relevant answer for your query
answer, sources = app.query("What is the net worth of Elon?", citations=True)
print(answer)
# Answer: The net worth of Elon Musk is $221.9 billion.

print(sources)
# [
#     (
#         'Elon Musk PROFILEElon MuskCEO, Tesla$247.1B$2.3B (0.96%)Real Time Net Worthas of 12/7/23 ...',
#         'https://www.forbes.com/profile/elon-musk',
#         '4651b266--4aa78839fe97'
#     ),
#     (
#         '74% of the company, which is now called X.Wealth HistoryHOVER TO REVEAL NET WORTH BY YEARForbes ...',
#         'https://www.forbes.com/profile/elon-musk',
#         '4651b266--4aa78839fe97'
#     ),
#     (
#         'founded in 2002, is worth nearly $150 billion after a $750 million tender offer in June 2023 ...',
#         'https://www.forbes.com/profile/elon-musk',
#         '4651b266--4aa78839fe97'
#     )
# ]
```

```python Without Citations
from embedchain import Pipeline as App

# Initialize app
app = App()

# Add data source
app.add("https://www.forbes.com/profile/elon-musk")

# Get relevant answer for your query
answer = app.query("What is the net worth of Elon?")
print(answer)
# Answer: The net worth of Elon Musk is $221.9 billion.
```

</CodeGroup>

When `citations=True`, note that the returned `sources` are a list of tuples where each tuple has three elements (in the following order):

1. source chunk
2. link of the source document
3. document id (used for bookkeeping purposes)

## Getting started

Check out our [quickstart guide](/get-started/quickstart) to start your first RAG application.

## Support

Feel free to reach out to us if you have ideas, feedback or questions that we can help out with.

<Snippet file="get-help.mdx" />
## Contribute
## 💬 Chat
Embedchain allows easy chatting over your data sources using a user-friendly chat API. Check out the example below to understand how to use the chat API:
<CodeGroup>
```python With Citations
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Get relevant answer for your query
answer, sources = app.chat("What is the net worth of Elon?", citations=True)
print(answer)
# Answer: The net worth of Elon Musk is $221.9 billion.
print(sources)
# [
# (
# 'Elon Musk PROFILEElon MuskCEO, Tesla$247.1B$2.3B (0.96%)Real Time Net Worthas of 12/7/23 ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# ),
# (
# '74% of the company, which is now called X.Wealth HistoryHOVER TO REVEAL NET WORTH BY YEARForbes ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# ),
# (
# 'founded in 2002, is worth nearly $150 billion after a $750 million tender offer in June 2023 ...',
# 'https://www.forbes.com/profile/elon-musk',
# '4651b266--4aa78839fe97'
# )
# ]
```
```python Without Citations
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Chat on your data using `.chat()`
answer = app.chat("What is the net worth of Elon?")
print(answer)
# Answer: The net worth of Elon Musk is $221.9 billion.
```
</CodeGroup>
Similar to the `query()` function, when `citations=True`, the returned `sources` are a list of tuples where each tuple has three elements (in the following order):

1. source chunk
2. link of the source document
3. document id (used for bookkeeping purposes)
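For illustration, the returned tuples can be unpacked like this. The `sources` list below uses placeholder values in the same `(chunk, link, doc_id)` shape rather than output from a live app:

```python
# Placeholder sources in the (chunk, link, doc_id) shape that chat()/query()
# return when citations=True; a real call would produce these itself.
sources = [
    ("Elon Musk cofounded six companies ...",
     "https://www.forbes.com/profile/elon-musk",
     "4651b266--4aa78839fe97"),
    ("SpaceX, founded in 2002 ...",
     "https://www.forbes.com/profile/elon-musk",
     "4651b266--4aa78839fe97"),
]

# Unpack each citation tuple in order: chunk, link, document id
for chunk, link, doc_id in sources:
    print(f"{doc_id}: {link} -> {chunk[:40]}")
```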
## 🚀 Deploy
Embedchain enables developers to deploy their LLM-powered apps in production using the Embedchain platform. The platform offers free access to context on your data through its REST API, and you can update your data sources anytime after deployment.
See the example below on how to use the deploy API:
```python
from embedchain import Pipeline as App
# Initialize app
app = App()
# Add data source
app.add("https://www.forbes.com/profile/elon-musk")
# Deploy your pipeline to Embedchain Platform
app.deploy()
# 🔑 Enter your Embedchain API key. You can find the API key at https://app.embedchain.ai/settings/keys/
# ec-xxxxxx
# 🛠️ Creating pipeline on the platform...
# 🎉🎉🎉 Pipeline created successfully! View your pipeline: https://app.embedchain.ai/pipelines/xxxxx
# 🛠️ Adding data to your pipeline...
# ✅ Data of type: web_page, value: https://www.forbes.com/profile/elon-musk added successfully.
```
## 🛠️ How it works
Embedchain abstracts out the following steps from you to easily create LLM-powered apps:
1. Detect the data type and load data
2. Create meaningful chunks
3. Create embeddings for each chunk
4. Store chunks in a vector database
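The chunking step above can be sketched as a simple sliding window over the raw text. This is only an illustration; the sizes below are arbitrary, not Embedchain's defaults:

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character chunks (illustrative sizes only)."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        # Each chunk repeats the last `overlap` characters of the previous one,
        # so context at chunk boundaries isn't lost.
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

chunks = chunk_text("word " * 100, chunk_size=100, overlap=20)
print(len(chunks))  # 7
```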
When a user asks a query, the following process happens to find the answer:
1. Create an embedding for the query
2. Find similar documents for the query from the vector database
3. Pass the similar documents as context to LLM to get the final answer
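The three answer-finding steps can be sketched with a toy bag-of-words "embedding" and cosine similarity. This is a deliberately simplified stand-in: a real pipeline uses a learned embedding model and a vector database, and hands the retrieved context to an LLM:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words term-frequency vector (not a real model)
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

documents = [
    "Elon Musk is the CEO of Tesla and SpaceX",
    "Paris is the capital of France",
]

# Step 1: create an embedding for the query
query_vec = embed("Who is the CEO of Tesla?")

# Step 2: find the most similar stored document
best = max(documents, key=lambda d: cosine(query_vec, embed(d)))
print(best)  # the Tesla document ranks highest

# Step 3 (not shown): pass `best` as context to the LLM for the final answer
```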
The process of loading the dataset and querying involves multiple steps, each with its own nuances:
- How should I chunk the data? What is a meaningful chunk size?
- How should I create embeddings for each chunk? Which embedding model should I use?
- How should I store the chunks in a vector database? Which vector database should I use?
- Should I store metadata along with the embeddings?
- How should I find similar documents for a query? Which ranking model should I use?
Embedchain takes care of all these nuances and provides a simple interface to create apps on any data.
## [🚀 Get started](https://docs.embedchain.ai/get-started/quickstart)
- [GitHub](https://github.com/embedchain/embedchain)
- [Contribution docs](/contribution/dev)

View File

@@ -1,20 +1,14 @@
---
title: 'Quickstart'
description: '💡 Start building ChatGPT-like apps in a minute on your own data'
---
Embedchain is a Data Platform for LLMs - load, index, retrieve, and sync any unstructured data. Using embedchain, you can easily create LLM-powered apps over any data.
Install the embedchain Python package:
```bash
pip install embedchain
```
<Tip>
Embedchain now supports OpenAI's latest `gpt-4-turbo` model. Check out the [FAQs](/get-started/faq#how-to-use-gpt-4-turbo-model-released-on-openai-devday).
</Tip>
Creating an app involves 3 steps:
<Steps>
@@ -59,6 +53,7 @@ Creating an app involves 3 steps:
app.query("What is the net worth of Elon Musk today?")
# Answer: The net worth of Elon Musk today is $258.7 billion.
```
<hr />
<Accordion title="Want to chat with your app?" icon="face-thinking">
Embedchain provides a wide range of features to interact with your app. You can chat with your app, ask questions, search through your data, and much more.
```python
@@ -88,9 +83,3 @@ Creating an app involves 3 steps:
</Accordion>
</Step>
</Steps>
Putting it together, you can run your first app using the following Google Colab. Make sure to set the `OPENAI_API_KEY` 🔑 environment variable in the code.
<a href="https://colab.research.google.com/drive/17ON1LPonnXAtLaZEebnOktstB_1cJJmh?usp=sharing">
<img src="https://camo.githubusercontent.com/84f0493939e0c4de4e6dbe113251b4bfb5353e57134ffd9fcab6b8714514d4d1/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667" alt="Open in Colab" />
</a>

View File

@@ -48,4 +48,4 @@ app.query("How many companies did Elon found?")
* Now the entire log for this will be visible in langsmith.
<img src="/images/langsmith.png"/>
<img src="/images/langsmith.png"/>

View File

@@ -4,7 +4,7 @@
"logo": {
"dark": "/logo/dark.svg",
"light": "/logo/light.svg",
"href": "https://embedchain.ai/"
"href": "https://github.com/embedchain/embedchain"
},
"favicon": "/favicon.png",
"colors": {
@@ -19,106 +19,145 @@
"modeToggle": {
"default": "dark"
},
"openapi": ["/rest-api.json"],
"openapi": [
"/rest-api.json"
],
"metadata": {
"og:image": "/images/og.png",
"twitter:site": "@embedchain"
},
"tabs": [
{
"name": "Examples",
"url": "examples"
},
{
"name": "API Reference",
"url": "api-reference"
}
],
"anchors": [
{
"name": "Embedchain Platform",
"icon": "tv",
"url": "https://app.embedchain.ai/"
"name": "Talk to founders",
"icon": "calendar",
"url": "https://cal.com/taranjeetio/ec"
},
{
"name": "Join our slack",
"icon": "slack",
"url": "https://join.slack.com/t/embedchain/shared_invite/zt-22uwz3c46-Zg7cIh5rOBteT_xe1jwLDw"
},
{
"name": "Join our discord",
"icon": "discord",
"url": "https://discord.gg/CUU9FPhRNt"
}
],
"topbarLinks": [
{
"name": "Create account",
"url": "https://app.embedchain.ai/login/"
"name": "GitHub",
"url": "https://github.com/embedchain/embedchain"
}
],
"topbarCtaButton": {
"name": "Get started",
"url": "https://app.embedchain.ai"
"name": "Join our slack",
"url": "https://join.slack.com/t/embedchain/shared_invite/zt-22uwz3c46-Zg7cIh5rOBteT_xe1jwLDw"
},
"primaryTab": {
"name": "Docs"
"name": "Documentation"
},
"navigation": [
{
"group": "Get started",
"group": "Get Started",
"pages": [
"get-started/quickstart",
"get-started/introduction",
"get-started/openai-assistant",
"get-started/faq",
"get-started/examples"
"get-started/quickstart",
"get-started/deployment",
{
"group": "🔗 Integrations",
"pages": [
"integration/langsmith"
]
},
"get-started/faq"
]
},
{
"group": "Use cases",
"pages": [
"use-cases/chatbots",
"use-cases/question-answering",
"use-cases/semantic-search"
]
},
{
"group": "Components",
"pages": [
"components/llms",
"components/embedding-models",
"components/vector-databases"
]
},
{
"group": "Data sources",
"pages": [
"data-sources/overview",
{
"group": "Supported data sources",
"group": "Data sources",
"pages": [
"data-sources/csv",
"data-sources/json",
"data-sources/docs-site",
"data-sources/docx",
"data-sources/mdx",
"data-sources/notion",
"data-sources/pdf-file",
"data-sources/qna",
"data-sources/sitemap",
"data-sources/text",
"data-sources/web-page",
"data-sources/openapi",
"data-sources/youtube-video",
"data-sources/discourse",
"data-sources/substack",
"data-sources/discord",
"data-sources/beehiiv"
"components/data-sources/overview",
{
"group": "Data types",
"pages": [
"components/data-sources/csv",
"components/data-sources/json",
"components/data-sources/docs-site",
"components/data-sources/docx",
"components/data-sources/mdx",
"components/data-sources/notion",
"components/data-sources/pdf-file",
"components/data-sources/qna",
"components/data-sources/sitemap",
"components/data-sources/text",
"components/data-sources/web-page",
"components/data-sources/openapi",
"components/data-sources/youtube-video",
"components/data-sources/discourse",
"components/data-sources/substack",
"components/data-sources/discord",
"components/data-sources/beehiiv"
]
},
"components/data-sources/data-type-handling"
]
},
"data-sources/data-type-handling"
"components/llms",
"components/vector-databases",
"components/embedding-models"
]
},
{
"group": "Advanced",
"pages": ["advanced/configuration"]
},
{
"group": "REST API",
"group": "Community",
"pages": [
"rest-api/getting-started",
"rest-api/create",
"rest-api/get-all-apps",
"rest-api/add-data",
"rest-api/get-data",
"rest-api/query",
"rest-api/deploy",
"rest-api/delete",
"rest-api/check-status"
"community/connect-with-us"
]
},
{
"group": "Use Cases",
"group": "Examples",
"pages": [
{
"group": "REST API Service",
"pages": [
"examples/rest-api/getting-started",
"examples/rest-api/create",
"examples/rest-api/get-all-apps",
"examples/rest-api/add-data",
"examples/rest-api/get-data",
"examples/rest-api/query",
"examples/rest-api/deploy",
"examples/rest-api/delete",
"examples/rest-api/check-status"
]
},
"examples/full_stack",
"examples/openai-assistant",
"examples/opensource-assistant"
]
},
{
"group": "Chatbots",
"pages": [
"examples/discord_bot",
"examples/slack_bot",
"examples/telegram_bot",
@@ -127,15 +166,33 @@
]
},
{
"group": "Community",
"pages": ["community/connect-with-us", "community/showcase"]
"group": "Showcase",
"pages": [
"examples/showcase"
]
},
{
"group": "Integrations",
"pages": ["integration/langsmith"]
"group": "API Reference",
"pages": [
"api-reference/pipeline/overview",
{
"group": "Pipeline methods",
"pages": [
"api-reference/pipeline/add",
"api-reference/pipeline/query",
"api-reference/pipeline/chat",
"api-reference/pipeline/search",
"api-reference/pipeline/deploy",
"api-reference/pipeline/reset"
]
},
"api-reference/store/openai-assistant",
"api-reference/store/ai-assistants",
"api-reference/advanced/configuration"
]
},
{
"group": "Contribute",
"group": "Contributing",
"pages": [
"contribution/guidelines",
"contribution/dev",
@@ -146,7 +203,9 @@
},
{
"group": "Product",
"pages": ["product/release-notes"]
"pages": [
"product/release-notes"
]
}
],
"footerSocials": {
@@ -175,4 +234,4 @@
"api": {
"baseUrl": "http://localhost:8080"
}
}
}

3
docs/platform/faq.mdx Normal file
View File

@@ -0,0 +1,3 @@
---
title: 'FAQs'
---

View File

@@ -0,0 +1,3 @@
---
title: 'Overview'
---

View File

@@ -0,0 +1,3 @@
---
title: 'Quickstart'
---

View File

@@ -0,0 +1,3 @@
---
title: 'Roadmap'
---

View File

@@ -0,0 +1,3 @@
---
title: 'Security'
---

View File

@@ -245,7 +245,7 @@
"/{app_id}/deploy": {
"post": {
"tags": ["Apps"],
"summary": "Deploy App",
"summary": "Deploy app",
"description": "Deploy an existing app.",
"operationId": "deploy_app__app_id__deploy_post",
"parameters": [

View File

View File

@@ -0,0 +1,41 @@
---
title: 'Chatbots'
---
Chatbots, especially those powered by Large Language Models (LLMs), have a wide range of use cases, significantly enhancing various aspects of business, education, and personal assistance. Here are some key applications:
- **Customer Service**: Automating responses to common queries and providing 24/7 support.
- **Education**: Offering personalized tutoring and learning assistance.
- **E-commerce**: Assisting in product discovery, recommendations, and transactions.
- **Content Management**: Aiding in writing, summarizing, and organizing content.
- **Data Analysis**: Extracting insights from large datasets.
- **Language Translation**: Providing real-time multilingual support.
- **Mental Health**: Offering preliminary mental health support and conversation.
- **Entertainment**: Engaging users with games, quizzes, and humorous chats.
- **Accessibility Aid**: Enhancing information and service access for individuals with disabilities.
Embedchain provides the right set of tools to create chatbots for the above use cases. Refer to the following chatbot examples, which you can build on top of:
<CardGroup cols={2}>
<Card title="Full Stack Chatbot" href="/examples/full_stack" icon="link">
Learn to integrate a chatbot within a full-stack application.
</Card>
<Card title="Custom GPT Creation" href="https://app.embedchain.ai/create-your-gpt/" target="_blank" icon="link">
Build a tailored GPT chatbot suited for your specific needs.
</Card>
<Card title="Slack Integration Bot" href="/examples/slack_bot" icon="slack">
Enhance your Slack workspace with a specialized bot.
</Card>
<Card title="Discord Community Bot" href="/examples/discord_bot" icon="discord">
Create an engaging bot for your Discord server.
</Card>
<Card title="Telegram Assistant Bot" href="/examples/telegram_bot" icon="telegram">
Develop a handy assistant for Telegram users.
</Card>
<Card title="WhatsApp Helper Bot" href="/examples/whatsapp_bot" icon="whatsapp">
Design a WhatsApp bot for efficient communication.
</Card>
<Card title="Poe Bot for Unique Interactions" href="/examples/poe_bot" icon="link">
Explore advanced bot interactions with Poe Bot.
</Card>
</CardGroup>
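The cards above link to complete applications; stripped to its core, though, a console chatbot is just a read-send-print loop around Embedchain's `.chat()` call (which, unlike `.query()`, keeps conversation history). A minimal sketch, with the chat function injected so the loop itself works with any backend:

```python
def run_chatbot(chat_fn, get_input, send_reply):
    """Read user messages until 'exit', pass each to chat_fn, emit the reply."""
    while True:
        message = get_input()
        if message.strip().lower() == "exit":
            break
        send_reply(chat_fn(message))

# Wiring it to Embedchain (requires an OpenAI key in the environment):
#   from embedchain import Pipeline as App
#   app = App()
#   app.add("https://docs.embedchain.ai/sitemap.xml", data_type="sitemap")
#   run_chatbot(app.chat, input, print)
```

The Slack, Discord, and Telegram bots linked above follow the same shape, with the platform's event handler playing the role of `get_input` and `send_reply`.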

View File

@@ -0,0 +1,75 @@
---
title: 'Question Answering'
---
Utilizing large language models (LLMs) for question answering is a transformative application, bringing significant benefits to various real-world situations. Embedchain extensively supports tasks related to question answering, including summarization, content creation, language translation, and data analysis. The versatility of question answering with LLMs enables solutions for numerous practical applications such as:
- **Educational Aid**: Enhancing learning experiences and aiding with homework
- **Customer Support**: Addressing and resolving customer queries efficiently
- **Research Assistance**: Facilitating academic and professional research endeavors
- **Healthcare Information**: Providing fundamental medical knowledge
- **Technical Support**: Resolving technology-related inquiries
- **Legal Information**: Offering basic legal advice and information
- **Business Insights**: Delivering market analysis and strategic business advice
- **Language Learning Assistance**: Aiding in understanding and translating languages
- **Travel Guidance**: Supplying information on travel and hospitality
- **Content Development**: Assisting authors and creators with research and idea generation
## Example: Build a Q&A System with Embedchain for Next.js
Quickly create a RAG pipeline to answer queries about the [Next.js framework](https://nextjs.org/) using Embedchain tools.
### Step 1: Set Up Your RAG Pipeline
First, let's create your RAG pipeline. Open your Python environment and enter:
```python Create pipeline
from embedchain import Pipeline as App
app = App()
```
This initializes your application.
### Step 2: Populate Your Pipeline with Data
Now, let's add data to your pipeline. We'll include the Next.js website and forum:
```python Ingest data sources
# Add Next.JS Website and docs
app.add("https://nextjs.org/sitemap.xml", data_type="sitemap")
# Add Next.JS Forum data
app.add("https://nextjs-forum.com/sitemap.xml", data_type="sitemap")
```
This step incorporates over **15K pages** from the Next.js website and forum into your pipeline. For more data source options, check the [Embedchain data sources overview](/components/data-sources/overview).
### Step 3: Local Testing of Your Pipeline
Test the pipeline on your local machine:
```python Query App
app.query("Summarize the features of Next.js 14?")
```
Run this query to see how your pipeline responds with information about Next.js 14.
### (Optional) Step 4: Deploying Your RAG Pipeline
Want to go live? Deploy your pipeline with these options:
- Deploy on the Embedchain Platform
- Self-host on your preferred cloud provider
For detailed deployment instructions, follow these guides:
- [Deploying on Embedchain Platform](/get-started/deployment#deploy-on-embedchain-platform)
- [Self-hosting Guide](/get-started/deployment#self-hosting)
## Need help?
If you are looking to configure the RAG pipeline further, feel free to check out the [API reference](/api-reference/pipeline/query).
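Configuration is typically passed as a yaml file when creating the app. The keys below are illustrative of the `llm` / `vectordb` / `embedder` sections; check the configuration reference for the exact schema supported by your version:

```yaml
llm:
  provider: openai
  config:
    model: gpt-3.5-turbo
    temperature: 0.5
    max_tokens: 1000
    stream: false

vectordb:
  provider: chroma
  config:
    collection_name: my-collection
    dir: db

embedder:
  provider: openai
```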
In case you run into issues, feel free to contact us via any of the following methods:
<Snippet file="get-help.mdx" />

View File

@@ -0,0 +1,91 @@
Semantic searching, which involves understanding the intent and contextual meaning behind search queries, is another popular use case of RAG. It has several popular applications across various domains:
- **Information Retrieval**: Enhances search accuracy in databases and websites
- **E-commerce**: Improves product discovery in online shopping
- **Customer Support**: Powers smarter chatbots for effective responses
- **Content Discovery**: Aids in finding relevant media content
- **Knowledge Management**: Streamlines document and data retrieval in enterprises
- **Healthcare**: Facilitates medical research and literature search
- **Legal Research**: Assists in legal document and case law search
- **Academic Research**: Aids in academic paper discovery
- **Language Processing**: Enables multilingual search capabilities
Embedchain offers a simple yet customizable `search()` API that you can use for semantic search. See the example in the next section to learn more.
## Example: Semantic Search over Next.js Website + Forum
### Step 1: Set Up Your RAG Pipeline
First, let's create your RAG pipeline. Open your Python environment and enter:
```python Create pipeline
from embedchain import Pipeline as App
app = App()
```
This initializes your application.
### Step 2: Populate Your Pipeline with Data
Now, let's add data to your pipeline. We'll include the Next.js website and forum:
```python Ingest data sources
# Add Next.JS Website and docs
app.add("https://nextjs.org/sitemap.xml", data_type="sitemap")
# Add Next.JS Forum data
app.add("https://nextjs-forum.com/sitemap.xml", data_type="sitemap")
```
This step incorporates over **15K pages** from the Next.js website and forum into your pipeline. For more data source options, check the [Embedchain data sources overview](/components/data-sources/overview).
### Step 3: Local Testing of Your Pipeline
Test the pipeline on your local machine:
```python Search App
app.search("Summarize the features of Next.js 14?")
[
{
'context': 'Next.js 14 | Next.jsBack to BlogThursday, October 26th 2023Next.js 14Posted byLee Robinson@leeerobTim Neutkens@timneutkensAs we announced at Next.js Conf, Next.js 14 is our most focused release with: Turbopack: 5,000 tests passing for App & Pages Router 53% faster local server startup 94% faster code updates with Fast Refresh Server Actions (Stable): Progressively enhanced mutations Integrated with caching & revalidating Simple function calls, or works natively with forms Partial Prerendering',
'source': 'https://nextjs.org/blog/next-14',
'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
},
{
'context': 'Next.js 13.3 | Next.jsBack to BlogThursday, April 6th 2023Next.js 13.3Posted byDelba de Oliveira@delba_oliveiraTim Neutkens@timneutkensNext.js 13.3 adds popular community-requested features, including: File-Based Metadata API: Dynamically generate sitemaps, robots, favicons, and more. Dynamic Open Graph Images: Generate OG images using JSX, HTML, and CSS. Static Export for App Router: Static / Single-Page Application (SPA) support for Server Components. Parallel Routes and Interception: Advanced',
'source': 'https://nextjs.org/blog/next-13-3',
'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
},
{
'context': 'Upgrading: Version 14 | Next.js MenuUsing App RouterFeatures available in /appApp Router.UpgradingVersion 14Version 14 Upgrading from 13 to 14 To update to Next.js version 14, run the following command using your preferred package manager: Terminalnpm i next@latest react@latest react-dom@latest eslint-config-next@latest Terminalyarn add next@latest react@latest react-dom@latest eslint-config-next@latest Terminalpnpm up next react react-dom eslint-config-next -latest Terminalbun add next@latest',
'source': 'https://nextjs.org/docs/app/building-your-application/upgrading/version-14',
'document_id': '6c8d1a7b-ea34-4927-8823-daa29dcfc5af--b83edb69b8fc7e442ff8ca311b48510e6c80bf00caa806b3a6acb34e1bcdd5d5'
}
]
```
The `source` key contains the URL of the document that yielded that chunk.
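Each chunk carries its own `source`, so several results can point at the same page. A small helper (operating on the result shape shown above) collects the distinct URLs in ranked order:

```python
def unique_sources(results):
    """Return the distinct `source` URLs from search results, preserving rank order."""
    seen = []
    for chunk in results:
        url = chunk.get("source")
        if url and url not in seen:
            seen.append(url)
    return seen

# Shaped like the app.search() output above
results = [
    {"context": "...", "source": "https://nextjs.org/blog/next-14"},
    {"context": "...", "source": "https://nextjs.org/blog/next-13-3"},
    {"context": "...", "source": "https://nextjs.org/blog/next-14"},
]
print(unique_sources(results))
# → ['https://nextjs.org/blog/next-14', 'https://nextjs.org/blog/next-13-3']
```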
If you are interested in configuring the search further, refer to our [API documentation](/api-reference/pipeline/search).
### (Optional) Step 4: Deploying Your RAG Pipeline
Want to go live? Deploy your pipeline with these options:
- Deploy on the Embedchain Platform
- Self-host on your preferred cloud provider
For detailed deployment instructions, follow these guides:
- [Deploying on Embedchain Platform](/get-started/deployment#deploy-on-embedchain-platform)
- [Self-hosting Guide](/get-started/deployment#self-hosting)
----
This guide will help you swiftly set up a semantic search pipeline with Embedchain, making it easier to access and analyze specific information from large data sources.
## Need help?
In case you run into issues, feel free to contact us via any of the following methods:
<Snippet file="get-help.mdx" />

View File

@@ -228,15 +228,6 @@ embedchain is a framework which takes care of all these nuances and provides a s
In the first release, we are making it easier for anyone to get a chatbot over any dataset up and running in less than a minute. All you need to do is create an app instance, add the data sets using the `.add` function, and then use the `.query` function to get the relevant answer.
# Tech Stack
embedchain is built on the following stack:
- [Langchain](https://github.com/hwchase17/langchain) as an LLM framework to load, chunk and index data
- [OpenAI's Ada embedding model](https://platform.openai.com/docs/guides/embeddings) to create embeddings
- [OpenAI's ChatGPT API](https://platform.openai.com/docs/guides/gpt/chat-completions-api) as LLM to get answers given the context
- [Chroma](https://github.com/chroma-core/chroma) as the vector database to store embeddings
# Team
## Author