[Feature] Discourse Loader (#948)

Co-authored-by: Deven Patel <deven298@yahoo.com>
Deven Patel
2023-11-13 16:39:11 -08:00
committed by GitHub
parent 919cc74e94
commit 95c0d47236
12 changed files with 324 additions and 4 deletions


@@ -0,0 +1,44 @@
---
title: '🗨️ Discourse'
---
You can now easily load data from your community built with [Discourse](https://discourse.org/).
## Example
1. Set up the Discourse loader with your community URL.
```Python
from embedchain.loaders.discourse import DiscourseLoader
dicourse_loader = DiscourseLoader(config={"domain": "https://community.openai.com"})
```
2. Once you have set up the loader, create an app and load data using the Discourse loader defined above.
```Python
import os
from embedchain.pipeline import Pipeline as App
os.environ["OPENAI_API_KEY"] = "sk-xxx"
app = App()
app.add("openai", data_type="discourse", loader=dicourse_loader)
question = "Where can I find the OpenAI API status page?"
app.query(question)
# Answer: You can find the OpenAI API status page at https://status.openai.com/.
```
NOTE: The `add` function of the app accepts any valid Discourse search query as the source to load data from. Refer to the [Discourse API Docs](https://docs.discourse.org/#tag/Search) to learn more about search queries.
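As a rough illustration of what such a search query looks like on the wire, the sketch below builds a URL for Discourse's search endpoint (`/search.json` with a `q` parameter) from a domain and a query string. The helper name `build_search_url` is hypothetical, and this is not necessarily how `DiscourseLoader` issues its requests internally; it only shows how a plain or filtered query might be encoded.

```python
from urllib.parse import urlencode, urljoin


def build_search_url(domain: str, query: str) -> str:
    """Hypothetical helper: build a Discourse search URL for `query`."""
    # Ensure a trailing slash so urljoin appends to the domain root.
    base = domain if domain.endswith("/") else domain + "/"
    # urlencode percent-encodes the query (spaces become "+", ":" becomes "%3A").
    return urljoin(base, "search.json") + "?" + urlencode({"q": query})


# A plain keyword query, like the "openai" string passed to `app.add` above.
print(build_search_url("https://community.openai.com", "openai"))

# A query with a search filter; filter syntax is described in the Discourse docs.
print(build_search_url("https://community.openai.com", "api status order:latest"))
```

Opening the printed URL in a browser is a quick way to check that a query returns the posts you expect before loading them into the app.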
3. We automatically create a chunker to chunk your Discourse data. If you wish to provide your own chunker instead, here is how you can do that:
```Python
from embedchain.chunkers.discourse import DiscourseChunker
from embedchain.config.add_config import ChunkerConfig
discourse_chunker_config = ChunkerConfig(chunk_size=1000, chunk_overlap=0, length_function=len)
discourse_chunker = DiscourseChunker(config=discourse_chunker_config)
app.add("openai", data_type='discourse', loader=dicourse_loader, chunker=discourse_chunker)
```
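To make the `ChunkerConfig` parameters concrete, here is a minimal, library-free sketch of fixed-size chunking. It assumes that `chunk_size` is the maximum chunk length and `chunk_overlap` is the number of characters shared between consecutive chunks, with length measured in characters (which is what `length_function=len` implies). The function `split_text` is hypothetical and much simpler than whatever text splitter `DiscourseChunker` uses internally.

```python
from typing import List


def split_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Hypothetical illustration: split `text` into character windows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# A 2500-character post with chunk_size=1000 and no overlap.
post = "x" * 2500
print([len(chunk) for chunk in split_text(post, chunk_size=1000)])
# → [1000, 1000, 500]
```

With `chunk_overlap=0`, as in the config above, chunks never share text; a non-zero overlap trades some duplication for context that carries across chunk boundaries.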


@@ -18,11 +18,12 @@ Embedchain comes with built-in support for various data sources. We handle the c
<Card title="🌐📄 web page" href="/data-sources/web-page"></Card>
<Card title="🧾 xml" href="/data-sources/xml"></Card>
<Card title="🙌 OpenAPI" href="/data-sources/openapi"></Card>
<Card title="🎥📺 youtube video" href="/data-sources/youtube-video"></Card>
<Card title="📺 youtube video" href="/data-sources/youtube-video"></Card>
<Card title="📬 Gmail" href="/data-sources/gmail"></Card>
<Card title="🐘 Postgres" href="/data-sources/postgres"></Card>
<Card title="🐬 MySQL" href="/data-sources/mysql"></Card>
<Card title="🤖 Slack" href="/data-sources/slack"></Card>
<Card title="🗨️ Discourse" href="/data-sources/discourse"></Card>
</CardGroup>
<br/ >


@@ -1,5 +1,5 @@
---
title: '🎥📺 Youtube video'
title: '📺 Youtube video'
---