[Feature] Discourse Loader (#948)
Co-authored-by: Deven Patel <deven298@yahoo.com>
44
docs/data-sources/discourse.mdx
Normal file
@@ -0,0 +1,44 @@
---
title: '🗨️ Discourse'
---

You can now easily load data from your community built with [Discourse](https://discourse.org/).

## Example

1. Set up the Discourse loader with your community URL.
```Python
from embedchain.loaders.discourse import DiscourseLoader

# "domain" is the base URL of the Discourse community to load from
dicourse_loader = DiscourseLoader(config={"domain": "https://community.openai.com"})
```

2. Once you have set up the loader, create an app and load data using the Discourse loader from step 1.
```Python
import os
from embedchain.pipeline import Pipeline as App

os.environ["OPENAI_API_KEY"] = "sk-xxx"

app = App()

# "openai" is the search query run against the community
app.add("openai", data_type="discourse", loader=dicourse_loader)

question = "Where can I find the OpenAI API status page?"
app.query(question)
# Answer: You can find the OpenAI API status page at https://status.openai.com/.
```

NOTE: The `add` function of the app accepts any executable search query to load data. Refer to the [Discourse API Docs](https://docs.discourse.org/#tag/Search) to learn more about search queries.

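To make the idea of a search query more concrete, here is a minimal sketch of how a query string could map onto the `search.json` endpoint described in the linked API docs. The helper name `build_search_url` is hypothetical and is not part of Embedchain. The sketch assumes the query is sent as the `q` parameter, so check the Discourse docs before relying on it.

```python
from urllib.parse import urlencode, urljoin


def build_search_url(domain: str, query: str) -> str:
    """Return a Discourse search URL for `query` (hypothetical helper, not an Embedchain API)."""
    # Normalize the domain so urljoin appends the path instead of replacing it.
    base = domain.rstrip("/") + "/"
    # Assumption: the search endpoint is `search.json` and takes the query as `q`.
    return urljoin(base, "search.json") + "?" + urlencode({"q": query})


url = build_search_url("https://community.openai.com", "api status page")
print(url)  # https://community.openai.com/search.json?q=api+status+page
```

Opening such a URL in a browser is a quick way to check what a query returns before passing the same string to `app.add`.
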
3. We automatically create a chunker for your Discourse data. If you want to provide your own chunker class, you can do so as follows:
```Python
from embedchain.chunkers.discourse import DiscourseChunker
from embedchain.config.add_config import ChunkerConfig

discourse_chunker_config = ChunkerConfig(chunk_size=1000, chunk_overlap=0, length_function=len)
discourse_chunker = DiscourseChunker(config=discourse_chunker_config)

app.add("openai", data_type='discourse', loader=dicourse_loader, chunker=discourse_chunker)
```
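
The sketch below gives a feel for what `chunk_size` and `chunk_overlap` control. It is a simplified fixed-window splitter written for illustration, not the splitter Embedchain uses internally, and the function name `chunk_text` is hypothetical. Lengths are measured with `len`, that is, in characters, mirroring `length_function=len` above.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Split `text` into windows of at most `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters.
    Illustrative only; not Embedchain's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if not text:
        return []
    step = chunk_size - chunk_overlap
    # Stop early so the last window is not fully contained in the previous one.
    stop = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, stop, step)]


post = "x" * 2500
chunks = chunk_text(post, chunk_size=1000, chunk_overlap=0)
print([len(c) for c in chunks])  # [1000, 1000, 500]
```

A non-zero overlap repeats the tail of each chunk at the start of the next one. This can help keep context that would otherwise be cut at a chunk boundary, at the cost of storing some text twice.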
@@ -18,11 +18,12 @@ Embedchain comes with built-in support for various data sources. We handle the c
<Card title="🌐📄 web page" href="/data-sources/web-page"></Card>
<Card title="🧾 xml" href="/data-sources/xml"></Card>
<Card title="🙌 OpenAPI" href="/data-sources/openapi"></Card>
<Card title="🎥📺 youtube video" href="/data-sources/youtube-video"></Card>
<Card title="📺 youtube video" href="/data-sources/youtube-video"></Card>
<Card title="📬 Gmail" href="/data-sources/gmail"></Card>
<Card title="🐘 Postgres" href="/data-sources/postgres"></Card>
<Card title="🐬 MySQL" href="/data-sources/mysql"></Card>
<Card title="🤖 Slack" href="/data-sources/slack"></Card>
<Card title="🗨️ Discourse" href="/data-sources/discourse"></Card>
</CardGroup>

<br/ >

@@ -1,5 +1,5 @@
---
title: '🎥📺 Youtube video'
title: '📺 Youtube video'
---
