[Improvement] customize add method (#988)
41
docs/data-sources/custom.mdx
Normal file
@@ -0,0 +1,41 @@
---
title: '⚙️ Custom'
---

The "custom" data type lets you tailor the loader and chunker to your needs: pass your own loader and chunker to the `add` method.

```python
from embedchain import Pipeline as App

# Placeholder imports: replace these with your own loader and chunker classes.
from your_loader import YourLoader
from your_chunker import YourChunker

app = App()
loader = YourLoader()
chunker = YourChunker()

# Pass both instances to `add` together with the "custom" data type.
app.add("source", data_type="custom", loader=loader, chunker=chunker)
```

<Note>
The custom loader and chunker must be classes that inherit from [`BaseLoader`](https://github.com/embedchain/embedchain/blob/main/embedchain/loaders/base_loader.py) and [`BaseChunker`](https://github.com/embedchain/embedchain/blob/main/embedchain/chunkers/base_chunker.py), respectively.
</Note>
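
As a rough illustration of the inheritance requirement, the sketch below defines a loader subclass. To stay runnable without Embedchain installed, it declares a local stand-in `BaseLoader`; in a real project you would import the class from `embedchain/loaders/base_loader.py` instead. The `load_data` method name and the returned dictionary shape (`doc_id` plus a `data` list of `content`/`meta_data` entries) are assumptions about the base-class contract, and `InMemoryLoader` is a hypothetical name, so check the linked source before relying on them.

```python
import hashlib


class BaseLoader:
    """Local stand-in for embedchain's BaseLoader (assumption: same method name)."""

    def load_data(self, source):
        raise NotImplementedError


class InMemoryLoader(BaseLoader):
    """Hypothetical loader that treats the source string itself as the document."""

    def load_data(self, source):
        content = source.strip()
        # Derive a stable document id from the content (assumed convention).
        doc_id = hashlib.sha256(content.encode("utf-8")).hexdigest()
        return {
            "doc_id": doc_id,
            "data": [{"content": content, "meta_data": {"url": "local"}}],
        }


loader = InMemoryLoader()
result = loader.load_data("  Embedchain supports custom loaders.  ")
print(result["data"][0]["content"])
```

A custom chunker would presumably follow the same pattern against `BaseChunker`, whose constructor and methods are not shown here.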

<Note>
If `data_type` is not a valid data type, the `add` method falls back to the `custom` data type and expects the user to pass a custom loader and chunker.
</Note>
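
The fallback rule can be pictured with a small, self-contained sketch. It is not Embedchain's actual implementation: `KNOWN_DATA_TYPES` and `resolve_data_type` are hypothetical names, and the set of built-in types is abbreviated. It only shows the idea that an unrecognized `data_type` is treated as `custom`, which in turn requires the caller to supply a loader and chunker.

```python
# Abbreviated, illustrative set of built-in data types (assumption).
KNOWN_DATA_TYPES = {"github", "discord", "discourse", "custom"}


def resolve_data_type(data_type, loader=None, chunker=None):
    """Return the effective data type, falling back to "custom" when unknown."""
    effective = data_type if data_type in KNOWN_DATA_TYPES else "custom"
    if effective == "custom" and (loader is None or chunker is None):
        raise ValueError("custom data type needs both a loader and a chunker")
    return effective


print(resolve_data_type("github"))
print(resolve_data_type("my_wiki", loader=object(), chunker=object()))
```

In this sketch, an unknown type without a loader and chunker raises an error up front, so a missing component is not discovered only later during loading.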

Example:

```python
from embedchain import Pipeline as App
from embedchain.loaders.github import GithubLoader

app = App()

loader = GithubLoader(config={"token": "ghp_xxx"})

app.add("repo:embedchain/embedchain type:repo", data_type="github", loader=loader)

app.query("What is Embedchain?")
# Answer: Embedchain is a Data Platform for Large Language Models (LLMs). It allows users to seamlessly load, index, retrieve, and sync unstructured data in order to build dynamic, LLM-powered applications. There is also a JavaScript implementation called embedchain-js available on GitHub.
```
@@ -26,6 +26,7 @@ Embedchain comes with built-in support for various data sources. We handle the c
<Card title="🗨️ Discourse" href="/data-sources/discourse"></Card>
<Card title="💬 Discord" href="/data-sources/discord"></Card>
<Card title="📝 Github" href="/data-sources/github"></Card>
<Card title="⚙️ Custom" href="/data-sources/custom"></Card>
</CardGroup>

<br/ >