Docs for directory data loader usage (#1016)

Sidharth Mohanty
2023-12-16 07:59:05 +05:30
committed by GitHub
parent b246d9823e
commit 54f43215cd
3 changed files with 44 additions and 1 deletions

@@ -0,0 +1,41 @@
---
title: '📁 Directory'
---
To use an entire directory as a data source, set `data_type` to `directory` and pass the path of the local directory.
### Without customization
```python
import os
from embedchain import Pipeline as App
os.environ["OPENAI_API_KEY"] = "sk-xxx"
app = App()
app.add("./elon-musk", data_type="directory")
response = app.query("list all files")
print(response)
# Answer: Files are elon-musk-1.txt, elon-musk-2.pdf.
```
### Customization
```python
import os
from embedchain import Pipeline as App
from embedchain.loaders.directory_loader import DirectoryLoader
os.environ["OPENAI_API_KEY"] = "sk-xxx"
lconfig = {
    "recursive": True,
    "extensions": [".txt"]
}
loader = DirectoryLoader(config=lconfig)
app = App()
app.add("./elon-musk", loader=loader)
response = app.query("what are all the files related to?")
print(response)
# Answer: The files are related to Elon Musk.
```
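To check which files a given `recursive` / `extensions` combination would likely pick up before indexing, a small standalone preview can help. This is only a sketch using the standard library: `preview_files` is a hypothetical helper, not part of embedchain, and the matching rules inside `DirectoryLoader` may differ (for example in how it treats case or hidden files).

```python
from pathlib import Path
from typing import Iterable, List, Optional


def preview_files(
    directory: str,
    recursive: bool = True,
    extensions: Optional[Iterable[str]] = None,
) -> List[Path]:
    """Hypothetical helper: list files a directory scan might select.

    Assumption: `recursive` means "descend into subdirectories" and
    `extensions` is a list of suffixes such as [".txt"]. The real
    DirectoryLoader may apply different rules.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")

    # "**/*" walks the whole tree, "*" only looks at the top level.
    pattern = "**/*" if recursive else "*"
    wanted = {ext.lower() for ext in extensions} if extensions else None

    return sorted(
        p
        for p in root.glob(pattern)
        if p.is_file() and (wanted is None or p.suffix.lower() in wanted)
    )
```

For instance, `preview_files("./elon-musk", recursive=True, extensions=[".txt"])` returns the `.txt` files under `./elon-musk`, which can be compared against what the app reports after `app.add`.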

@@ -29,6 +29,7 @@ Embedchain comes with built-in support for various data sources. We handle the c
<Card title="⚙️ Custom" href="/components/data-sources/custom"></Card>
<Card title="📝 Substack" href="/components/data-sources/substack"></Card>
<Card title="🐝 Beehiiv" href="/components/data-sources/beehiiv"></Card>
<Card title="📁 Directory" href="/components/data-sources/directory"></Card>
</CardGroup>
<br/ >