[Pipelines] Improvements in pipelines feature (#861)
@@ -71,6 +71,10 @@
"group": "Examples",
"pages": ["examples/full_stack", "examples/api_server", "examples/discord_bot", "examples/slack_bot", "examples/telegram_bot", "examples/whatsapp_bot", "examples/poe_bot"]
},
{
"group": "Pipelines",
"pages": ["pipelines/quickstart"]
},
{
"group": "Community",
"pages": [
44
docs/pipelines/quickstart.mdx
Normal file
@@ -0,0 +1,44 @@
---
title: '🚀 Pipelines'
description: '💡 Start building LLM-powered data pipelines in 1 minute'
---

Embedchain lets you build data pipelines on your own data sources and deploy them to production in less than a minute. It can load, index, retrieve, and sync any unstructured data.

Install the embedchain Python package:

```bash
pip install embedchain
```

Creating a pipeline involves 3 steps:

<Steps>
<Step title="⚙️ Create a pipeline instance">
```python
from embedchain import Pipeline
p = Pipeline(name="Elon Musk")
```
</Step>

<Step title="🗃️ Add data sources">
```python
# Add different data sources
p.add("https://en.wikipedia.org/wiki/Elon_Musk")
p.add("https://www.forbes.com/profile/elon-musk")
# You can also add local data sources such as PDF or CSV files.
# p.add("/path/to/file.pdf")
```
</Step>
<Step title="💬 Deploy your pipeline to the Embedchain platform">
```python
p.deploy()
```
</Step>
</Steps>

That's it. Head to the [Embedchain platform](https://app.embedchain.ai), where your pipeline is now available. Make sure to set the `OPENAI_API_KEY` 🔑 environment variable in your code.
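
One way to handle the `OPENAI_API_KEY` requirement is to check for the variable before creating the pipeline. The snippet below is a minimal sketch using only the standard library. The helper name `require_env` is hypothetical and not part of embedchain, and `sk-...` is a placeholder for a real key.

```python
import os


def require_env(name: str = "OPENAI_API_KEY") -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper for illustration; not part of embedchain.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it or assign os.environ[{name!r}] first"
        )
    return value


# Option 1: export the key in your shell before starting Python.
# Option 2: assign it in code (placeholder shown; avoid committing real keys).
os.environ.setdefault("OPENAI_API_KEY", "sk-...")

api_key = require_env()  # raises RuntimeError if the key is still missing
```

Run a check like this before `Pipeline(...)` is created, so a missing key fails early with a readable error.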

After you deploy your pipeline to the Embedchain platform, you can still add more data sources and update the pipeline as many times as you need.
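
The update flow can be sketched as a small helper. It rests on one assumption that the text implies but does not spell out: calling `deploy()` again after further `add()` calls pushes the new sources to the platform. The helper name `add_and_redeploy` and the `PipelineLike` protocol are hypothetical. Only `add` and `deploy` come from the steps above.

```python
from typing import Iterable, Protocol


class PipelineLike(Protocol):
    """Anything exposing the two methods used in the quickstart steps."""

    def add(self, source: str) -> object: ...

    def deploy(self) -> object: ...


def add_and_redeploy(pipeline: PipelineLike, sources: Iterable[str]) -> int:
    """Add each new source, then deploy once so the update goes out in one call.

    Hypothetical helper; assumes deploy() may be called repeatedly.
    Returns the number of sources added.
    """
    count = 0
    for source in sources:
        pipeline.add(source)
        count += 1
    if count:  # skip the deploy call when there is nothing new
        pipeline.deploy()
    return count
```

With the pipeline `p` from the steps above, this could be called as `add_and_redeploy(p, ["/path/to/file.pdf"])`.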

Here is a Google Colab notebook to get you started: [Open in Colab](https://colab.research.google.com/drive/1YVXaBO4yqlHZY4ho67GCJ6aD4CHNiScD?usp=sharing)