---
title: '🚀 Pipelines'
description: '💡 Start building LLM powered data pipelines in 1 minute'
---

Embedchain lets you build data pipelines on your own data sources and deploy them to production in less than a minute. It can load, index, retrieve, and sync any unstructured data.

Install the `embedchain` Python package:

```bash
pip install embedchain
```

Creating a pipeline involves three steps:

<Steps>
<Step title="⚙️ Import pipeline instance">
```python
from embedchain import Pipeline

# Create a named pipeline instance
p = Pipeline(name="Elon Musk")
```
</Step>

<Step title="🗃️ Add data sources">
```python
# Add different data sources
p.add("https://en.wikipedia.org/wiki/Elon_Musk")
p.add("https://www.forbes.com/profile/elon-musk")
# You can also add local data sources such as PDF or CSV files.
# p.add("/path/to/file.pdf")
```
</Step>
<Step title="💬 Deploy your pipeline to Embedchain platform">
```python
p.deploy()
```
</Step>
</Steps>

That's it. Your pipeline is now available on the [Embedchain platform](https://app.embedchain.ai). Make sure to set the `OPENAI_API_KEY` 🔑 environment variable in your code.
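
One way to set the key from code is through `os.environ`, before the pipeline is created. The snippet below is a minimal sketch rather than Embedchain's documented setup: the helper name `ensure_openai_key` and the placeholder value `sk-xxx` are hypothetical and only serve as illustration.

```python
import os


def ensure_openai_key(default=None):
    """Return the OpenAI key from the environment, falling back to `default`.

    `ensure_openai_key` is a hypothetical helper used for illustration only.
    """
    key = os.environ.get("OPENAI_API_KEY") or default
    if not key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first.")
    # Export the key so libraries that read the environment can find it.
    os.environ["OPENAI_API_KEY"] = key
    return key


# "sk-xxx" is a placeholder; it is only used when the variable is not already set.
ensure_openai_key(default="sk-xxx")
```

An existing `OPENAI_API_KEY` always takes precedence over the fallback, so a key exported in your shell is never overwritten.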

After deploying your pipeline to the Embedchain platform, you can still add more data sources and update the pipeline as many times as you like.
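
A later update could be sketched as follows. The helper `add_and_redeploy` is hypothetical and relies only on the `add()` and `deploy()` methods shown in the steps above. That calling `deploy()` again is how an update gets published is an assumption, not something this page confirms.

```python
def add_and_redeploy(pipeline, sources):
    """Add each new source to an existing pipeline, then deploy once.

    Hypothetical helper: it assumes `pipeline` exposes `add()` and `deploy()`
    as in the steps above, and that calling `deploy()` again publishes the update.
    """
    added = []
    for source in sources:
        pipeline.add(source)
        added.append(source)
    # Skip the deploy call when there was nothing new to add.
    if added:
        pipeline.deploy()
    return added
```

With the pipeline `p` from the steps above, `add_and_redeploy(p, ["/path/to/file.pdf"])` would add the file and then deploy again.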

Here is a Google Colab notebook to get you started: [Open in Google Colab](https://colab.research.google.com/drive/1YVXaBO4yqlHZY4ho67GCJ6aD4CHNiScD?usp=sharing)