[Refactor] Converge Pipeline and App classes (#1021)

Co-authored-by: Deven Patel <deven298@yahoo.com>
Deven Patel
2023-12-29 16:52:41 +05:30
committed by GitHub
parent c0aafd38c9
commit a926bcc640
91 changed files with 646 additions and 875 deletions
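
The documentation hunks in this diff all apply the same one-line change: the aliased import `from embedchain import Pipeline as App` is replaced by `from embedchain import App`. For downstream code that still uses the old alias, the same rewrite could be sketched roughly as follows. The helper name `migrate_import` and the regular expression are assumptions made for illustration and are not part of this commit.

```python
import re

# Hypothetical pattern: matches the old aliased import on a line of its own,
# tolerating leading indentation and trailing spaces.
_OLD_IMPORT = re.compile(
    r"^(?P<indent>[ \t]*)from[ \t]+embedchain[ \t]+import[ \t]+Pipeline[ \t]+as[ \t]+App[ \t]*$",
    re.MULTILINE,
)


def migrate_import(source: str) -> str:
    """Return `source` with the old aliased import replaced by the direct one."""
    return _OLD_IMPORT.sub(r"\g<indent>from embedchain import App", source)


if __name__ == "__main__":
    old_snippet = "from embedchain import Pipeline as App\n\napp = App()\n"
    print(migrate_import(old_snippet))
```

Only the exact `Pipeline as App` form is rewritten; imports of `Pipeline` under any other alias are left untouched, so the rest of a script that calls `App()` keeps working unchanged.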

View File

@@ -5,7 +5,7 @@ title: "🐝 Beehiiv"
To add any Beehiiv data sources to your app, just add the base url as the source and set the data_type to `beehiiv`.
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -5,7 +5,7 @@ title: '📊 CSV'
To add any csv file, use the data_type as `csv`. `csv` allows remote urls and conventional file paths. Headers are included for each line, so if you have an `age` column, `18` will be added as `age: 18`. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()
app.add('https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv', data_type="csv")

View File

@@ -5,7 +5,7 @@ title: '⚙️ Custom'
When we say "custom", we mean that you can customize the loader and chunker to your needs. This is done by passing a custom loader and chunker to the `add` method.
```python
-from embedchain import Pipeline as App
+from embedchain import App
import your_loader
import your_chunker
@@ -27,7 +27,7 @@ app.add("source", data_type="custom", loader=loader, chunker=chunker)
Example:
```python
-from embedchain import Pipeline as App
+from embedchain import App
from embedchain.loaders.github import GithubLoader
app = App()

View File

@@ -35,7 +35,7 @@ Default behavior is to create a persistent vector db in the directory **./db**.
Create a local index:
```python
-from embedchain import Pipeline as App
+from embedchain import App
naval_chat_bot = App()
naval_chat_bot.add("https://www.youtube.com/watch?v=3qHkcs3kG44")
@@ -45,7 +45,7 @@ naval_chat_bot.add("https://navalmanack.s3.amazonaws.com/Eric-Jorgenson_The-Alma
You can reuse the local index with the same code, but without adding new documents:
```python
-from embedchain import Pipeline as App
+from embedchain import App
naval_chat_bot = App()
print(naval_chat_bot.query("What unique capacity does Naval argue humans possess when it comes to understanding explanations or concepts?"))
@@ -56,7 +56,7 @@ print(naval_chat_bot.query("What unique capacity does Naval argue humans possess
You can reset the app by simply calling the `reset` method. This will delete the vector database and all other app related files.
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()
app.add("https://www.youtube.com/watch?v=3qHkcs3kG44")

View File

@@ -8,7 +8,7 @@ To use an entire directory as data source, just add `data_type` as `directory` a
```python
import os
-from embedchain import Pipeline as App
+from embedchain import App
os.environ["OPENAI_API_KEY"] = "sk-xxx"
@@ -23,7 +23,7 @@ print(response)
```python
import os
-from embedchain import Pipeline as App
+from embedchain import App
from embedchain.loaders.directory_loader import DirectoryLoader
os.environ["OPENAI_API_KEY"] = "sk-xxx"

View File

@@ -12,7 +12,7 @@ To add any Discord channel messages to your app, just add the `channel_id` as th
```python
import os
-from embedchain import Pipeline as App
+from embedchain import App
# add your discord "BOT" token
os.environ["DISCORD_TOKEN"] = "xxx"

View File

@@ -5,7 +5,7 @@ title: '📚 Code documentation'
To add any code documentation website as a loader, use the data_type as `docs_site`. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()
app.add("https://docs.embedchain.ai/", data_type="docs_site")

View File

@@ -7,7 +7,7 @@ title: '📄 Docx file'
To add any doc/docx file, use the data_type as `docx`. `docx` allows remote urls and conventional file paths. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()
app.add('https://example.com/content/intro.docx', data_type="docx")

View File

@@ -24,7 +24,7 @@ To use this you need to save `credentials.json` in the directory from where you
12. Put the `.json` file in your current directory and rename it to `credentials.json`
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -21,7 +21,7 @@ If you would like to add other data structures (e.g. list, dict etc.), convert i
<CodeGroup>
```python python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -5,7 +5,7 @@ title: '📝 Mdx file'
To add any `.mdx` file to your app, use the data_type as `mdx`. Note that this only supports mdx files present on your machine, so the source should be a file path. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()
app.add('path/to/file.mdx', data_type='mdx')

View File

@@ -8,7 +8,7 @@ To load a notion page, use the data_type as `notion`. Since it is hard to automa
The next argument must **end** with the `notion page id`. The id is a 32-character string. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -5,7 +5,7 @@ title: 🙌 OpenAPI
To add any OpenAPI spec yaml file (currently the json file will be detected as JSON data type), use the data_type as 'openapi'. 'openapi' allows remote urls and conventional file paths.
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -5,7 +5,7 @@ title: '📰 PDF file'
To add any pdf file, use the data_type as `pdf_file`. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -5,7 +5,7 @@ title: '❓💬 Queston and answer pair'
QnA pair is a local data type. To supply your own QnA pair, use the data_type as `qna_pair` and enter a tuple. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -5,7 +5,7 @@ title: '🗺️ Sitemap'
Add all web pages from an xml-sitemap. Filters non-text files. Use the data_type as `sitemap`. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -16,7 +16,7 @@ This will automatically retrieve data from the workspace associated with the use
```python
import os
-from embedchain import Pipeline as App
+from embedchain import App
os.environ["SLACK_USER_TOKEN"] = "xoxp-xxx"
app = App()

View File

@@ -5,7 +5,7 @@ title: "📝 Substack"
To add any Substack data sources to your app, just add the main base url as the source and set the data_type to `substack`.
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -7,7 +7,7 @@ title: '📝 Text'
Text is a local data type. To supply your own text, use the data_type as `text` and enter a string. The text is not processed, which makes this data type very versatile. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -5,7 +5,7 @@ title: '🌐 Web page'
To add any web page, use the data_type as `web_page`. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -7,7 +7,7 @@ title: '🧾 XML file'
To add any xml file, use the data_type as `xml`. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()

View File

@@ -13,7 +13,7 @@ pip install -U "embedchain[youtube]"
</Note>
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()
app.add("@channel_name", data_type="youtube_channel")

View File

@@ -5,7 +5,7 @@ title: '📺 Youtube'
To add any youtube video to your app, use the data_type as `youtube_video`. Eg:
```python
-from embedchain import Pipeline as App
+from embedchain import App
app = App()
app.add('a_valid_youtube_url_here', data_type='youtube_video')