51 Commits

Author SHA1 Message Date
Sidharth Mohanty
c62663f2e4 Add GPT4Vision Image loader (#1089)
Co-authored-by: Deshraj Yadav <deshrajdry@gmail.com>
2024-01-02 03:57:23 +05:30
Logie
bd88fe3980 fix: typo when import helpers.json_serializable on rss_feed files (#1095) 2024-01-01 21:38:41 +05:30
Deven Patel
c0ee680546 [Improvement] Add support for min chunk size (#1007) 2023-12-15 05:59:15 +05:30
Deven Patel
ae6f866901 [improvement] update web page default chunk size to 2000 (#1005)
Co-authored-by: Deven Patel <deven298@yahoo.com>
2023-12-13 12:36:13 +05:30
Sidharth Mohanty
d8897ce356 [Feature] RSS Feed loader (#942) 2023-12-07 14:18:35 -08:00
Sidharth Mohanty
51ebf3439b [New] Beehiiv loader (#963) 2023-12-07 14:11:56 -08:00
Deven Patel
51b4966801 [Improvements] Package improvements (#993)
Co-authored-by: Deven Patel <deven298@yahoo.com>
2023-12-05 23:42:45 -08:00
Deven Patel
512cfc9466 [Improvement] customize add method (#988) 2023-12-05 00:55:33 -08:00
Deshraj Yadav
f6b80e01a1 [Feature] Add support for custom streaming callback (#971) 2023-11-22 01:06:33 -08:00
Deshraj Yadav
9fcf2130b5 [Feature] Improve github and youtube channel loader (#966)
Co-authored-by: Deven Patel <deven298@yahoo.com>
2023-11-17 18:25:14 -08:00
Deven Patel
07fb6bee54 [Features] Add Github and Youtube Channel loaders (#957)
Co-authored-by: Deven Patel <deven298@yahoo.com>
Co-authored-by: Deshraj Yadav <deshrajdry@gmail.com>
2023-11-15 19:17:42 -08:00
Sidharth Mohanty
122313d8a5 [New] Substack loader (#949) 2023-11-14 21:52:15 -08:00
Deven Patel
95c0d47236 [Feature] Discourse Loader (#948)
Co-authored-by: Deven Patel <deven298@yahoo.com>
2023-11-13 16:39:11 -08:00
Deven Patel
919cc74e94 [Feature] Add MySQL Loader (#920)
Co-authored-by: Deven Patel <deven298@yahoo.com>
Co-authored-by: Deshraj Yadav <deshrajdry@gmail.com>
2023-11-13 13:21:36 -08:00
Deven Patel
539286aafd [Feature] Add Slack Loader (#932)
Co-authored-by: Deven Patel <deven298@yahoo.com>
2023-11-13 13:06:01 -08:00
Deven Patel
7de8d85199 [Feature] Add Postgres data loader (#918)
Co-authored-by: Deven Patel <deven298@yahoo.com>
2023-11-08 23:50:46 -08:00
Deven Patel
68183e9dce [Feature] Gmail Loader (#841) 2023-10-27 18:05:08 -07:00
Deven Patel
797bb567c6 [feat]: Add openapi spec data loader (#818) 2023-10-25 14:19:13 -07:00
Deshraj Yadav
a5c86a2f5c [Bugfix]: Fix issue of context overspilling into other apps (#835) 2023-10-19 17:46:33 -07:00
Muhammad Muzammil
8b64deab40 [Feature]: Unstructured File Loader Support - USF (#815) 2023-10-18 16:43:41 -07:00
Deven Patel
7641cba01d [Feature] JSON data loader support (#816) 2023-10-18 13:53:15 -07:00
Richard Awoyemi
1741d3bef6 [fix]: Fix sitemap loader (#753) 2023-10-06 16:24:15 -07:00
Ojuswi Rastogi
540a0a3685 [feat]: Add support for XML file format (#757) 2023-10-06 15:39:32 -07:00
Deshraj Yadav
64a34cac32 [OpenSearch] Add chunks specific to an app_id if present (#765) 2023-10-04 15:46:22 -07:00
Rupesh Bansal
d0af018b8d Add support for image dataset (#571)
Co-authored-by: Rupesh Bansal <rupeshbansal@Shankars-MacBook-Air.local>
2023-10-04 09:50:40 +05:30
Ayush Mishra
16f8de810c Chore/follow_snake_conventions_in_config (#716) 2023-09-27 21:38:42 +05:30
Taranjeet Singh
36b26e08c3 feat: add support for mdx file (#604) 2023-09-13 05:13:18 +05:30
Taranjeet Singh
2bd6881361 feat: Add embedding manager (#570) 2023-09-12 12:13:53 +05:30
Deshraj Yadav
79f5a1d052 [chore]: Rename modules for better readability and maintainability (#587) 2023-09-11 07:01:40 +05:30
cachho
bd595f84e8 feat: csv loader (#470)
Co-authored-by: Taranjeet Singh <reachtotj@gmail.com>
2023-09-05 13:48:03 +05:30
cachho
0d4ad07d7b Feat/serialize deserialize (#508)
Co-authored-by: Taranjeet Singh <reachtotj@gmail.com>
2023-09-04 01:20:18 +05:30
cachho
4c8876f032 feat: add method - detect format / data_type (#380) 2023-08-17 01:48:24 +05:30
cachho
ce6eb39009 feat: notion loader (#405) 2023-08-09 13:15:22 +05:30
cachho
a681d47bce fix: docs_site use chunker config implementation (#326) 2023-07-20 11:59:59 +05:30
aaishikdutta
c12362486f feat: added data format to metadata internally (#314) 2023-07-19 05:35:43 +05:30
cachho
9c58627372 chore: load chunker from config (#270) 2023-07-17 21:24:35 +05:30
Deshraj Yadav
a548863a09 Feature: Add support for loading docs website (#293) 2023-07-16 22:22:52 -07:00
Deshraj Yadav
fd97fb268a feat: Update line length to 120 chars (#278) 2023-07-15 19:41:55 +05:30
Taranjeet Singh
86e4146126 feat: Add new data type: code_docs_loader (#274) 2023-07-15 09:02:11 +05:30
Deshraj Yadav
9ca836520f Resolve conflicts (#208) 2023-07-11 10:20:05 +05:30
aaishikdutta
6936d6983d Added documentation (#219) 2023-07-11 08:31:42 +05:30
Anupam Singh
eda28cc491 featL AddConfig should allow configuring Chunker (#200) 2023-07-11 04:23:56 +05:30
Hao (Harin) Wu
996211e23e bug: Prevent clashing chunk IDs (#160)
This commit inserts a repeating chunk once only
preventing the chroma duplicate id error.
2023-07-08 10:29:47 +05:30
Sahil Kumar Yadav
0bb3d0afe9 feat: changed doc_file to docx and update readme (#157) 2023-07-07 16:18:05 +05:30
cachho
51adc5c886 refactor: Use src instead of url as argument value (#111) 2023-07-07 16:14:44 +05:30
Sahil Kumar Yadav
68e732a426 feat: add google doc support added (#155) 2023-07-06 14:04:27 +05:30
cachho
f5f5e7edd1 feat: add local text (#44)
This commits extends the "add_local" function. It
adds support to take text and index/embed it.
2023-06-25 23:13:41 +05:30
cachho
ff2d5ce7fa feat: add local qna pair 2023-06-23 19:53:57 +05:30
Taranjeet Singh
08f155a551 Update website to web page
This commit renames the website loader, chunker
to web page, as it is loading and chunking a single
url than the complete website.
2023-06-20 16:50:57 +05:30
Taranjeet Singh
4329caa17c Chunkers: Refactor each chunker & add base class
Adds a base chunker from which any chunker can inherit.
Existing chunkers are refactored to inherit from this base
chunker.
2023-06-20 16:30:23 +05:30