31 Commits

Author SHA1 Message Date
Deven Patel
7641cba01d [Feature] JSON data loader support (#816) 2023-10-18 13:53:15 -07:00
Richard Awoyemi
1741d3bef6 [fix]: Fix sitemap loader (#753) 2023-10-06 16:24:15 -07:00
Ojuswi Rastogi
540a0a3685 [feat]: Add support for XML file format (#757) 2023-10-06 15:39:32 -07:00
Deshraj Yadav
64a34cac32 [OpenSearch] Add chunks specific to an app_id if present (#765) 2023-10-04 15:46:22 -07:00
Rupesh Bansal
d0af018b8d Add support for image dataset (#571)
Co-authored-by: Rupesh Bansal <rupeshbansal@Shankars-MacBook-Air.local>
2023-10-04 09:50:40 +05:30
Ayush Mishra
16f8de810c Chore/follow_snake_conventions_in_config (#716) 2023-09-27 21:38:42 +05:30
Taranjeet Singh
36b26e08c3 feat: add support for mdx file (#604) 2023-09-13 05:13:18 +05:30
Taranjeet Singh
2bd6881361 feat: Add embedding manager (#570) 2023-09-12 12:13:53 +05:30
Deshraj Yadav
79f5a1d052 [chore]: Rename modules for better readability and maintainability (#587) 2023-09-11 07:01:40 +05:30
cachho
bd595f84e8 feat: csv loader (#470)
Co-authored-by: Taranjeet Singh <reachtotj@gmail.com>
2023-09-05 13:48:03 +05:30
cachho
0d4ad07d7b Feat/serialize deserialize (#508)
Co-authored-by: Taranjeet Singh <reachtotj@gmail.com>
2023-09-04 01:20:18 +05:30
cachho
4c8876f032 feat: add method - detect format / data_type (#380) 2023-08-17 01:48:24 +05:30
cachho
ce6eb39009 feat: notion loader (#405) 2023-08-09 13:15:22 +05:30
cachho
a681d47bce fix: docs_site use chunker config implementation (#326) 2023-07-20 11:59:59 +05:30
aaishikdutta
c12362486f feat: added data format to metadata internally (#314) 2023-07-19 05:35:43 +05:30
cachho
9c58627372 chore: load chunker from config (#270) 2023-07-17 21:24:35 +05:30
Deshraj Yadav
a548863a09 Feature: Add support for loading docs website (#293) 2023-07-16 22:22:52 -07:00
Deshraj Yadav
fd97fb268a feat: Update line length to 120 chars (#278) 2023-07-15 19:41:55 +05:30
Taranjeet Singh
86e4146126 feat: Add new data type: code_docs_loader (#274) 2023-07-15 09:02:11 +05:30
Deshraj Yadav
9ca836520f Resolve conflicts (#208) 2023-07-11 10:20:05 +05:30
aaishikdutta
6936d6983d Added documentation (#219) 2023-07-11 08:31:42 +05:30
Anupam Singh
eda28cc491 feat: AddConfig should allow configuring Chunker (#200) 2023-07-11 04:23:56 +05:30
Hao (Harin) Wu
996211e23e bug: Prevent clashing chunk IDs (#160)
This commit inserts a repeating chunk only once,
preventing the Chroma duplicate ID error.
2023-07-08 10:29:47 +05:30
Sahil Kumar Yadav
0bb3d0afe9 feat: changed doc_file to docx and update readme (#157) 2023-07-07 16:18:05 +05:30
cachho
51adc5c886 refactor: Use src instead of url as argument value (#111) 2023-07-07 16:14:44 +05:30
Sahil Kumar Yadav
68e732a426 feat: add google doc support added (#155) 2023-07-06 14:04:27 +05:30
cachho
f5f5e7edd1 feat: add local text (#44)
This commit extends the "add_local" function. It
adds support for taking text and indexing/embedding it.
2023-06-25 23:13:41 +05:30
cachho
ff2d5ce7fa feat: add local qna pair 2023-06-23 19:53:57 +05:30
Taranjeet Singh
08f155a551 Update website to web page
This commit renames the website loader and chunker
to web page, as they load and chunk a single
URL rather than the complete website.
2023-06-20 16:50:57 +05:30
Taranjeet Singh
4329caa17c Chunkers: Refactor each chunker & add base class
Adds a base chunker from which any chunker can inherit.
Existing chunkers are refactored to inherit from this base
chunker.
2023-06-20 16:30:23 +05:30
Taranjeet Singh
468db83337 Add simple app functionality
This commit enables anyone to create an app and add 3 types of data
sources:

* pdf file
* youtube video
* website

It exposes a function called query which first gets similar docs from
the vector db and then passes them to the LLM to get the final answer.
2023-06-20 14:42:55 +05:30
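
The query flow described in the initial commit (468db83337) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's code: `ToyVectorStore`, `embed`, `cosine`, `fake_llm`, and the `query` signature are all hypothetical names, a bag-of-words cosine similarity stands in for a real embedding model and vector db, and the LLM is a stub that echoes its prompt.

```python
import math
from collections import Counter


def embed(text):
    # Stand-in embedding (assumption): a bag-of-words count vector, not a real model.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    # Hypothetical in-memory replacement for the vector db.
    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append((text, embed(text)))

    def similar(self, question, k=1):
        # Return the k stored texts most similar to the question.
        q = embed(question)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def fake_llm(prompt):
    # Placeholder for a real LLM call; it only echoes the prompt it was given.
    return "ANSWER BASED ON: " + prompt


def query(store, question, llm=fake_llm):
    # Step 1: get similar docs from the store. Step 2: pass them to the LLM.
    context = " ".join(store.similar(question))
    prompt = f"Context: {context}\nQuestion: {question}"
    return llm(prompt)


store = ToyVectorStore()
store.add("Chunkers split a document into smaller pieces")
store.add("Loaders fetch content from a pdf file or a web page")
answer = query(store, "what do chunkers do")
```

Passing the LLM in as a parameter keeps the retrieval step testable on its own; a real setup would swap `fake_llm` and `ToyVectorStore` for an actual model client and vector db.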