25 Commits

Author SHA1 Message Date
Saurabh Misra
1354747ca8 Speed up get_word_count() by 6% in embedchain/chunkers/base_chunker.py (#1268) 2024-06-05 10:36:00 -07:00
Deshraj Yadav
3616eaadb4 [Refactor] Improve logging package wide (#1315) 2024-03-13 17:13:30 -07:00
Deshraj Yadav
2985b667b0 [Bug fix] Fix issue with gmail loader (#1228) 2024-01-29 18:36:02 +05:30
Deshraj Yadav
862ff6cca6 [Bug fix] Fix embedding issue for opensearch and some other vector databases (#1163) 2024-01-12 14:15:39 +05:30
Sandra Serrano
2496ed133e [Bug fix] Fix typos, static methods and other sanity improvements in the package (#1129) 2024-01-09 00:17:46 +05:30
Deven Patel
c0ee680546 [Improvement] Add support for min chunk size (#1007) 2023-12-15 05:59:15 +05:30
Deven Patel
51b4966801 [Improvements] Package improvements (#993)
Co-authored-by: Deven Patel <deven298@yahoo.com>
2023-12-05 23:42:45 -08:00
Deshraj Yadav
f6b80e01a1 [Feature] Add support for custom streaming callback (#971) 2023-11-22 01:06:33 -08:00
Deshraj Yadav
9fcf2130b5 [Feature] Improve github and youtube channel loader (#966)
Co-authored-by: Deven Patel <deven298@yahoo.com>
2023-11-17 18:25:14 -08:00
Deven Patel
07fb6bee54 [Features] Add Github and Youtube Channel loaders (#957)
Co-authored-by: Deven Patel <deven298@yahoo.com>
Co-authored-by: Deshraj Yadav <deshrajdry@gmail.com>
2023-11-15 19:17:42 -08:00
Deshraj Yadav
a5c86a2f5c [Bugfix]: Fix issue of context overspilling into other apps (#835) 2023-10-19 17:46:33 -07:00
Deshraj Yadav
64a34cac32 [OpenSearch] Add chunks specific to an app_id if present (#765) 2023-10-04 15:46:22 -07:00
Rupesh Bansal
d0af018b8d Add support for image dataset (#571)
Co-authored-by: Rupesh Bansal <rupeshbansal@Shankars-MacBook-Air.local>
2023-10-04 09:50:40 +05:30
Taranjeet Singh
2bd6881361 feat: Add embedding manager (#570) 2023-09-12 12:13:53 +05:30
Deshraj Yadav
79f5a1d052 [chore]: Rename modules for better readability and maintainability (#587) 2023-09-11 07:01:40 +05:30
cachho
0d4ad07d7b Feat/serialize deserialize (#508)
Co-authored-by: Taranjeet Singh <reachtotj@gmail.com>
2023-09-04 01:20:18 +05:30
cachho
4c8876f032 feat: add method - detect format / data_type (#380) 2023-08-17 01:48:24 +05:30
aaishikdutta
c12362486f feat: added data format to metadata internally (#314) 2023-07-19 05:35:43 +05:30
Deshraj Yadav
fd97fb268a feat: Update line length to 120 chars (#278) 2023-07-15 19:41:55 +05:30
Taranjeet Singh
86e4146126 feat: Add new data type: code_docs_loader (#274) 2023-07-15 09:02:11 +05:30
Deshraj Yadav
9ca836520f Resolve conflicts (#208) 2023-07-11 10:20:05 +05:30
aaishikdutta
6936d6983d Added documentation (#219) 2023-07-11 08:31:42 +05:30
Hao (Harin) Wu
996211e23e bug: Prevent clashing chunk IDs (#160)
This commit inserts a repeated chunk only once,
preventing the Chroma duplicate ID error.
2023-07-08 10:29:47 +05:30
cachho
51adc5c886 refactor: Use src instead of url as argument value (#111) 2023-07-07 16:14:44 +05:30
Taranjeet Singh
4329caa17c Chunkers: Refactor each chunker & add base class
Adds a base chunker from which any chunker can inherit.
Existing chunkers are refactored to inherit from this base
chunker.
2023-06-20 16:30:23 +05:30