[Feature] Add support for RAG evaluation (#1154)

Co-authored-by: Deven Patel <deven298@yahoo.com>
Co-authored-by: Deshraj Yadav <deshrajdry@gmail.com>
Authored by Deven Patel on 2024-01-11 20:02:47 +05:30, committed by GitHub
parent 69e83adae0
commit e2cca61cd3
18 changed files with 788 additions and 21 deletions

@@ -0,0 +1,41 @@
---
title: '📝 evaluate'
---
The `evaluate()` method is used to evaluate the performance of a RAG app. Its parameters, return value, and usage are described below.
### Parameters
<ParamField path="question" type="Union[str, list[str]]">
A question or a list of questions to evaluate your app on.
</ParamField>
<ParamField path="metrics" type="Optional[list[Union[BaseMetric, str]]]" optional>
The metrics to evaluate your app on. Defaults to all metrics: `["context_relevancy", "answer_relevancy", "groundedness"]`
</ParamField>
<ParamField path="num_workers" type="int" optional>
Specify the number of threads to use for parallel processing.
</ParamField>
### Returns
<ResponseField name="metrics" type="dict">
Returns the metrics you have chosen to evaluate your app on as a dictionary.
</ResponseField>
## Usage
```python
from embedchain import App

app = App()

# add a data source
app.add("https://www.forbes.com/profile/elon-musk")

# run evaluation on a single question
app.evaluate("what is the net worth of Elon Musk?")
# {'answer_relevancy': 0.958019958036268, 'context_relevancy': 0.12903225806451613}

# or evaluate on multiple questions at once
# app.evaluate(["what is the net worth of Elon Musk?", "which companies does Elon Musk own?"])
```
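
You can also choose specific metrics and control parallelism. The snippet below is a minimal sketch based on the `metrics` and `num_workers` parameters described above; the metric names are taken from the default list, and the printed scores are placeholders rather than real output.

```python
from embedchain import App

app = App()
app.add("https://www.forbes.com/profile/elon-musk")

# Evaluate several questions on a subset of metrics, using parallel workers.
# Metric names come from the default list above; num_workers follows the
# parameter description and is shown here purely for illustration.
results = app.evaluate(
    [
        "what is the net worth of Elon Musk?",
        "which companies does Elon Musk own?",
    ],
    metrics=["answer_relevancy", "context_relevancy"],
    num_workers=2,
)
print(results)
# e.g. {'answer_relevancy': 0.95, 'context_relevancy': 0.13}
```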