[Feature] Add support for RAG evaluation (#1154)

Co-authored-by: Deven Patel <deven298@yahoo.com>
Co-authored-by: Deshraj Yadav <deshrajdry@gmail.com>
Authored by Deven Patel on 2024-01-11 20:02:47 +05:30, committed by GitHub
parent 69e83adae0
commit e2cca61cd3
18 changed files with 788 additions and 21 deletions

@@ -0,0 +1,41 @@
---
title: '📝 evaluate'
---
The `evaluate()` method is used to evaluate the performance of a RAG app. Its parameters, return value, and usage are described below.
### Parameters
<ParamField path="question" type="Union[str, list[str]]">
A question or a list of questions to evaluate your app on.
</ParamField>
<ParamField path="metrics" type="Optional[list[Union[BaseMetric, str]]]" optional>
The metrics to evaluate your app on. Defaults to all metrics: `["context_relevancy", "answer_relevancy", "groundedness"]`
</ParamField>
<ParamField path="num_workers" type="int" optional>
Specify the number of threads to use for parallel processing.
</ParamField>
### Returns
<ResponseField name="metrics" type="dict">
Returns the metrics you have chosen to evaluate your app on as a dictionary.
</ResponseField>
## Usage
```python
from embedchain import App

app = App()

# add a data source
app.add("https://www.forbes.com/profile/elon-musk")

# run evaluation on a single question
app.evaluate("what is the net worth of Elon Musk?")
# {'answer_relevancy': 0.958019958036268, 'context_relevancy': 0.12903225806451613}

# or evaluate on multiple questions at once
# app.evaluate(["what is the net worth of Elon Musk?", "which companies does Elon Musk own?"])
```
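
You can also choose specific metrics and control parallelism. The snippet below is a minimal sketch based on the `metrics` and `num_workers` parameters described above; the metric names are taken from the default list, and the printed scores are placeholders rather than real output.

```python
from embedchain import App

app = App()
app.add("https://www.forbes.com/profile/elon-musk")

# Evaluate several questions on a subset of metrics, using parallel workers.
# Metric names come from the default list above; num_workers follows the
# parameter description and is shown here purely for illustration.
results = app.evaluate(
    [
        "what is the net worth of Elon Musk?",
        "which companies does Elon Musk own?",
    ],
    metrics=["answer_relevancy", "context_relevancy"],
    num_workers=2,
)
print(results)
# e.g. {'answer_relevancy': 0.95, 'context_relevancy': 0.13}
```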