Evaluators Input Arguments
ReferenceOutputPair
| Attributes | |
|---|---|
| reference | The str expected output of your LLM App. It is typically a high-quality ground-truth value that an evaluator uses to assess the quality of the output. Ideally, it is annotated by a human. |
| output | The str output answer of your LLM App. |
| contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
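As a rough illustration of how these three attributes fit together, the sketch below builds a pair from retrieved passages. The two dataclasses are hypothetical stand-ins that only mirror the attribute names in the table; they are not the library's own classes, and the sample texts and scores are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContextChunk:  # hypothetical stand-in, not the library class
    document: str
    relevance: float


@dataclass
class ReferenceOutputPair:  # hypothetical stand-in, not the library class
    reference: str
    output: str
    contexts: List[ContextChunk]


# (text, score) tuples as a retriever might return them; values are illustrative.
retrieved = [
    ("Paris is the capital of France.", 0.92),
    ("France is a country in Europe.", 0.41),
]

pair = ReferenceOutputPair(
    reference="Paris",
    output="The capital of France is Paris.",
    contexts=[ContextChunk(document=d, relevance=r) for d, r in retrieved],
)
```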
QueryReferenceOutputTriplet
| Attributes | |
|---|---|
| query | The str input query to your LLM App. |
| reference | The str expected output of your LLM App. It is typically a high-quality ground-truth value that an evaluator uses to assess the quality of the output. Ideally, it is annotated by a human. |
| output | The str output answer of your LLM App. |
| contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
QueryReferenceContextsTriplet
| Attributes | |
|---|---|
| query | The str input query to your LLM App. |
| reference | The str expected output of your LLM App. It is typically a high-quality ground-truth value that an evaluator uses to assess the quality of the output. Ideally, it is annotated by a human. |
| contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
ReferenceOutputWeightsTriplet
| Attributes | |
|---|---|
| reference | A dict expected output of your LLM App. It is typically a high-quality ground-truth value that an evaluator uses to assess the quality of the output. Ideally, it is annotated by a human. |
| output | A dict output answer of your LLM App. |
| weights | A dict mapping keys to their respective weights, indicating the importance of each key in the evaluation. A key without an explicit weight defaults to 1.0. Weights must be in the interval [0.0,1.0]; otherwise a ValueError is raised. If a key holds a nested dict object, its weight is computed automatically and recursively as the sum of the weights of its nested keys. |
| contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
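The weight rules above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the helper names `validate_weights` and `key_weight` are hypothetical, and it assumes that `weights` is a flat dict keyed by key name, which the table does not specify.

```python
from typing import Any, Dict

DEFAULT_WEIGHT = 1.0  # default stated in the table above


def validate_weights(weights: Dict[str, float]) -> None:
    """Raise ValueError if any explicit weight falls outside [0.0, 1.0]."""
    for key, weight in weights.items():
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight for {key!r} must be in [0.0, 1.0], got {weight}")


def key_weight(key: str, value: Any, weights: Dict[str, float]) -> float:
    """Weight of one key: the sum over nested keys for a dict, else explicit or default."""
    if isinstance(value, dict):
        return sum(key_weight(k, v, weights) for k, v in value.items())
    return weights.get(key, DEFAULT_WEIGHT)


# Illustrative data; "zip" has no explicit weight, so it falls back to 1.0.
reference = {"name": "Ada", "address": {"city": "London", "zip": "N1"}}
weights = {"name": 0.5, "city": 0.25}

validate_weights(weights)
name_weight = key_weight("name", reference["name"], weights)
address_weight = key_weight("address", reference["address"], weights)
```

Here `name_weight` is 0.5, and `address_weight` is 0.25 + 1.0 = 1.25, so a nested key can end up with a weight above 1.0 even though every explicit weight stays within the allowed interval.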
VariablesContextsPair
| Attributes | |
|---|---|
| variables | A dict[str, str] containing arbitrary input texts representing the input arguments for the Custom Evaluator prompt_template. It is mandatory to include a non-empty key-value pair with the key output, representing the generated answer to be evaluated. |
| contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
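The requirement on `variables` can be illustrated as follows. This is a sketch under assumptions: the helper `check_variables` is hypothetical, and the `{name}` placeholder syntax filled with `str.format` is only one plausible way a prompt_template might consume the variables, not a confirmed detail of the Custom Evaluator.

```python
from typing import Dict


def check_variables(variables: Dict[str, str]) -> None:
    """Reject a variables dict that lacks a non-empty 'output' entry."""
    if not variables.get("output"):
        raise ValueError("variables must contain a non-empty 'output' entry")


# Hypothetical template; placeholder syntax is an assumption of this sketch.
prompt_template = "Query: {query}\nAnswer: {output}\nIs the answer polite?"

variables = {
    "query": "Where is my order?",
    "output": "Your order ships tomorrow. Thanks for your patience!",
}

check_variables(variables)
prompt = prompt_template.format(**variables)
```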
ContextChunk
| Attributes | |
|---|---|
| document | A str segment or piece of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
| relevance | A float in the interval [-1,1] representing how closely a retrieved document matches a given query. This number is typically returned by a vector database. |
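Cosine similarity is one common score that falls in [-1,1], so the sketch below uses it to fill `relevance`. Both the class `ContextChunkSketch` and the range check in `__post_init__` are assumptions made for illustration; the table does not say which similarity measure is used or whether out-of-range values are rejected.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ContextChunkSketch:  # hypothetical stand-in, not the library class
    document: str
    relevance: float

    def __post_init__(self) -> None:
        # Assumed check: the table only states the interval, not the error behavior.
        if not -1.0 <= self.relevance <= 1.0:
            raise ValueError(f"relevance must be in [-1, 1], got {self.relevance}")


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings; identical vectors give the maximum score.
query_vec = [3.0, 4.0]
doc_vec = [3.0, 4.0]

chunk = ContextChunkSketch(
    document="Paris is the capital of France.",
    relevance=cosine_similarity(query_vec, doc_vec),
)
```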