Evaluators Input Arguments
ReferenceOutputPair
Attributes | |
---|---|
reference | The str expected output of your LLM App. It is typically a high-quality ground-truth value that an evaluator uses to assess the quality of the output. Ideally, it is annotated by a human. |
output | The str output answer of your LLM App. |
contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
QueryReferenceOutputTriplet
Attributes | |
---|---|
query | The str input query to your LLM App. |
reference | The str expected output of your LLM App. It is typically a high-quality ground-truth value that an evaluator uses to assess the quality of the output. Ideally, it is annotated by a human. |
output | The str output answer of your LLM App. |
contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
QueryReferenceContextsTriplet
Attributes | |
---|---|
query | The str input query to your LLM App. |
reference | The str expected output of your LLM App. It is typically a high-quality ground-truth value that an evaluator uses to assess the quality of the output. Ideally, it is annotated by a human. |
contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
ReferenceOutputWeightsTriplet
Attributes | |
---|---|
reference | The dict expected output of your LLM App. It is typically a high-quality ground-truth value that an evaluator uses to assess the quality of the output. Ideally, it is annotated by a human. |
output | The dict output answer of your LLM App. |
weights | A dict mapping keys to their respective weights, indicating the importance of each key in the evaluation. A key without an explicit weight defaults to 1.0. Weights must be in the interval [0.0, 1.0]; otherwise a ValueError is raised. If a key maps to a nested dict, its weight is computed automatically and recursively as the sum of the weights of its nested keys. |
contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
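
The weight rules described for weights can be sketched in plain Python. This is only an illustration of the stated behavior, not the library's implementation: the helper name resolve_weight is hypothetical, and it assumes that nested keys are looked up in the same flat weights dict by their own key name, which the text above does not specify.

```python
from typing import Any, Dict

DEFAULT_WEIGHT = 1.0  # stated default when a key has no explicit weight


def resolve_weight(key: str, value: Any, weights: Dict[str, float]) -> float:
    """Return the effective weight of `key` (hypothetical helper)."""
    if isinstance(value, dict):
        # Nested dict: the weight is the sum of the weights of its nested keys.
        return sum(resolve_weight(k, v, weights) for k, v in value.items())
    weight = weights.get(key, DEFAULT_WEIGHT)
    if not 0.0 <= weight <= 1.0:
        # Out-of-range weights are rejected, as described above.
        raise ValueError(f"weight for {key!r} must be in [0.0, 1.0], got {weight}")
    return weight


reference = {"name": "Ada", "address": {"city": "London", "zip": "N1"}}
weights = {"name": 0.5, "city": 0.25}  # "zip" falls back to the default 1.0

name_weight = resolve_weight("name", reference["name"], weights)
address_weight = resolve_weight("address", reference["address"], weights)
```

Here name uses its explicit weight, while address is never looked up directly: its weight is derived from city (explicit) and zip (default).
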
VariablesContextsPair
Attributes | |
---|---|
variables | A dict[str, str] of arbitrary input texts that serve as the input arguments for the Custom Evaluator prompt_template. It must include a non-empty key-value pair with the key output, representing the generated answer to be evaluated. |
contexts | A list of ContextChunk representing all the segments or pieces of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
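
The requirement on variables can be checked before an evaluator is called. The sketch below is an assumption-laden illustration: validate_variables and render_prompt are hypothetical helpers, and the use of Python's str.format with {name} placeholders is assumed for the example only, since the placeholder syntax of prompt_template is not described here.

```python
from typing import Dict


def validate_variables(variables: Dict[str, str]) -> Dict[str, str]:
    """Check that a non-empty "output" entry is present (hypothetical helper)."""
    output = variables.get("output")
    if not isinstance(output, str) or not output.strip():
        raise ValueError("variables must include a non-empty 'output' entry")
    return variables


def render_prompt(prompt_template: str, variables: Dict[str, str]) -> str:
    """Fill the template, assuming str.format-style {name} placeholders."""
    return prompt_template.format(**validate_variables(variables))


prompt = render_prompt(
    "Question: {query}\nAnswer to grade: {output}",
    {"query": "What is the capital of France?", "output": "Paris"},
)
```

Validating first means a missing or blank output fails with a clear message instead of surfacing later as a formatting error.
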
ContextChunk
Attributes | |
---|---|
document | A str segment or piece of information provided to an LLM to give the model sufficient context to understand and respond accurately to a query. |
relevance | A float in the interval [-1, 1] representing how closely a retrieved document matches a given query. This number is typically returned by a vector database. |
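
To make the shape of a context chunk concrete, here is a minimal stand-in written as a dataclass. ContextChunkSketch is a hypothetical name used so as not to imply it is the library's own class, and rejecting out-of-range relevance values with a ValueError is an assumption of this sketch; the table above only states the expected interval.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ContextChunkSketch:
    """Illustrative stand-in with the two documented attributes."""

    document: str
    relevance: float

    def __post_init__(self) -> None:
        # Assumption: values outside [-1, 1] are treated as an error here.
        if not -1.0 <= self.relevance <= 1.0:
            raise ValueError(f"relevance must be in [-1, 1], got {self.relevance}")


contexts: List[ContextChunkSketch] = [
    ContextChunkSketch(document="Paris is the capital of France.", relevance=0.92),
    ContextChunkSketch(document="France borders Spain.", relevance=0.41),
    ContextChunkSketch(document="Bananas are rich in potassium.", relevance=-0.15),
]

# Order chunks from most to least relevant, e.g. before passing them on.
ranked = sorted(contexts, key=lambda chunk: chunk.relevance, reverse=True)
```

A list like contexts corresponds to the contexts attribute that appears in the argument types above.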