Datasets
A Dataset is a collection of inputs and expected outputs of an LLM App. Datasets are a good way to run bulk evaluations and to collaborate with the Subject Matter Experts (SMEs) who typically annotate your data entries with high-quality human labels and scores.
What are Datasets?
Dataset Entries
Each entry of a Dataset is composed of:
Element | Definition |
---|---|
Query | The input query to your LLM App. |
Output | The output answer of your LLM App. |
Reference | The expected output of your LLM App. It is typically a high-quality ground truth value that an evaluator uses to assess the quality of the Output. Ideally, it is annotated by a human. |
Comments | General information, typically annotated by a human, that provides more clarity about the Reference or Score they manually inserted. |
Score | A score, typically annotated by a human, that indicates the quality of the Reference value. For example, if the Reference annotation is good but lengthy, a human annotator might penalize its Score. We encourage using a value in the range [0, 1] as a Score. |
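As an illustration of the entry structure described in the table above, the following is a minimal sketch of how one entry could be modelled locally in Python. The `DatasetEntry` class and its lowercase field names are assumptions made for this example and are not part of any official SDK. Every field is optional because an entry may be created with only some fields filled in, and the range check only mirrors the recommendation to keep the Score in [0, 1].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetEntry:
    """Hypothetical local representation of one dataset entry (not an official type)."""

    query: Optional[str] = None      # input query to the LLM App
    output: Optional[str] = None     # answer produced by the LLM App
    reference: Optional[str] = None  # expected (ground truth) answer
    comments: Optional[str] = None   # free-form annotator notes
    score: Optional[float] = None    # annotator score, recommended range [0, 1]

    def __post_init__(self) -> None:
        # Enforce the recommended score range only when a score is provided.
        if self.score is not None and not 0.0 <= self.score <= 1.0:
            raise ValueError(f"score must be within [0, 1], got {self.score}")


entry = DatasetEntry(
    query="What is the capital of France?",
    output="Paris is the capital.",
    reference="Paris",
    comments="Reference is correct and concise.",
    score=1.0,
)
```

Validating the score at construction time is a design choice of this sketch: it surfaces out-of-range annotations before they reach a dataset.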
Create a Dataset
Tip
Please consult our API Reference or full Swagger API documentation to create a dataset via APIs.
To create a new dataset, click the button on the datasets page. You will be prompted to insert the name for that dataset.
Upload CSV
To upload a CSV file to a dataset, select the dataset from the datasets page, use the button to select the file to upload from your system, and finally confirm the upload.
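A CSV file for upload can be prepared with Python's standard `csv` module. The sketch below is only an assumption about the file layout: the column names in `FIELDNAMES` mirror the entry fields listed earlier, but the exact headers the upload expects are not stated here and should be checked against the API Reference. The `entries_to_csv` helper is hypothetical.

```python
import csv
import io

# Assumed column names, mirroring the dataset entry fields; verify before uploading.
FIELDNAMES = ["query", "reference", "output", "comments", "score"]


def entries_to_csv(rows):
    """Serialize a list of dicts to CSV text; missing fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "query": "What is 2 + 2?",
        "reference": "4",
        "output": "4",
        "comments": "Exact match.",
        "score": 1.0,
    },
    # Partially filled entry: only a query and a reference are provided.
    {"query": "Summarize the refund policy.", "reference": "Refunds within 30 days."},
]

csv_text = entries_to_csv(rows)
```

To produce a file on disk, write `csv_text` to a path of your choice, for example with `pathlib.Path("entries.csv").write_text(csv_text, newline="")` on Python 3.10 or later.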
Add a Dataset Entry
Tip
Please consult our API Reference or full Swagger API documentation to create a dataset via APIs.
To add a new entry to a dataset, select the dataset from the datasets page, click the button, and fill in some or all of the available fields (Query, Reference, Output, Comments, Score).