Available on 1 platform
This dataset comprises 2,700 pairs of text-to-speech audio renderings, each pair carrying 15 human preference annotations. Produced by datapointai and updated in March 2026, it provides comparative naturalness ratings for audio generated from identical text prompts. The collection totals 40,500 individual human judgments (2,700 pairs × 15 annotations) to support high-confidence audio quality evaluation.
The dataset is provided in Parquet format and is optimized for use with the Hugging Face Datasets library. Because every pair has 15 annotations, users should aggregate the judgments per pair when calculating consensus scores rather than treating each judgment as a separate example.
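As a rough illustration of per-pair aggregation, the sketch below reduces one pair's annotations to a majority winner and a preference share. The field names `pair_id` and `votes`, the labels "A" and "B", and the sample rows are all hypothetical; the actual column names and label encoding should be checked against the dataset schema.

```python
from collections import Counter


def consensus(votes):
    """Summarize one pair's preference votes.

    `votes` is a list of labels, assumed here to be "A" or "B"
    (a hypothetical encoding, not taken from the dataset schema).
    """
    if not votes:
        raise ValueError("no votes for this pair")
    counts = Counter(votes)
    total = len(votes)
    share_a = counts.get("A", 0) / total
    if share_a > 0.5:
        winner = "A"
    elif share_a < 0.5:
        winner = "B"
    else:
        winner = "tie"  # cannot happen with 15 binary votes, kept for generality
    return {"winner": winner, "share_a": share_a, "n": total}


# Hypothetical rows standing in for two of the 2,700 pairs.
rows = [
    {"pair_id": 0, "votes": ["A"] * 11 + ["B"] * 4},
    {"pair_id": 1, "votes": ["A"] * 7 + ["B"] * 8},
]

results = {row["pair_id"]: consensus(row["votes"]) for row in rows}
for pair_id, summary in results.items():
    print(pair_id, summary["winner"], round(summary["share_a"], 3))
```

Keeping the preference share alongside the winner preserves how strong the agreement was: an 11-to-4 split and an 8-to-7 split both yield a majority winner, but they warrant different levels of confidence.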