BosonAI created a dataset of 1645 diverse test cases for evaluating Text-to-Speech models. The dataset focuses on six challenging scenarios: emotions, paralinguistics, foreign words, syntactic complexity, complex pronunciation, and questions. It was released in June 2025 to accompany a research paper.
Use Cases
- Benchmarking the emotional expressiveness of TTS models with the 'emotions' scenario
- Evaluating pronunciation accuracy on foreign words with the 'foreign words' scenario
- Testing robustness to complex syntactic structures with the 'syntactic complexity' scenario
- Assessing handling of non-standard text such as URLs and formulas with the 'complex pronunciation' scenario
- Measuring the naturalness of interrogative intonation with the 'questions' scenario
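Each use case above comes down to grouping per-case judgments by scenario and aggregating them. The following is a minimal sketch, assuming every evaluated case carries a scenario label and a numeric score. The field names `category` and `score`, the helper `score_by_scenario`, and the sample values are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def score_by_scenario(results, label_key="category", score_key="score"):
    """Return {scenario: (mean score, number of cases)} for a list of result dicts."""
    buckets = defaultdict(list)
    for row in results:
        buckets[row[label_key]].append(float(row[score_key]))
    return {name: (mean(scores), len(scores)) for name, scores in buckets.items()}


# Hypothetical judgments; real scores would come from whatever judge is used.
sample_results = [
    {"category": "emotions", "score": 1.0},
    {"category": "emotions", "score": 0.0},
    {"category": "questions", "score": 1.0},
]
summary = score_by_scenario(sample_results)
```

Reporting the case count next to each mean makes it easier to see when a scenario's score rests on only a few test cases.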
Strengths
- Contains 1645 test cases, providing a substantial evaluation corpus
- Focuses on six distinct and challenging scenarios for TTS evaluation
- Created to accompany a research paper, which gives the test cases a documented methodological basis
Limitations
- Column-level documentation is absent; field semantics must be inferred after download
- Only the row count is known; file formats and data structure are undocumented
- Data may reflect bias inherent to the specific test cases selected by the authors
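Because field semantics have to be inferred after download, a quick pass over the rows can show which columns exist and what they hold. The following is a rough sketch, assuming the data can be read as a sequence of dict-like rows. The helpers `describe_columns` and `load_rows` are hypothetical, and the repository ID and split name are not given here, so the caller must supply them.

```python
def describe_columns(rows, max_examples=2):
    """Map each field name to the Python type names seen and a few example values."""
    summary = {}
    for row in rows:
        for name, value in row.items():
            info = summary.setdefault(name, {"types": set(), "examples": []})
            info["types"].add(type(value).__name__)
            if len(info["examples"]) < max_examples:
                info["examples"].append(value)
    return summary


def load_rows(repo_id, split):
    """Assumption: the dataset is loadable with the Hugging Face `datasets` library."""
    from datasets import load_dataset  # third-party; imported only when called

    return list(load_dataset(repo_id, split=split))


# Toy rows standing in for downloaded data; the field names are illustrative only.
toy_rows = [
    {"text": "Is it open?", "category": "questions"},
    {"text": "Visit example.com", "category": "complex pronunciation"},
]
columns = describe_columns(toy_rows)
```

Running `describe_columns(load_rows(repo_id, split))` on the real data would give a first approximation of the missing column-level documentation.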
Provenance
- Source: BosonAI, via Hugging Face
- Collection Method: Created to accompany the paper 'EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge'
- Freshness: Last updated 2025-06-23 16:43:54