Evaluation samples spanning 7 stress-test categories, designed for calculating domain-wise Character Error Rate (CER) scores. Each sentence-language pair in the dataset is unique, which keeps metrics clean for Text-to-Speech (TTS) robustness testing.
Use Cases
- Calculate domain-wise Character Error Rate (CER) scores by comparing a transcript of the synthesized audio (for example, from a speech recognizer) against the 'text' column.
- Evaluate TTS model stability across 7 specific stress-test categories.
- Benchmark multilingual TTS performance using the unique sentence-language pairs.
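
A minimal sketch of the domain-wise CER calculation follows. Only the 'text' column is named in this description; the `category` and `transcript` keys, the category names, and the toy rows are assumptions made for illustration, and the transcripts stand in for whatever a speech recognizer returns for the synthesized audio. CER is taken here as character-level edit distance divided by reference length, pooled per category.

```python
from collections import defaultdict


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed over characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def domain_wise_cer(rows):
    """Return {category: CER}, pooling edits and reference lengths per category."""
    edits = defaultdict(int)
    chars = defaultdict(int)
    for row in rows:
        # 'text' comes from the dataset; 'category' and 'transcript' are assumed names.
        ref, hyp = row["text"], row["transcript"]
        edits[row["category"]] += edit_distance(ref, hyp)
        chars[row["category"]] += len(ref)
    return {cat: edits[cat] / chars[cat] for cat in edits if chars[cat] > 0}


# Toy rows; category names and transcripts are invented for illustration.
rows = [
    {"category": "numbers", "text": "call 911", "transcript": "call 911"},
    {"category": "numbers", "text": "room 42", "transcript": "room 4"},
    {"category": "names", "text": "hello Ann", "transcript": "hallo Ann"},
]
scores = domain_wise_cer(rows)
```

Pooling edits and character counts within a category weights long sentences more heavily than averaging per-sentence CER would; either convention is defensible as long as it is applied consistently across models. Text normalization (case, punctuation) before scoring is left out of this sketch.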
Strengths
- Includes 7 distinct stress-test categories for domain-wise performance analysis.
- Features a 'text' column containing the original input sentences for TTS synthesis.
- Ensures each sentence-language pair appears exactly once to prevent metric bias.
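
The uniqueness property can be checked with a short script before computing metrics. This is a sketch under assumptions: the `language` key is a hypothetical column name (only 'text' is named in this description), and the sample rows are invented.

```python
from collections import Counter


def duplicate_pairs(rows):
    """Return the (text, language) pairs that occur more than once."""
    counts = Counter((row["text"], row["language"]) for row in rows)
    return [pair for pair, n in counts.items() if n > 1]


# Invented rows; 'language' is an assumed column name.
sample = [
    {"text": "hello", "language": "en"},
    {"text": "hello", "language": "de"},  # same sentence, different language: allowed
    {"text": "hello", "language": "en"},  # repeated pair: flagged
]
dupes = duplicate_pairs(sample)
```

An empty result means no sentence-language pair is counted twice in any aggregate score.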