A 2026 benchmark from KRAFTON provides 6,000 prompt–text pairs for evaluating zero-shot text-to-speech models. It covers four acoustic regimes (Clean, Noisy, Wild, and Emotional), with prompts drawn from 12 different datasets. The benchmark is intended to assess model robustness under realistic and challenging recording conditions.
The full description and technical report are hosted externally; license details should be verified on the source page before use.