The Cantonese Audio TTS Dataset is a speech collection for text-to-speech applications. It combines alvanlii/cantonese-radio and alvanlii/cantonese-youtube with an additional dataset of equal size. The dataset creator, alvanlii, applied filtering and audio enhancement, including the removal of overlapping voices and music. It was last updated on 2026-04-05.
Speaker labels are not provided directly; the description suggests using a speaker embedding model such as NVIDIA's TitaNet to identify speakers.
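Because speaker labels are missing, one way to recover pseudo-speaker IDs is to group clips whose embeddings are close in cosine similarity. The following is a minimal sketch, not the dataset creator's method: it assumes each clip has already been converted to a fixed-length embedding vector by a model such as TitaNet, and the function names and the 0.7 threshold are hypothetical choices for illustration only.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_speakers(
    embeddings: Sequence[Sequence[float]],
    threshold: float = 0.7,  # hypothetical value; tune on a labeled sample
) -> List[int]:
    """Greedily assign each embedding to the most similar pseudo-speaker.

    A new pseudo-speaker is created when no existing one reaches the
    similarity threshold. Each pseudo-speaker is represented by the running
    sum of its members' embeddings (cosine similarity ignores scale, so the
    sum behaves like the mean here).
    """
    centroids: List[List[float]] = []
    labels: List[int] = []
    for emb in embeddings:
        best_label, best_sim = -1, threshold
        for label, centroid in enumerate(centroids):
            sim = cosine_similarity(emb, centroid)
            if sim >= best_sim:
                best_label, best_sim = label, sim
        if best_label == -1:
            # No pseudo-speaker is close enough: start a new one.
            centroids.append(list(emb))
            best_label = len(centroids) - 1
        else:
            # Fold this clip into the matched pseudo-speaker's centroid.
            centroids[best_label] = [
                c + e for c, e in zip(centroids[best_label], emb)
            ]
        labels.append(best_label)
    return labels
```

The greedy pass is order-dependent and only approximates a real clustering step; for a corpus of this kind, agglomerative or spectral clustering over the same embeddings would likely be more robust, and the threshold should be checked against a small hand-labeled subset.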