This Cantonese audio dataset features storyteller Zhang Yuekai narrating four classic literary works, including 'Romance of the Three Kingdoms' and 'Water Margin'. It is designed for training text-to-speech (TTS) and automatic speech recognition (ASR) models, as well as for linguistic and literary research. The dataset contains audio files and corresponding standardized text transcripts.
Audio files are split into two directories: /source holds the original recordings, and /opus holds the processed audio intended for training. The transcripts use full-width punctuation and characters only.
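As a rough illustration of working with this layout, the Python sketch below lists the processed audio and checks a transcript line for half-width characters. It rests on several assumptions not confirmed by the dataset description: that files under /opus carry a `.opus` extension, that transcripts are available as plain strings, and that "full-width only" means no narrow (ASCII-style) characters appear. The function names `list_opus_files` and `half_width_chars` are hypothetical helpers, not part of the dataset.

```python
from pathlib import Path
import unicodedata


def list_opus_files(root):
    """Return sorted paths of audio files under <root>/opus.

    Assumption: processed files use the .opus extension.
    """
    return sorted(Path(root, "opus").rglob("*.opus"))


def half_width_chars(text):
    """Return the set of non-whitespace characters that are not full-width.

    Uses Unicode East Asian Width: "Na" (narrow) covers ASCII letters,
    digits, and punctuation; "H" covers half-width forms.
    """
    return {
        ch
        for ch in text
        if not ch.isspace()
        and unicodedata.east_asian_width(ch) in ("Na", "H")
    }


if __name__ == "__main__":
    # Hypothetical transcript lines, used only to exercise the check.
    print(half_width_chars("曹操，字孟德。"))
    print(half_width_chars("曹操,字孟德."))
```

An empty set from `half_width_chars` suggests the line follows the full-width convention; any returned characters point to text that may need normalizing before it is paired with audio for training.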