This Taiwanese Hokkien (Min Nan) speech dataset was generated with the CosyVoice3 model from 722 seed utterances and 32,506 Common Voice samples. It includes audio files, the corresponding text, and speaker metadata. It was created by lianghsun and last updated on March 19, 2026.
Use Cases
- Training Taiwanese Hokkien speech synthesis models on the paired audio and text.
- Evaluating TTS model performance on a Hokkien corpus using the provided audio samples.
- Studying speaker characteristics and emotion in synthetic speech using the speaker_id and emotion metadata.
- Developing multilingual speech systems that incorporate Taiwanese Hokkien, using the domain and accent classifications.
Strengths
- Includes 722 seed utterances from a specific TAT source.
- Incorporates 32,506 cleaned samples from the Common Voice corpus.
- Audio files have a stated sample rate of 22,050 Hz.
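
The stated sample rate can be spot-checked after download. The sketch below is a minimal example using only the Python standard library; it assumes the audio is stored as (or has been decoded to) PCM WAV, which this card does not confirm. The helper names `wav_sample_rate`, `has_expected_rate`, and `make_silent_wav` are hypothetical, and the in-memory silent clip merely stands in for a real file from the dataset.

```python
import io
import wave

EXPECTED_RATE = 22050  # sample rate stated in this card, in Hz


def wav_sample_rate(source):
    """Return the sample rate (Hz) of a WAV file path or binary file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def has_expected_rate(source, expected=EXPECTED_RATE):
    """True if the WAV header reports the expected sample rate."""
    return wav_sample_rate(source) == expected


def make_silent_wav(rate, seconds=0.1):
    """Build an in-memory mono 16-bit silent WAV (a stand-in for a dataset file)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Replace the stand-in clip with a path to a downloaded file (assumed WAV).
    clip = make_silent_wav(EXPECTED_RATE)
    print("matches stated rate:", has_expected_rate(clip))
```

Files in other containers (for example MP3 or FLAC) would need a different reader; the check itself stays the same.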
Limitations
- The total row count is not stated, which makes it harder to assess suitability before download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Data may reflect geographic or source bias inherent to the seed text and Common Voice sources.
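
Because column-level documentation is absent, one way to infer field semantics is to summarize the observed types, example values, and missing counts per field once rows are loaded. The following is a rough sketch, assuming the rows are available as plain Python dictionaries; `summarize_fields` is a hypothetical helper, and the sample rows use field names mentioned in this card with placeholder values that are not taken from the dataset.

```python
def summarize_fields(rows, max_examples=3):
    """Map each field name to its observed types, a few example values, and a missing count."""
    rows = list(rows)
    fields = {}
    for row in rows:
        for name, value in row.items():
            info = fields.setdefault(
                name, {"types": set(), "examples": [], "present": 0}
            )
            if value is None:
                continue  # None counts as missing
            info["present"] += 1
            info["types"].add(type(value).__name__)
            if value not in info["examples"] and len(info["examples"]) < max_examples:
                info["examples"].append(value)
    for info in fields.values():
        # Rows that lack the key, or hold None for it, are both counted as missing.
        info["missing"] = len(rows) - info.pop("present")
    return fields


if __name__ == "__main__":
    # Placeholder rows for illustration only; real values must come from the download.
    sample_rows = [
        {"text": "placeholder sentence 1", "speaker_id": "spk_a", "emotion": "neutral"},
        {"text": "placeholder sentence 2", "speaker_id": "spk_b", "emotion": None},
        {"text": "placeholder sentence 3", "speaker_id": "spk_a"},
    ]
    for name, info in summarize_fields(sample_rows).items():
        print(name, sorted(info["types"]), info["examples"], "missing:", info["missing"])
```

A summary like this only suggests what each field holds; categorical fields such as emotion or accent would still need their label sets confirmed against the source.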
Provenance
- Source: lianghsun
- Collection Method: Batch-generated by the CosyVoice3 (Fun-CosyVoice3-0.5B) model.
- Freshness: Last updated 2026-03-19 02:40:34; freshness should be verified.
- Geography: Taiwan