This dataset is a manually curated Wu dialect speech benchmark spanning multiple speech processing tasks. Its ASR component contains 9.75 hours of audio covering Shanghainese, Suzhounese, and Mandarin code-mixed speech. The benchmark was created by ASLP-lab and last updated in February 2026.
Use Cases
- Wu dialect automatic speech recognition (ASR) on Shanghainese and Suzhounese audio
- Wu-to-Mandarin automatic speech translation (AST), using the benchmark's speech translation task
- Speaker attribute analysis, using the benchmark's speaker attribute evaluation
- Speech emotion recognition, using the benchmark's emotion recognition evaluation
- Wu dialect text-to-speech (TTS) and instruct TTS, using the benchmark's TTS tasks
Strengths
- First publicly available, manually curated benchmark for Wu dialect speech processing
- ASR component includes 9.75 hours of audio
- Benchmark covers multiple tasks: ASR, AST, speaker attributes, emotion recognition, TTS, and instruct TTS
Limitations
- The dataset description provides limited metadata; data quality can only be assessed by manual inspection after download
- No column-level documentation is provided; field semantics must be inferred from the data after download
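Because field semantics have to be inferred after download, a quick per-field summary is one reasonable first step. The following is a minimal sketch, assuming the metadata has already been loaded as a list of dictionaries; the field names (`audio_path`, `text`, `duration`, `emotion`) and the sample values are hypothetical and do not come from the dataset itself.

```python
def summarize_fields(records, max_examples=2):
    """Summarize each field: observed value types, missing count, example values."""
    records = list(records)

    # Collect field names in first-seen order across all records.
    names = []
    for record in records:
        for name in record:
            if name not in names:
                names.append(name)

    summary = {}
    for name in names:
        # A field counts as missing when it is absent or set to None.
        values = [record.get(name) for record in records]
        present = [value for value in values if value is not None]

        # Keep a few distinct example values for manual inspection.
        examples = []
        for value in present:
            if value not in examples:
                examples.append(value)
            if len(examples) == max_examples:
                break

        summary[name] = {
            "types": sorted({type(value).__name__ for value in present}),
            "missing": len(values) - len(present),
            "examples": examples,
        }
    return summary


# Hypothetical records standing in for the downloaded metadata.
sample = [
    {"audio_path": "a.wav", "text": "<transcript 1>", "duration": 3.2, "emotion": None},
    {"audio_path": "b.wav", "text": "<transcript 2>", "duration": 4},
]

summary = summarize_fields(sample)
for field, info in summary.items():
    print(field, info)
```

Fields with mixed types or a high missing count are good candidates for closer manual review before they are used in training or evaluation.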
Provenance
- Source: ASLP-lab
- Collection Method: Manually curated benchmark
- Freshness: Last updated 2026-02-08 17:11:54; freshness should be verified
- Geography: Wu dialect regions, likely including Shanghai and Suzhou