A Chinese Mandarin speech corpus featuring recordings from 400 speakers representing various accent regions across China. The audio was captured in quiet indoor settings using high-fidelity microphones and is provided at a 16kHz sampling rate with manual transcriptions.
Use Cases
- Train Mandarin automatic speech recognition (ASR) models using the 16kHz audio recordings and their associated manual transcriptions
- Analyze phonetic variations across different Chinese regions using the 400-speaker accent-diverse dataset
- Benchmark speech-to-text output against the provided manual transcriptions, which are verified at over 95% accuracy
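For the benchmarking use case, Mandarin ASR output is commonly scored with character error rate (CER) rather than word error rate, since written Chinese has no whitespace word boundaries. The following is a minimal sketch, not part of the dataset: the function names and the sample sentence pairs are hypothetical, and it assumes references and hypotheses are plain strings with punctuation already normalized.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(pairs) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = 0
    total_chars = 0
    for ref, hyp in pairs:
        # Drop whitespace so spacing differences are not counted as errors.
        ref = "".join(ref.split())
        hyp = "".join(hyp.split())
        total_edits += edit_distance(ref, hyp)
        total_chars += len(ref)
    if total_chars == 0:
        raise ValueError("no reference characters to score against")
    return total_edits / total_chars


# Hypothetical (reference transcription, ASR hypothesis) pairs for illustration.
sample_pairs = [
    ("今天天气很好", "今天天气很好"),
    ("我喜欢学习中文", "我喜欢学中文"),
]
cer = character_error_rate(sample_pairs)
```

Because the manual transcriptions are themselves verified at over 95% accuracy rather than 100%, a measured CER includes some disagreement caused by reference errors, so very small differences between systems should be read with caution.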
Strengths
- 400 speakers from diverse accent areas across China
- Manual transcription accuracy verified at over 95%
- Audio recordings downsampled to a 16kHz sampling rate
- Recorded in quiet indoor environments using high-fidelity microphones
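Before training, it can be useful to confirm that every file really is at the stated 16kHz rate. The sketch below uses Python's standard `wave` module and assumes the audio is distributed as PCM WAV files, which this description does not confirm; the function name and constant are hypothetical.

```python
import wave

EXPECTED_RATE_HZ = 16000  # sampling rate stated in the dataset description


def sample_rate_matches(path: str, expected_hz: int = EXPECTED_RATE_HZ) -> bool:
    """Return True if the WAV file at `path` has the expected sampling rate."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() == expected_hz
```

If the corpus ships in another container (for example FLAC), a different reader would be needed, but the same check applies.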