Cleaned and denoised audio-text pairs for the Mooré language (ISO 639-3: mos), drawn from publicly available sources. This unified corpus is curated for low-resource speech tasks, including text-to-speech (TTS) and automatic speech recognition (ASR).
Use Cases
- Train automatic speech recognition (ASR) models using the audio and its aligned text transcriptions
- Develop text-to-speech (TTS) synthesis engines for the Mooré language using the denoised audio samples
- Perform phonetic and prosodic analysis of the Mooré language using the unified speech and text data
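For the ASR and TTS use cases, a common first step is to drop unusable pairs and hold out a validation set. The following is a minimal sketch of that step, not this corpus's actual loader: the `Pair` record, its `audio_path` and `text` fields, the `split_pairs` helper, and the file names are all hypothetical, since the corpus schema is not described here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    audio_path: str  # hypothetical field name, not the corpus schema
    text: str        # hypothetical field name, not the corpus schema


def split_pairs(pairs, val_fraction=0.1, seed=0):
    """Drop pairs with empty transcriptions, then shuffle and split reproducibly."""
    usable = [p for p in pairs if p.text.strip()]
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(usable)
    # Hold out at least one pair whenever any usable pair exists.
    n_val = max(1, int(len(usable) * val_fraction)) if usable else 0
    return usable[n_val:], usable[:n_val]


# Placeholder records; the transcriptions are dummy strings, not Mooré text.
pairs = [Pair(f"clip_{i:03d}.wav", f"transcript {i}") for i in range(20)]
pairs.append(Pair("clip_020.wav", "   "))  # blank transcription, filtered out
train, val = split_pairs(pairs)
```

Seeding the shuffle keeps the train and validation sets stable across runs, which matters when comparing models trained on a small corpus.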
Strengths
- Aligned audio and text pairs for the Mooré language (ISO 639-3: mos)
- Cleaned and denoised audio files optimized for high-fidelity speech synthesis
- Unified corpus structure derived from multiple publicly available sources