UrduMegaSpeech-1M is a large-scale Urdu-English parallel speech corpus containing over one million audio-text samples. Created by humair025, it pairs high-quality audio recordings with Urdu transcriptions and English source text for tasks such as automatic speech recognition and speech translation.
Use Cases
- Train an automatic speech recognition model using the Urdu transcriptions paired with audio recordings.
- Develop a speech translation system by leveraging the parallel Urdu transcriptions and English source text.
- Build a text-to-speech model utilizing the high-quality audio recordings and corresponding Urdu transcriptions.
- Analyze speech data quality by employing the provided quality metrics for each audio-text sample.
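The use cases above differ mainly in which fields of a sample act as model input and which act as the target. The following is a minimal sketch of that mapping, not the dataset's documented schema: the field names `audio`, `urdu_text`, and `english_text` are hypothetical placeholders and should be replaced with the actual column names.

```python
from typing import Any, Dict, Tuple

# Hypothetical field names (assumptions); check the real schema before use.
AUDIO_KEY = "audio"
URDU_KEY = "urdu_text"
ENGLISH_KEY = "english_text"


def to_asr_pair(sample: Dict[str, Any]) -> Tuple[Any, str]:
    """Speech recognition: audio is the input, Urdu transcription the target."""
    return sample[AUDIO_KEY], sample[URDU_KEY]


def to_speech_translation_pair(sample: Dict[str, Any]) -> Tuple[Any, str]:
    """Speech translation: Urdu audio is the input, English text the target."""
    return sample[AUDIO_KEY], sample[ENGLISH_KEY]


def to_tts_pair(sample: Dict[str, Any]) -> Tuple[str, Any]:
    """Text-to-speech: Urdu transcription is the input, audio the target."""
    return sample[URDU_KEY], sample[AUDIO_KEY]


# A made-up record used only to illustrate the mapping.
example = {
    AUDIO_KEY: "clip_0001.wav",
    URDU_KEY: "<Urdu transcription>",
    ENGLISH_KEY: "<English source text>",
}
```

Keeping the field names in constants means only three lines need to change once the real column names are known.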
Strengths
- Contains over one million audio-text samples, providing substantial scale for model training.
- Includes parallel Urdu transcriptions and English source text, enabling multilingual speech tasks.
- Provides quality metrics for each individual sample, aiding in data filtering and analysis.
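Per-sample quality metrics make it possible to keep only the cleaner part of the corpus before training. A minimal sketch of such a filter follows, assuming a hypothetical numeric field named `quality_score` where higher means better; the actual metric names and scales are not stated here.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical metric name (assumption); the real metric fields may differ.
QUALITY_KEY = "quality_score"


def filter_by_quality(
    samples: Iterable[Dict[str, Any]], min_score: float
) -> List[Dict[str, Any]]:
    """Keep samples whose quality score is at least min_score.

    Samples that lack the metric are dropped rather than assumed to be good.
    """
    kept = []
    for sample in samples:
        score = sample.get(QUALITY_KEY)
        if score is not None and score >= min_score:
            kept.append(sample)
    return kept


# Made-up records used only to illustrate the filter.
records = [
    {"id": "a", QUALITY_KEY: 0.95},
    {"id": "b", QUALITY_KEY: 0.40},
    {"id": "c"},  # no metric present
    {"id": "d", QUALITY_KEY: 0.80},
]
clean = filter_by_quality(records, min_score=0.8)
kept_ids = [r["id"] for r in clean]
```

The threshold of 0.8 is arbitrary; in practice it would be chosen by inspecting the distribution of the metric across the corpus.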
Limitations
- Audio characteristics, speaker demographics, and recording conditions are not documented.
- The dataset's geographic origin and temporal coverage are unknown, which may limit contextual analysis.
Provenance
- Source: Hugging Face
- Collection Method: High-quality audio recordings paired with transcriptions and source text; method details unspecified.
- Time Range: Not specified
- Freshness: Last updated on December 2, 2025.
- Geography: Not specified