A benchmark of approximately 6.52 hours of human-annotated broadcast speech, comprising 8085 utterances across 13 distinct domains. It is designed for evaluating automatic speech recognition (ASR) performance in challenging conditions. The dataset was created by SUST-CSE-Speech and last updated on March 9, 2024.
Use Cases
- Benchmarking ASR model performance on multi-domain broadcast speech.
- Evaluating ASR robustness to spontaneous speech.
- Testing ASR systems under domain shift across the 13 distinct domains.
- Evaluating ASR performance on multi-talker speech.
- Testing ASR systems on code-switched speech.
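The benchmarking and domain-shift use cases above are commonly scored with word error rate (WER), although the card itself does not name a metric. The following is a minimal sketch that aggregates WER per domain. The function names and the `(domain, reference, hypothesis)` tuple layout are assumptions made for illustration, since the dataset's actual column names are not documented here.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two lists of tokens."""
    # prev[j] holds the distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def wer_by_domain(samples):
    """Aggregate WER per domain.

    `samples` is an iterable of (domain, reference, hypothesis) tuples.
    This layout is hypothetical, not the dataset's documented schema.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for domain, reference, hypothesis in samples:
        ref = reference.split()
        hyp = hypothesis.split()
        errors[domain] += edit_distance(ref, hyp)
        words[domain] += len(ref)
    # Domains with no reference words are skipped to avoid dividing by zero.
    return {d: errors[d] / words[d] for d in words if words[d] > 0}
```

Summing errors and reference words per domain before dividing weights long utterances more heavily than averaging per-utterance WER would. Whitespace tokenization is also an assumption and may need replacing for the target language.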
Strengths
- Contains 8085 human-annotated utterances.
- Includes approximately 6.52 hours of annotated audio.
- Covers 13 distinct domains, which likely provides variety in speech content.
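The two headline figures together imply a mean utterance length, which helps when judging whether the clips suit a given model's input window. A small sketch of that arithmetic, using only the numbers stated in this card (the constant and function names are illustrative):

```python
TOTAL_HOURS = 6.52       # approximate duration stated in the card
NUM_UTTERANCES = 8085    # utterance count stated in the card


def mean_utterance_seconds(total_hours, num_utterances):
    """Average utterance length in seconds implied by the totals."""
    return total_hours * 3600 / num_utterances


print(f"{mean_utterance_seconds(TOTAL_HOURS, NUM_UTTERANCES):.2f} s")  # prints "2.90 s"
```

This is only an average derived from an approximate total; the actual distribution of utterance lengths is not documented.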
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not documented; the stated 8085 utterances should be confirmed against the downloaded files before assessing suitability.
- Last updated 2024-03-09 20:24:47; check for a newer revision before use.
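Because the schema and row count are undocumented, a quick sanity check after download can compare the files against the figures quoted in this card. The sketch below assumes the per-utterance durations have already been extracted into a list of seconds. How to obtain them depends on the undocumented schema, and both the function name and the 0.05-hour tolerance are arbitrary choices made for illustration.

```python
EXPECTED_UTTERANCES = 8085  # utterance count stated in the card
EXPECTED_HOURS = 6.52       # approximate duration stated in the card


def check_against_card(durations_seconds, hours_tolerance=0.05):
    """Compare observed totals with the card's stated figures.

    `durations_seconds` is a list of per-utterance durations in seconds.
    `hours_tolerance` is an assumed margin, since the card gives only an
    approximate duration.
    """
    count = len(durations_seconds)
    hours = sum(durations_seconds) / 3600
    return {
        "count": count,
        "hours": hours,
        "count_matches": count == EXPECTED_UTTERANCES,
        "hours_matches": abs(hours - EXPECTED_HOURS) <= hours_tolerance,
    }
```

A mismatch does not necessarily indicate a problem with the data; it may simply mean that the card's figures describe a different split or revision.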
Provenance
- Source: SUST-CSE-Speech
- Collection method: Human-annotated broadcast speech
- Freshness: 2024-03-09
- Geography: Bangladesh