A subset of the LibriSpeech corpus, published on Hugging Face by sahara22 and last updated on 2026-03-22. The dataset likely contains audio files and corresponding transcriptions for training and evaluating automatic speech recognition (ASR) models. The available metadata does not state its size, format, or license.
Use Cases
- Train an acoustic model for English speech recognition (inferred from domain, verify after download)
- Benchmark ASR system performance on clean read speech (inferred from domain, verify after download)
- Fine-tune a pre-trained speech model on read English speech (inferred from domain, verify after download)
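For the benchmarking use case, ASR output is commonly scored with word error rate (WER): the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The sketch below is a minimal illustration, not a method documented for this dataset. The function name `word_error_rate` is hypothetical, and the normalization (lowercasing and whitespace splitting) is an assumption; a real evaluation would usually also strip punctuation and apply the same normalization to both sides.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words.

    Assumption: text is normalized by lowercasing and splitting on
    whitespace only. Adjust this to match your evaluation protocol.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Calling `word_error_rate("the cat sat", "the bat sat")` counts one substitution against three reference words. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.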
Strengths
- Published on Hugging Face, which makes the dataset easy to access and integrate.
- Last updated on 2026-03-22 11:38:44, which suggests recent maintenance.
Limitations
- Metadata is minimal; actual content, size, and license require verification after download.
- Row count, file formats, and column-level documentation are unknown, which makes suitability hard to assess before download.
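Because the file formats and size are undocumented, a first check after download is to tally what actually arrived. The sketch below assumes the dataset has already been saved to a local directory; the function name `summarize_download` is hypothetical, and nothing here reflects a documented layout for this dataset.

```python
from collections import Counter
from pathlib import Path


def summarize_download(root):
    """Count files per extension and total bytes under a local directory.

    Assumption: `root` is wherever you saved the dataset locally.
    Extensions are lowercased so that .FLAC and .flac are counted together.
    """
    counts = Counter()
    total_bytes = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(no extension)"] += 1
            total_bytes += path.stat().st_size
    return dict(counts), total_bytes
```

The returned counts show which audio and transcript formats are present, and the byte total gives a rough size figure to record alongside the license once that has been checked.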
Provenance
- Source
  - Hugging Face user sahara22
- Collection Method
  - Subset derived from the LibriSpeech corpus.
- Time Range
  - Not specified
- Freshness
  - Last updated 2026-03-22 11:38:44.
- Geography
  - Not specified