DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Librispeech Long: Extended English Speech Audio for ASR | DataSalon

Speech & Audio

Name: Librispeech Long: Extended English Speech Audio for ASR
Creator: distil-whisper
Published: 2023-11-02T14:22:51
Keywords: English, Audio, Long Form, Speech Recognition

Available on 1 platform

Description

Librispeech Long is a speech audio dataset derived from the LibriSpeech corpus, likely containing longer-form English audio segments. The dataset was created by distil-whisper and was last updated on Hugging Face in November 2023. Its specific size, format, and license details are not provided in the available metadata.

Use Cases

Fine-tuning speech recognition models on longer audio segments.
Benchmarking ASR system performance on extended speech.
Training or evaluating models for audiobook or podcast transcription.
Developing speaker diarization or segmentation algorithms on continuous speech.
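
For the benchmarking use case, ASR output is usually scored by word error rate (WER): the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that metric, not a method documented for this dataset. The function names are illustrative, and the text normalization (lowercasing and whitespace splitting) is an assumption; real evaluations typically apply a fuller normalizer before scoring.

```python
from typing import List


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance between two word sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion: ref word missing from hyp
                    curr[j - 1] + 1,     # insertion: extra word in hyp
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return word_edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

On long-form audio the transcripts are long, so the quadratic cost of the edit-distance table is worth keeping in mind; this version stores only two rows at a time to limit memory use.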

Strengths

Based on the established LibriSpeech corpus, a widely used benchmark in speech recognition.
Created by distil-whisper, suggesting a focus on efficient, distilled model applications.

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
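
Because the columns are undocumented, a quick first step after download is to tally which fields appear and what value types they hold. The sketch below does this for any iterable of dict-like records. The helper name `summarize_fields` and the sample field names (`audio`, `text`) are hypothetical placeholders, not the dataset's confirmed schema.

```python
from collections import Counter, defaultdict
from itertools import islice
from typing import Any, Dict, Iterable, Mapping


def summarize_fields(
    records: Iterable[Mapping[str, Any]], limit: int = 100
) -> Dict[str, Dict[str, int]]:
    """Count the Python type names seen per field in the first `limit` records."""
    seen: Dict[str, Counter] = defaultdict(Counter)
    for record in islice(records, limit):
        for field, value in record.items():
            seen[field][type(value).__name__] += 1
    return {field: dict(counts) for field, counts in seen.items()}


# Hypothetical sample records; the real field names must be read from the data.
sample = [
    {"audio": {"array": [0.0, 0.1], "sampling_rate": 16000}, "text": "hello world"},
    {"audio": {"array": [0.2], "sampling_rate": 16000}, "text": None},
]
summary = summarize_fields(sample)
print(summary)
```

The helper reads at most `limit` records, so it can also be pointed at a lazily loaded source (for example, a dataset opened in streaming mode with the Hugging Face `datasets` library) without fetching everything first. It does not recover the total row count, which would still need a full pass or the platform's own metadata.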

Provenance

Source: distil-whisper
Freshness: Last updated 2023-11-02 14:22:54; confirm that the data is still current before relying on it.

License is unknown; users must verify permissions before use.

Quality Score

D28 (breakdown categories: Description, Source, Reputation)

Community

11.5K downloads

4 likes

0 views

Dataset Info

Author: distil-whisper
Created: Nov 2, 2023
Updated: Nov 2, 2023
Last synced: May 25, 2026
