LibriSpeech Train Clean 100: English Speech Audio for ASR | DataSalon

Speech & Audio

Available on 1 platform

Description

A Kaggle-hosted dataset titled 'Librispeech_train-clean-100', likely containing audio files for automatic speech recognition (ASR) model training. The title suggests it is a subset of the LibriSpeech corpus, comprising 100 hours of 'clean' speech. Specific details on size, format, and provenance require verification after download.

Use Cases

Train a speech recognition model on clean, read English speech (inferred from domain, verify after download)
Benchmark ASR system performance on a standard corpus subset (inferred from domain, verify after download)
Fine-tune pre-trained models for specific acoustic conditions (inferred from domain, verify after download)
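
For the training use case, audio files have to be paired with their transcripts before any model can consume them. The sketch below assumes the download follows the layout of the original LibriSpeech distribution: one `*.trans.txt` file per chapter directory, each line holding an utterance ID followed by the transcript, with the audio stored alongside as `<utterance-id>.flac`. This listing does not confirm that the Kaggle copy keeps that layout, and the function names are hypothetical, so verify after download.

```python
from pathlib import Path


def parse_transcript_line(line):
    """Split 'UTT-ID SOME TEXT' into (utt_id, text)."""
    utt_id, _, text = line.strip().partition(" ")
    return utt_id, text


def iter_pairs(root, audio_ext=".flac"):
    """Yield (audio_path, transcript) for each utterance whose audio file exists.

    Assumes LibriSpeech-style '*.trans.txt' files; adjust the glob pattern and
    audio_ext if the downloaded copy is organized differently.
    """
    for trans_file in sorted(Path(root).rglob("*.trans.txt")):
        with open(trans_file, encoding="utf-8") as fh:
            for line in fh:
                if not line.strip():
                    continue  # skip blank lines
                utt_id, text = parse_transcript_line(line)
                audio_path = trans_file.parent / f"{utt_id}{audio_ext}"
                if audio_path.exists():
                    yield audio_path, text
```

Utterances listed in a transcript file but missing their audio are skipped silently here; counting them separately is a reasonable extra check when validating the download.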

Strengths

Published on Kaggle, a major platform for data science resources.
Title references the well-known LibriSpeech corpus, suggesting a standard benchmark origin.

Limitations

Metadata is minimal; actual content requires verification after download.
Column-level documentation is absent; field semantics must be inferred after download.
Data may reflect bias inherent to its source corpus (e.g., speaker demographics, recording conditions).
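
Because the metadata is minimal, a first verification step after download is simply to inventory what the archive contains. The following is a minimal sketch using only the Python standard library; the function name and the example path are hypothetical, and nothing here assumes a particular file format.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root):
    """Return (file counts per extension, total size in bytes) under root."""
    counts = Counter()
    total_bytes = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "<none>".
            counts[path.suffix.lower() or "<none>"] += 1
            total_bytes += path.stat().st_size
    return counts, total_bytes


# Example call (the path is a placeholder for wherever the archive was unpacked):
# counts, total_bytes = inventory_by_extension("path/to/unpacked/dataset")
# print(counts.most_common(), total_bytes)
```

A large count of one audio extension plus a smaller number of text files would be consistent with a LibriSpeech-derived subset, but it does not by itself confirm provenance or license.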

Provenance

Source: Likely derived from the LibriSpeech corpus.
Collection Method: Method of gathering is unknown.
Time Range: Temporal coverage is unknown.
Freshness: Last update date is unknown; freshness unverified.
Geography: Spatial coverage is unknown.

License is unknown; verify terms before use.

Audio, Machine Learning, Audio Data, Training Data, Speech Recognition

Quality Score

D16

Dataset Info

Last synced: Apr 9, 2026
