LibriSpeech Train Clean 100: 100 Hours of Read English Speech | DataSalon

Speech & Audio

Available on 1 platform

Description

LibriSpeech is a widely used corpus for automatic speech recognition (ASR) research. This subset, 'train_clean_100', likely contains about 100 hours of read English speech with corresponding transcripts. It is published on Kaggle, but the listing provides no detailed metadata about its exact composition or origin.
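As a rough illustration of how audio and transcripts pair up, the sketch below parses transcript lines in the layout used by the original LibriSpeech release, where each line of a `.trans.txt` file reads `<speaker>-<chapter>-<utterance> TEXT`. It is an assumption that this Kaggle upload preserves that layout; the function names `parse_transcript_lines` and `split_utterance_id` are hypothetical helpers, and the sample line in the comment is made up for illustration.

```python
def parse_transcript_lines(lines):
    """Map utterance ID -> transcript text.

    Assumes each non-empty line looks like '<utterance-id> <TEXT>',
    as in the original LibriSpeech '.trans.txt' files. Verify this
    against the downloaded files before relying on it.
    """
    transcripts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split only at the first space: the ID comes first, the rest is text.
        utt_id, _, text = line.partition(" ")
        transcripts[utt_id] = text
    return transcripts


def split_utterance_id(utt_id):
    """Split '<speaker>-<chapter>-<utterance>' into its three parts."""
    speaker, chapter, utterance = utt_id.split("-")
    return speaker, chapter, utterance


# Illustrative (made-up) input line, not taken from the dataset:
#   parse_transcript_lines(["100-200-0000 HELLO WORLD"])
```

If the upload instead ships a CSV or another manifest format, the same ID-to-text mapping can usually be rebuilt from its columns once their meaning has been checked.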

Use Cases

Train an acoustic model for English speech recognition (inferred from domain, verify after download)
Benchmark ASR system performance on clean, read speech (inferred from domain, verify after download)
Fine-tune a pre-trained model on a specific speech corpus (inferred from domain, verify after download)

Strengths

Published on Kaggle, a major platform for data science resources.
The title suggests a focus on 'clean' speech, which may indicate lower noise levels for model training.

Limitations

Metadata is minimal; actual content, size, and structure require verification after download.
Column-level documentation is absent; field semantics must be inferred after download.
License, author, and last update date are unknown, which may affect usage rights and freshness assessment.
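Because content, size, and structure all need checking after download, a small inventory script is a reasonable first step. The sketch below counts files by extension and sums their sizes under a local directory; the function name `summarize_directory` and the example path in the comment are hypothetical, and nothing here assumes a particular layout for the upload.

```python
import os
from collections import Counter


def summarize_directory(root):
    """Return (file counts by extension, total size in bytes) under root."""
    counts = Counter()
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Normalize extensions to lowercase; label files without one.
            ext = os.path.splitext(name)[1].lower() or "<none>"
            counts[ext] += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
    return dict(counts), total_bytes


# Example call (hypothetical path to the extracted download):
#   counts, total = summarize_directory("./librispeech_train_clean_100")
#   print(counts, f"{total / 1e9:.2f} GB")
```

A mix of audio files and text files would be consistent with the description above, but the actual extensions and any license or README file should be read directly rather than assumed.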

Provenance

Source: LibriSpeech corpus (inferred from title).
Collection Method: Likely derived from audiobook recordings (inferred from corpus nature).
Time Range: Not specified.
Freshness: Last update date is unknown; freshness unverified.
Geography: Not specified.

Audio, Machine Learning, Audio Data, Clean Speech, Speech Recognition

Quality Score

D15