Librispeech-Childrenized-5000: A Children's Speech Adaptation
Available on 1 platform
Description
Librispeech-Childrenized-5000 is a speech dataset derived from the LibriSpeech corpus and published on Kaggle. The title suggests about 5,000 audio samples that were modified or selected to reflect children's speech characteristics, but neither the count nor the method is confirmed. The available metadata does not state the specific source, collection method, or temporal details.
Use Cases
Training automatic speech recognition (ASR) systems on children's speech (inferred from domain, verify after download)
Benchmarking model performance on age-specific acoustic features (inferred from domain, verify after download)
Studying phonetic and prosodic variations in child versus adult speech (inferred from domain, verify after download)
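Each use case above is inferred and depends on what the audio actually contains, so a first check after download is to read basic properties from a few files. The sketch below assumes the files are uncompressed WAV, which the metadata does not confirm. The original LibriSpeech corpus is distributed as 16 kHz FLAC, and FLAC would need a third-party reader instead of the standard-library wave module. The function name summarize_wav is hypothetical.

```python
import wave
from pathlib import Path


def summarize_wav(path):
    """Return sample rate, channel count, sample width, and duration of a WAV file.

    Assumption: the file is PCM WAV. The dataset's real audio format is
    undocumented, so this may need replacing with a FLAC or MP3 reader.
    """
    with wave.open(str(Path(path)), "rb") as handle:
        sample_rate = handle.getframerate()
        frame_count = handle.getnframes()
        return {
            "sample_rate_hz": sample_rate,
            "channels": handle.getnchannels(),
            "sample_width_bytes": handle.getsampwidth(),
            "duration_seconds": frame_count / sample_rate,
        }
```

Running this over a handful of files shows whether sample rate and clip length are consistent across the set, which matters before using the data for ASR training or benchmarking.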
Strengths
Published on Kaggle, a major platform for data science resources.
Title suggests a specific scale of 5,000 samples.
Limitations
Metadata is minimal; actual content requires verification after download.
Column-level documentation is absent; field semantics must be inferred after download.
Data may carry over biases from the original LibriSpeech corpus, and any processing applied to create this adaptation may add its own.
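Because the listing documents neither file layout nor field semantics, a simple inventory of the downloaded directory is a reasonable first verification step. The sketch below counts files by extension, totals the likely audio files (for comparison against the 5,000 suggested by the title), and lists candidate metadata files. The extension sets and the function name inventory_dataset are assumptions for illustration, not documented properties of the dataset.

```python
from collections import Counter
from pathlib import Path

# Assumed extensions; the dataset's actual formats are undocumented.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".ogg", ".m4a"}
METADATA_EXTENSIONS = {".csv", ".tsv", ".json", ".txt"}


def inventory_dataset(root):
    """Count files by extension under root and list candidate metadata files."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()]
    by_extension = Counter(p.suffix.lower() for p in files)
    audio_count = sum(
        count for ext, count in by_extension.items() if ext in AUDIO_EXTENSIONS
    )
    metadata_files = sorted(
        p.relative_to(root).as_posix()
        for p in files
        if p.suffix.lower() in METADATA_EXTENSIONS
    )
    return {
        "by_extension": dict(by_extension),
        "audio_count": audio_count,
        "metadata_files": metadata_files,
    }
```

If audio_count differs from 5,000, or no metadata file is present, that is worth noting before building anything on the dataset.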
Provenance
Source
Derived from the LibriSpeech corpus.
Collection Method
Likely involves processing or filtering the original audio to simulate or select children's speech.
Time Range
Temporal coverage is unknown, both for this adaptation and for the LibriSpeech recordings it draws on.
Freshness
Last update date is unknown; freshness unverified.
Geography
Spatial coverage is unknown.
License
License restrictions are unknown; verify before commercial use.