Librispeech Synth 300h is a synthetic speech dataset derived from the LibriSpeech corpus, containing up to 300 hours of audio. It is hosted on Kaggle and appears to be a processed version for speech synthesis tasks, likely containing audio generated by text-to-speech systems. The specific creator, generation method, and exact audio characteristics require verification after download.
Use Cases
- Training a multi-speaker text-to-speech model (inferred from domain, verify after download)
- Benchmarking speech synthesis quality on a known corpus (inferred from domain, verify after download)
- Studying speaker characteristics in synthetic audio (inferred from domain, verify after download)
Strengths
- Published on Kaggle, a platform with established data sharing infrastructure.
- The title suggests roughly 300 hours of audio, which would be a substantial amount for model training if confirmed.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Data may reflect bias inherent to its source corpus and synthesis method.
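Since the contents have to be verified after download, a first check can be sketched as follows. This is a minimal sketch, not a documented procedure for this dataset: the function name `summarize_audio_dir` is hypothetical, the archive layout and audio format are unknown, and durations are only summed for files that Python's standard `wave` module can read (PCM WAV). If the audio ships as FLAC or MP3, the extension counts will still show that, but the hour total will stay at zero.

```python
import wave
from collections import Counter
from pathlib import Path


def summarize_audio_dir(root):
    """Count files by extension and sum the duration of readable WAV files.

    `root` is the directory where the archive was extracted (an assumption;
    the real layout of the dataset is not documented).
    """
    root = Path(root)
    ext_counts = Counter()
    wav_seconds = 0.0
    unreadable = []

    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower()
        ext_counts[ext] += 1
        if ext != ".wav":
            continue
        try:
            with wave.open(str(path), "rb") as wf:
                rate = wf.getframerate()
                if rate:
                    wav_seconds += wf.getnframes() / rate
        except (wave.Error, EOFError):
            # Not PCM WAV, or truncated; record it for manual inspection.
            unreadable.append(path)

    return {
        "ext_counts": dict(ext_counts),
        "wav_hours": wav_seconds / 3600.0,
        "unreadable": unreadable,
    }
```

Calling `summarize_audio_dir` on the extracted directory and comparing `wav_hours` against the 300 hours implied by the title is a quick way to confirm or refute the advertised size. The extension counts also show whether transcripts or metadata files ship alongside the audio.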
Provenance
- Source: Derived from the LibriSpeech corpus.
- Collection Method: Synthetic generation, likely via text-to-speech systems.