LibriSpeech is a corpus of approximately 1000 hours of read English speech, sampled at 16 kHz. It was prepared by Vassil Panayotov with the assistance of Daniel Povey, and the data is derived from audiobooks of the LibriVox project.
Audio files are stored in FLAC (.flac) format and must be decoded into float32 arrays before use; a mapping function based on the soundfile library is provided for this purpose. The version hosted on Hugging Face is a dummy dataset, so it may be a subset or placeholder rather than the full corpus.