The People's Speech Dataset contains over 30,000 hours of transcribed English speech, licensed for academic and commercial use under CC-BY-SA and CC-BY 4.0. It was created by MLCommons to train speech-to-text systems and features a diverse set of speakers.
The dataset uses multiple Creative Commons licenses: CC-BY and CC-BY-SA, in versions 2.0, 2.5, 3.0, and 4.0. Users must comply with the terms of the specific license that applies to the material they use; for CC-BY-SA material, this includes share-alike requirements.
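Because the license varies across the material, it may help to separate share-alike material from the rest before use. The following is a minimal sketch, not part of the dataset's tooling: the `license` field name and the identifier format (for example `CC-BY-SA-3.0`) are assumptions made for illustration, and the dataset's actual metadata may label licenses differently.

```python
import re

# Assumed identifier format, e.g. "CC-BY-4.0" or "CC-BY-SA-2.5".
# This format is hypothetical; adapt it to the real metadata.
_LICENSE_RE = re.compile(r"^CC-BY(-SA)?-(2\.0|2\.5|3\.0|4\.0)$", re.IGNORECASE)


def requires_share_alike(license_id: str) -> bool:
    """Return True if the identifier names a CC-BY-SA license."""
    match = _LICENSE_RE.match(license_id.strip())
    if match is None:
        raise ValueError(f"Unrecognized license identifier: {license_id!r}")
    # Group 1 is "-SA" when present, None for plain CC-BY.
    return match.group(1) is not None


def split_by_share_alike(records):
    """Split records (dicts with a hypothetical 'license' key) into
    (attribution_only, share_alike) lists, preserving order."""
    attribution_only, share_alike = [], []
    for record in records:
        if requires_share_alike(record["license"]):
            share_alike.append(record)
        else:
            attribution_only.append(record)
    return attribution_only, share_alike
```

Raising an error on unrecognized identifiers, rather than guessing, keeps unknown license terms from being silently treated as attribution-only.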