Available on 1 platform
This dataset is Kazakhstan's largest open Kazakh speech corpus, an extended version of the ISSAI KSC2 dataset from Nazarbayev University. It contains approximately 1,110 hours of audio across 595,690 recordings, enhanced with punctuation and with word-level timestamps produced by Montreal Forced Aligner (MFA) alignment. The dataset is published and maintained by Jeti Labs.
The listed dataset size is 52.9 GB, so allow for sufficient bandwidth and disk space before downloading.
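To put these figures in perspective, the short sketch below derives two rough per-unit numbers from the values quoted above. It is back-of-the-envelope arithmetic only: the hour count is approximate, and treating GB as decimal (1 GB = 1000 MB) is an assumption, since the listing does not say which unit convention it uses.

```python
# Figures quoted in the dataset description.
TOTAL_HOURS = 1_110        # approximate total audio duration
NUM_RECORDINGS = 595_690   # number of recordings
SIZE_GB = 52.9             # assumption: decimal GB (1 GB = 1000 MB)

# Mean recording length in seconds.
avg_clip_seconds = TOTAL_HOURS * 3600 / NUM_RECORDINGS

# Approximate storage needed per hour of audio, in MB.
mb_per_hour = SIZE_GB * 1000 / TOTAL_HOURS

print(f"average recording length: {avg_clip_seconds:.1f} s")
print(f"storage per hour of audio: {mb_per_hour:.1f} MB")
```

Under these assumptions the average recording is roughly 6.7 seconds long, and one hour of audio takes about 48 MB, which can help when estimating how much space a subset of the corpus would need.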