KSC2 Structured is an enhanced version of the Kazakh Speech Corpus 2 (KSC2) that pairs audio recordings with transcripts in which punctuation and capitalization have been restored. Developed by Inflexion Lab, it addresses a limitation of the original KSC2, whose transcripts are plain lowercase text. The dataset page was last updated in March 2026.
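To make the difference between the two transcript styles concrete, the following is a minimal sketch of mapping a structured transcript back to a plain lowercase form. It assumes the original KSC2 style amounts to lowercasing and dropping punctuation, which the description implies but does not spell out. The function name `to_plain_transcript` and the sample sentence are hypothetical illustrations, not taken from the dataset.

```python
import string


def to_plain_transcript(text: str) -> str:
    """Approximate a plain lowercase transcript from a structured one.

    Assumption: "plain" means lowercase with punctuation removed. The exact
    normalization used by the original corpus is not stated in the listing.
    """
    # Remove ASCII punctuation plus a few typographic marks common in Cyrillic text.
    # Note that this also removes hyphens inside words.
    drop = string.punctuation + "«»…"
    table = str.maketrans("", "", drop)
    # Lowercase, then collapse any whitespace runs left by removed marks.
    return " ".join(text.translate(table).lower().split())


# Hypothetical structured transcript (illustrative only, not from the dataset).
structured = "Сәлем, әлем! Қалайсың?"
plain = to_plain_transcript(structured)
print(plain)
```

A comparison like this can help when aligning structured transcripts with the original ones, for example to check that only punctuation and casing differ between the two versions.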
The full dataset description, including detailed specifications, is hosted externally on the Hugging Face dataset page. That description gives the license as CC BY 4.0, although the metadata for this listing records the license as unknown.