Available on 1 platform
Nogai Unified Corpus v1 is the largest publicly available curated text dataset for Nogai, a critically endangered Turkic language. It was built by ansarzeinulla to address the data scarcity that has historically made Nogai a 'zero-resource' language, so that machine learning tasks become feasible for it. The dataset was last updated on June 6, 2026.
License information is unknown; users should verify terms of use before downloading.