KazakhOCR is a synthetic benchmark dataset for evaluating multimodal models on Optical Character Recognition (OCR) for the Kazakh language. It contains text in Arabic, Cyrillic, and Latin scripts. The dataset was curated by Henry Gagnier, Sophie Gagnier, and Ashwin Kirubakaran and is licensed under MIT.