GlotOCR-bench is a dataset of 16,375 images covering 158 writing systems and over 2,000 languages, designed to evaluate fundamental OCR capabilities. It was created by cis-lmu and was last updated on the platform on April 15, 2026.
The dataset is released under the GlotOCR Open Evaluation License v1.0; the full terms are in the LICENSE file. The metadata is licensed separately under CC0-1.0.