Available on 1 platform
A continually updated corpus of historical Hán-Nôm manuscript pages annotated for OCR training. Each page includes a high-resolution image, per-character bounding boxes with corrected text labels, and candidate alternates from two upstream OCR engines. The dataset is maintained by Aerbote88 and was last updated on May 11, 2026.
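As a rough illustration of the per-page structure described above, one page could be modeled along the following lines. This is only a sketch: the class names, field names, bounding-box layout, and the engine keys `engine_a` and `engine_b` are all hypothetical, since the dataset's actual schema and the identity of the two OCR engines are not documented here. The sample characters are invented for the example and are not taken from the dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CharAnnotation:
    # Hypothetical layout: (x, y, width, height) in image pixels.
    bbox: Tuple[int, int, int, int]
    # Human-corrected character label.
    label: str
    # Candidate alternates per upstream engine, best guess first.
    # The keys are placeholders, not real engine names.
    candidates: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class PageRecord:
    # Path to the high-resolution page image (hypothetical field).
    image_path: str
    chars: List[CharAnnotation] = field(default_factory=list)


def top1_accuracy(page: PageRecord, engine: str) -> float:
    """Fraction of characters whose first candidate from `engine`
    equals the corrected label. Returns 0.0 for an empty page."""
    if not page.chars:
        return 0.0
    hits = sum(
        1 for c in page.chars
        if c.candidates.get(engine, [])[:1] == [c.label]
    )
    return hits / len(page.chars)


# Invented sample data, for illustration only.
sample = PageRecord(
    image_path="page_0001.png",
    chars=[
        CharAnnotation((10, 10, 40, 40), "南",
                       {"engine_a": ["南", "商"], "engine_b": ["商"]}),
        CharAnnotation((10, 60, 40, 40), "國",
                       {"engine_a": ["圖"], "engine_b": ["國", "圖"]}),
    ],
)
```

Keeping the engine candidates next to the corrected label makes it straightforward to measure how often each engine's first guess agrees with the correction, for example with `top1_accuracy(sample, "engine_a")`.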
The license is unknown, so usage restrictions may apply.