Available on 1 platform
DocAtlas provides high-fidelity multilingual OCR data covering 82 languages and 10 writing systems. The dataset is built through model-free differential rendering and provides precise structural annotations in a unified DocTag format. It was created by ahmedheakl and last updated in May 2026.
The license is unknown; verify the licensing terms before use.