Available on 1 platform
OCR-Markdown-Dense-200x is a synthetic dataset for optical character recognition (OCR) on dense documents. It was created by prithivMLmods and last updated on April 21, 2026. It focuses on extracting structured HTML or Markdown representations from densely packed document pages.
The license is unknown; verify the terms of use before using the dataset.