This benchmark dataset pairs document images from OmniDocBench v1.5 with prompts containing PaddleOCR-extracted markdown text. The task is to correct OCR errors and restore proper formatting, using the source image as reference. The dataset was created by 'andynoodles' and was last updated on 2026-04-07.
The license is unknown, which may restrict usage.
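To make the image-plus-prompt pairing described above concrete, here is a minimal Python sketch of how one item might be represented and turned into a correction prompt. The class name, field names, prompt wording, and file path are all hypothetical illustrations; the dataset's actual schema and prompt text are not stated here.

```python
from dataclasses import dataclass


@dataclass
class OcrCorrectionSample:
    """One benchmark item. Field names are hypothetical, not the dataset's schema."""
    image_path: str    # document image (sourced from OmniDocBench v1.5)
    ocr_markdown: str  # PaddleOCR-extracted markdown, possibly containing errors


def build_prompt(sample: OcrCorrectionSample) -> str:
    """Wrap the OCR markdown in an instruction. The wording is an assumption."""
    return (
        "Correct the OCR errors in the markdown below and restore its "
        "formatting, using the attached document image as reference.\n\n"
        f"{sample.ocr_markdown}"
    )


# Made-up example values, for illustration only.
sample = OcrCorrectionSample(
    image_path="page_0001.png",  # placeholder path
    ocr_markdown="# Intro ducti0n\nThe qu1ck brown fox.",
)
prompt = build_prompt(sample)
print(prompt)
```

In this sketch the image travels alongside the prompt rather than inside it, since the corrected markdown is meant to be checked against the source image, not against the OCR text alone.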