A preview version of the olmOCR benchmark, last updated in March 2026 by AllenAI. It builds on the original benchmark's 1,403 PDF files and 7,010 manually created unit test cases by adding new synthetic tests, which are designed to evaluate optical character recognition (OCR) performance in challenging scenarios.
License is unknown; restrictions should be verified before use.