Epstein Files OCR — Datasets 1–8 (Early Release) contains page-level OCR output in Markdown format from a public release of documents related to the Jeffrey Epstein case. The dataset is designed for question answering, information retrieval, and text classification tasks. It was created by 'ishumilin' and last updated on March 17, 2026.
The dataset is marked as no longer maintained, and users are directed to a separate 'Complete OCR Dataset'.