Salesforce developed UniDoc-Bench in 2024 as a benchmark for multimodal retrieval-augmented generation (MM-RAG). It contains 1,700+ multimodal QA pairs derived from a corpus of 70,000 real-world PDF pages across eight domains. The data links evidence across text, tables, and figures to support complex document-based reasoning tasks.
The dataset is released under the CC BY-NC 4.0 license, which prohibits commercial use. It is associated with arXiv paper 2510.03663.