DocVQA consists of 10,000 to 100,000 document images paired with question-answer sets, formatted by lmms-lab in 2024. This version is derived from the original 2020 DocVQA research to facilitate standardized evaluation of Large Multi-modality Models (LMMs). It provides a structured framework for testing how models interpret text and layout within diverse document types.
This version is formatted specifically for the lmms-eval library; one-click evaluation may require the lmms-eval environment.
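To make "standardized evaluation" concrete, the sketch below scores predicted answers against reference answers using ANLS (Average Normalized Levenshtein Similarity), the metric commonly reported for DocVQA. The description above does not name a metric, so treat this as an assumption: the function names (`levenshtein`, `anls_score`, `mean_anls`) and the 0.5 threshold are illustrative choices, and the scoring that lmms-eval actually applies may differ.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def anls_score(
    prediction: str, references: Sequence[str], threshold: float = 0.5
) -> float:
    """Best similarity between one prediction and any accepted answer.

    Assumption: answers are compared case-insensitively after stripping
    whitespace, and a normalized distance at or above `threshold` scores 0.
    """
    pred = prediction.strip().lower()
    best = 0.0
    for ref in references:
        gold = ref.strip().lower()
        longest = max(len(pred), len(gold))
        distance = levenshtein(pred, gold) / longest if longest else 0.0
        similarity = 1.0 - distance if distance < threshold else 0.0
        best = max(best, similarity)
    return best


def mean_anls(
    predictions: Sequence[str], references: Sequence[Sequence[str]]
) -> float:
    """Average the per-question scores over the whole evaluation set."""
    scores = [anls_score(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores) if scores else 0.0
```

Each question can carry several accepted answers, so the per-question score takes the best match before averaging. The threshold keeps a prediction that is mostly wrong from earning partial credit.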