A dataset for visual question answering on documents, published on Hugging Face by HuggingFaceM4 and last updated on December 18, 2023. It likely contains document images paired with questions and answers, but its scale, columns, and content have not been verified and should be checked after download.
Use Cases
- Train a model to answer questions about text and layout in scanned documents (inferred from domain, verify after download)
- Benchmark visual question answering systems on structured document images (inferred from domain, verify after download)
- Fine-tune a multimodal LLM for tasks like form understanding or receipt parsing (inferred from domain, verify after download)
Strengths
- Published on Hugging Face by HuggingFaceM4
- Last updated 2023-12-18 17:30:35
Limitations
- Metadata is minimal; actual content requires verification after download
- Column-level documentation is absent; field semantics must be inferred after download
- Row count is unknown, so suitability for a given task cannot be assessed until after download
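Because the columns and row count are undocumented, a first pass after download could simply tally which fields appear and what types they hold. The following is a minimal sketch under stated assumptions, not a documented workflow for this dataset: `summarize_records` is a hypothetical helper, and the field names in `sample` (`question`, `answers`, `image`) are placeholders chosen for illustration, since the real schema is unknown.

```python
from collections import Counter
from itertools import islice
from typing import Any, Dict, Iterable


def summarize_records(records: Iterable[Dict[str, Any]], limit: int = 100) -> Dict[str, Any]:
    """Inspect up to `limit` rows and tally the Python type seen in each column."""
    column_types: Dict[str, Counter] = {}
    rows_seen = 0
    for row in islice(records, limit):
        rows_seen += 1
        for name, value in row.items():
            # Count how often each type name occurs per column.
            column_types.setdefault(name, Counter())[type(value).__name__] += 1
    return {
        "rows_inspected": rows_seen,
        "columns": {name: dict(counts) for name, counts in column_types.items()},
    }


# Stand-in rows with ASSUMED field names; replace with rows from the real download.
sample = [
    {"question": "What is the total?", "answers": ["42.00"], "image": b"..."},
    {"question": "Who signed the form?", "answers": ["J. Doe"], "image": b"..."},
]

summary = summarize_records(sample)
print(summary["rows_inspected"])
print(sorted(summary["columns"]))
```

The helper accepts any iterable of dict-like rows, so it should also work on rows streamed with the `datasets` library's `load_dataset(..., streaming=True)`. The repository ID and split names are not given here and would need to be looked up on the hub. A column that reports more than one type in the tally is a hint that its semantics deserve a closer look.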
Provenance
- Source: HuggingFaceM4
- Freshness: Last updated 2023-12-18 17:30:35