SlideVQA: Multi-Image Document Question Answering on 10K+ Slide Decks

Name: SlideVQA: Multi-Image Document Question Answering on 10K+ Slide Decks
Creator: NTT-hil-insight
Published: 2025-03-26T10:10:16
Keywords: Size category: 10K<n<100K; Libraries: polars, dask, mlcroissant, datasets; Task categories: question answering, visual question answering; Language: en; Modalities: text, image; arXiv: 2301.04883; Format: Parquet; Region: us

Description

SlideVQA is a document visual question answering dataset containing between 10,000 and 100,000 records. Published by NTT-hil-insight in March 2025, it is based on research from 2023 (arXiv 2301.04883). The dataset focuses on multi-image reasoning: a model must select the specific evidence slides in a deck that are needed to answer a natural language question.

Use Cases

Evidence image selection: Identifying specific slide indices that contain the information needed for a query
Multi-modal answer generation: Producing text answers by synthesizing visual data from multiple slides
Slide-level retrieval: Ranking images within a deck to find the most relevant visual context
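
As a rough illustration of slide-level retrieval and evidence selection, the sketch below ranks the slides of one deck by cosine similarity to a query and keeps the top-k indices as candidate evidence. The embedding vectors are toy values invented for this example, and `rank_slides` is a hypothetical helper, not part of the dataset or of any library; a real pipeline would obtain the vectors from an image or text encoder of its own choosing.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_slides(query_vec, slide_vecs, top_k=2):
    """Return the indices of the top_k slides, most similar first."""
    scored = sorted(
        enumerate(slide_vecs),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [idx for idx, _ in scored[:top_k]]


# Toy embeddings (assumed values for illustration only): one query, four slides.
query = [1.0, 0.0, 1.0]
slides = [
    [0.0, 1.0, 0.0],  # slide 0: unrelated to the query
    [1.0, 0.0, 0.9],  # slide 1: close match
    [0.5, 0.5, 0.5],  # slide 2: partial match
    [0.9, 0.1, 1.0],  # slide 3: close match
]

evidence = rank_slides(query, slides, top_k=2)
print(evidence)  # indices of the two slides selected as evidence
```

The selected indices can then be passed, together with the corresponding slide images, to whatever model produces the final text answer.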

Strengths

10,000 to 100,000 record scale
Multi-image context requiring evidence retrieval
Methodology documented in the accompanying paper (arXiv 2301.04883)

Limitations

Domain-specific focus on presentation slides; results may not generalize to other document types, such as forms
High memory requirements for processing multiple images per query
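
To get a feel for the memory cost of multi-image queries, the following back-of-the-envelope sketch estimates the size of one deck's decoded images held in memory at once. The slide count and resolution are assumptions chosen for illustration, not figures taken from the dataset, and the estimate ignores model activations and framework overhead.

```python
def decoded_image_bytes(width, height, channels=3, bytes_per_channel=1):
    """Bytes needed for one uncompressed image held in memory."""
    return width * height * channels * bytes_per_channel


def deck_memory_mib(num_slides, width, height, channels=3, bytes_per_channel=1):
    """Total MiB needed to hold every slide of one deck in memory at once."""
    total = num_slides * decoded_image_bytes(
        width, height, channels, bytes_per_channel
    )
    return total / (1024 ** 2)


# Hypothetical deck: 20 slides at 1024x768 pixels, RGB.
as_uint8 = deck_memory_mib(20, 1024, 768)                         # 8-bit pixels
as_float32 = deck_memory_mib(20, 1024, 768, bytes_per_channel=4)  # float32 tensors

print(f"uint8: {as_uint8:.1f} MiB, float32: {as_float32:.1f} MiB")
```

Under these assumptions a single query already needs tens of MiB for raw pixels alone, and the figure scales linearly with batch size.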

Provenance

Source: NTT-hil-insight (arXiv 2301.04883)
Collection Method: Annotated slide decks for evidence selection and question answering
Freshness: Last updated March 2025; based on 2023 research.

For the specific evaluation metrics used to score evidence selection, consult the arXiv paper 2301.04883.
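
For quick sanity checks before consulting the paper, one common way to score evidence selection is set-based precision, recall, and F1 over slide indices. The sketch below shows that generic formulation; it is an assumption for illustration and may differ from the exact metric definition used in the paper. The function name `evidence_prf` is hypothetical.

```python
def evidence_prf(predicted, gold):
    """Set-based precision, recall, and F1 between predicted and gold slide indices."""
    pred_set, gold_set = set(predicted), set(gold)
    if not pred_set and not gold_set:
        # Nothing to select and nothing selected: treat as a perfect match.
        return 1.0, 1.0, 1.0
    true_pos = len(pred_set & gold_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: the model picked slides 1, 3, 4; annotators marked 3, 4, 7, 9.
p, r, f1 = evidence_prf(predicted=[1, 3, 4], gold=[3, 4, 7, 9])
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Averaging these per-question scores over a split gives a single number that can be tracked across experiments.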

Quality Score

Overall: D35
Description: 36
Source: 36
Reputation: 39
Access: 22

Community

853 downloads
15 likes
0 views

Dataset Info

Author: NTT-hil-insight
Created: Mar 26, 2025
Updated: Mar 27, 2025
Last synced: May 20, 2026
