This dataset contains the 4,241 multimodal science questions that make up the test split of the ScienceQA benchmark. Each entry is an image-based multiple-choice question accompanied by a hint, a lecture, and a step-by-step explanation, and the questions span natural, social, and language science subjects.
Use Cases
- Benchmark multimodal models by predicting the 'answer' index from 'image' and 'question' features
- Train models to generate reasoning chains using the 'hint', 'lecture', and 'explanation' text fields
- Analyze model accuracy across scientific disciplines using the 'subject' and 'topic' metadata
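The per-discipline analysis above could be sketched roughly as follows. This is a minimal illustration, not a reference evaluation script: it assumes each record is a dict with a 'subject' string and an integer 'answer' index, and that model predictions arrive as a parallel list of predicted indices. The helper name `accuracy_by_subject` and the toy records are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict


def accuracy_by_subject(records, predictions):
    """Return {subject: accuracy} for records paired with predicted answer indices."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record, predicted in zip(records, predictions):
        subject = record["subject"]
        total[subject] += 1
        # A prediction counts as correct when it matches the gold 'answer' index.
        if predicted == record["answer"]:
            correct[subject] += 1
    return {subject: correct[subject] / total[subject] for subject in total}


# Toy records invented for illustration; real rows carry more columns.
records = [
    {"subject": "natural science", "answer": 0},
    {"subject": "natural science", "answer": 2},
    {"subject": "language science", "answer": 1},
]
predictions = [0, 1, 1]

print(accuracy_by_subject(records, predictions))
```

The same grouping would work with 'topic' or 'category' as the key if a finer breakdown is wanted.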
Strengths
- Includes 4,241 test instances with image-based scientific questions
- Features structured 'choices', 'answer', 'hint', 'lecture', and 'explanation' columns
- Categorized by 'subject', 'topic', and 'category' for granular performance analysis
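As a rough illustration of how the structured columns might be combined into a text prompt, the sketch below assumes 'choices' is a list of strings, 'hint' is a possibly empty string, and 'question' is a string. These types are assumptions rather than something this card confirms, and the helper `format_prompt` and the sample record are hypothetical.

```python
def format_prompt(record):
    """Build a lettered multiple-choice prompt from one record (assumed dict layout)."""
    lines = []
    # Include the hint only when it is present and non-empty.
    if record.get("hint"):
        lines.append(f"Context: {record['hint']}")
    lines.append(f"Question: {record['question']}")
    # Label choices (A), (B), (C), ... in their original order.
    for index, choice in enumerate(record["choices"]):
        lines.append(f"({chr(ord('A') + index)}) {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


# Sample record invented for illustration.
sample = {
    "question": "Which of these is a solid?",
    "choices": ["water vapor", "a brick"],
    "hint": "",
}

print(format_prompt(sample))
```

The 'image' column is deliberately left out here; a multimodal model would receive it alongside the text through whatever input interface that model provides.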