DARE (Diverse Visual Question Answering with Robustness Evaluation) is a multiple-choice VQA benchmark created by cambridgeltl. It evaluates vision-language models (VLMs) across five diverse categories and includes four robustness-oriented evaluations, which vary the prompts, the answer options, the output format, and the number of correct answers. The validation split contains images, questions, answer options, and correct answers.
The full dataset description is hosted externally. The validation split is available, but other splits may not be published.
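Because the number of correct answers can vary, a single-label accuracy check is not enough to score such items. The following is a minimal sketch of one possible scoring scheme, assuming exact set match between predicted and correct option labels. The `Item` class, its field names, the option labels, and the exact-match rule are all hypothetical illustrations, not the benchmark's published schema or official metric; images are omitted for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical record layout; field names are assumptions."""
    question: str
    options: tuple[str, ...]   # answer option texts, labeled A, B, C, ...
    correct: frozenset[str]    # labels of all correct options


def exact_match(predicted: set[str], item: Item) -> bool:
    # Assumed rule: a prediction counts only if it selects every
    # correct label and no incorrect one.
    return frozenset(predicted) == item.correct


def accuracy(predictions: list[set[str]], items: list[Item]) -> float:
    # Fraction of items whose predicted label set matches exactly.
    if not items:
        return 0.0
    hits = sum(exact_match(p, it) for p, it in zip(predictions, items))
    return hits / len(items)


# Made-up example data, for illustration only.
items = [
    Item("How many cats are visible?", ("1", "2", "3"), frozenset({"B"})),
    Item("Which objects are red?", ("car", "tree", "door"), frozenset({"A", "C"})),
]
predictions = [{"B"}, {"A"}]  # the second prediction misses label "C"

score = accuracy(predictions, items)
```

Under this assumed rule, a partially correct selection on a multi-answer item earns no credit; a partial-credit metric such as per-option F1 would be a reasonable alternative if that behavior is too strict.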