Distract-Bench: 506-Sample Multimodal Reasoning Benchmark with Distractions

Name: Distract-Bench: 506-Sample Multimodal Reasoning Benchmark with Distractions
Creator: EthanSun
Published: 2026-06-07T23:44:28
Keywords: Vision Language, Benchmark, Distraction Robustness, Computer Vision, Multimodal Reasoning, Multimodal

Description

A 506-sample multimodal reasoning benchmark created by EthanSun and last updated on 2026-06-08. It evaluates vision-language models on their ability to remain faithful to task-relevant visual evidence when visually salient but answer-irrelevant distractions are added. Each sample includes original and distracted images, a question, answer choices, the correct answer, and the distraction specification.

Use Cases

Benchmarking model faithfulness to visual evidence by comparing answers on the original and distracted images.
Studying distraction robustness in multimodal AI using the per-sample distraction specifications.
Developing training or evaluation protocols for vision-language models from the structured question-answer pairs.
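
The comparison in the first use case can be sketched as a paired evaluation: score the same model on the original and the distracted image of each sample, then report both accuracies and how often a correct answer turns incorrect once the distraction is added. The record layout and field names below (answer, pred_original, pred_distracted) are hypothetical, since the dataset's column names are not documented here. This is a minimal sketch under those assumptions, not the benchmark's official metric.

```python
from dataclasses import dataclass


@dataclass
class PairedResult:
    """One sample's outcome. Field names are hypothetical, not the dataset's schema."""
    answer: str            # ground-truth choice label, e.g. "B"
    pred_original: str     # model prediction on the original image
    pred_distracted: str   # model prediction on the distracted image


def distraction_report(results):
    """Return accuracy on each image variant and the correct-to-wrong flip rate."""
    n = len(results)
    if n == 0:
        raise ValueError("no results to score")
    correct_orig = sum(r.pred_original == r.answer for r in results)
    correct_dist = sum(r.pred_distracted == r.answer for r in results)
    # A "flip" is a sample answered correctly without the distraction
    # but incorrectly once the distraction is added.
    flips = sum(
        r.pred_original == r.answer and r.pred_distracted != r.answer
        for r in results
    )
    return {
        "acc_original": correct_orig / n,
        "acc_distracted": correct_dist / n,
        "accuracy_drop": (correct_orig - correct_dist) / n,
        # Flip rate is measured among samples that were correct on the original.
        "flip_rate": flips / correct_orig if correct_orig else 0.0,
    }


# Toy predictions for illustration only; not real benchmark results.
demo = [
    PairedResult("A", "A", "A"),
    PairedResult("B", "B", "C"),
    PairedResult("C", "D", "D"),
    PairedResult("D", "D", "D"),
]
report = distraction_report(demo)
print(report)
```

Reporting the flip rate alongside the raw accuracy drop separates samples the model never solved from samples it solved only until the distraction appeared, which is the behavior the benchmark description targets.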

Strengths

Contains 506 benchmark samples, providing a defined scale for evaluation.
Each sample includes both original and distracted images, enabling direct comparison.
Provides a distraction specification for each sample, detailing the construction method.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is not confirmed in the listing metadata, so the stated 506 samples should be verified after download.
Description metadata is limited; actual data quality requires manual inspection after download.
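
Because column-level documentation is absent, a first pass after download might simply tabulate which fields appear and what value types they hold. The sketch below assumes the samples have already been loaded into a list of dictionaries; the keys in the toy rows are placeholders, not the dataset's real column names.

```python
from collections import defaultdict


def summarize_fields(rows):
    """Map each field name to the value types seen and a count of missing values."""
    seen_types = defaultdict(set)
    non_null = defaultdict(int)
    all_keys = set()
    for row in rows:
        for key, value in row.items():
            all_keys.add(key)
            if value is not None:
                seen_types[key].add(type(value).__name__)
                non_null[key] += 1
    total = len(rows)
    # A field counts as missing in a row if the key is absent or its value is None.
    return {
        key: {"types": sorted(seen_types[key]), "missing": total - non_null[key]}
        for key in sorted(all_keys)
    }


# Placeholder rows for illustration; real field names must be read from the data.
rows = [
    {"question": "How many cups?", "choices": ["1", "2"], "answer": "B", "distraction": None},
    {"question": "What color?", "choices": ["red", "blue"], "answer": "A"},
]
summary = summarize_fields(rows)
for name, info in summary.items():
    print(name, info)
```

Checking len(rows) on the real data against the stated 506 samples is a cheap additional sanity check.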

Provenance

Source: Hugging Face
Freshness: Last updated 2026-06-08 01:18:03; freshness should be verified.

License restrictions are unknown and should be verified before use.

Related Topics

Multimodal, Vision Language, Benchmark, Distraction Robustness, Computer Vision, Multimodal Reasoning

Quality Score

Overall: D37
Description: 42
Source: 36
Reputation: 39
Access: 26

Community

Downloads: 16
Likes: 1
Views: 0

Dataset Info

Author: EthanSun
Created: Jun 7, 2026
Updated: Jun 8, 2026
Last synced: Jun 15, 2026
