Multimodal & LLM

Vlmsareblind: Vision-Language Models are Blind Benchmark

by XAI · Updated 1y ago

Available on 1 platform

Description

Seven visual reasoning tasks built from geometric primitives, designed to test the basic visual perception of vision-language models. Categories include line intersections, circle overlaps, and nested shapes, where models frequently fail even though humans solve the same tasks with ease.

Use Cases

Benchmark the spatial reasoning capabilities of multimodal models using the geometric task labels and ground truth coordinates.
Identify systematic perception errors in vision encoders by comparing model outputs against the 'blindness' task categories.
Develop improved vision-language alignment techniques by training models to recognize basic topological relationships like 'inside' or 'intersecting'.
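
The benchmarking and error-analysis use cases above amount to scoring model answers per task category and looking for categories where accuracy collapses. The following is a minimal sketch of that scoring step, not the dataset authors' evaluation code. The record keys (`task`, `groundtruth`, `prediction`) and the toy records are assumptions made for illustration; the actual column names should be checked against the dataset before use.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and trim whitespace and a trailing period so '3.' matches '3'."""
    return answer.strip().strip(".").strip().lower()


def accuracy_by_task(records):
    """Return {task: fraction of correct answers} for an iterable of dicts.

    Each record is assumed to carry 'task', 'groundtruth', and 'prediction'
    keys. These names are hypothetical, not taken from the dataset schema.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        task = rec["task"]
        total[task] += 1
        if normalize(rec["prediction"]) == normalize(rec["groundtruth"]):
            correct[task] += 1
    return {task: correct[task] / total[task] for task in total}


# Toy records invented for illustration only; they are not real benchmark data.
records = [
    {"task": "line intersection", "groundtruth": "2", "prediction": "2"},
    {"task": "line intersection", "groundtruth": "1", "prediction": "2"},
    {"task": "nested squares", "groundtruth": "3", "prediction": " 3. "},
    {"task": "touching circles", "groundtruth": "Yes", "prediction": "no"},
]

scores = accuracy_by_task(records)

# List the weakest task categories first.
for task, acc in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{task}: {acc:.2f}")
```

Exact-match scoring after light normalization is a deliberately simple choice; free-form model output may need answer extraction before comparison.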

Strengths

Includes 7 distinct geometric task categories: line intersection, circle overlap, nested squares, counting, touching circles, overlapping circles, and line length.
Features 2D geometric renderings that are trivial for human vision but challenging for state-of-the-art multimodal models.
Provides a benchmark for evaluating topological and spatial relationship recognition independent of linguistic priors.

Format: Parquet
Size category: 1K < n < 10K
Task category: question answering
Language: en
Modalities: text, image
Libraries: polars, mlcroissant, datasets, pandas
arXiv: 2407.06581
DOI: 10.57967/hf/5669
Region: us
License: MIT

Quality Score

D27

Community

1.2K downloads

28 likes

0 views

Dataset Info

Author: XAI
Created: Jul 8, 2024
Updated: Nov 22, 2024
Last synced: Apr 28, 2026
