Name: EmbSpatial-Bench: 3,640 Egocentric Spatial QA Pairs for LVLM Evaluation
Creator: Phineas476
Published: 2024-06-22T06:20:05
Keywords: Spatial Reasoning, Vision Language, Benchmark, Multimodal

Description

A benchmark for evaluating embodied spatial understanding in Large Vision-Language Models, created by Phineas476 and last updated on June 23, 2024. It comprises 3,640 question-answer pairs automatically derived from embodied scenes, covering 294 object categories and 6 spatial relationships from an egocentric perspective. The associated EmbSpatial-SFT dataset provides instruction-tuning data for spatial tasks.

Use Cases

Benchmarking LVLM performance on embodied spatial understanding with the 3,640 QA pairs.
Training or fine-tuning models for egocentric spatial reasoning over the 6 defined relationships.
Analyzing model capabilities across the 294 covered object categories.
Developing instruction-following models for spatial tasks using the associated EmbSpatial-SFT data.
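
As a minimal sketch of the benchmarking use case, the snippet below scores predictions overall and per spatial relationship. The field names `relation` and `answer`, the relation labels, and the answer format are all assumptions made for illustration; the actual column names must be checked after download.

```python
from collections import defaultdict


def accuracy_by_relation(records, predictions):
    """Return (overall accuracy, {relation: accuracy}) for paired inputs.

    Assumes each record is a dict with hypothetical keys "relation" and
    "answer"; these are NOT confirmed column names of the dataset.
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for record, predicted in zip(records, predictions):
        relation = record["relation"]
        total[relation] += 1
        if predicted == record["answer"]:
            correct[relation] += 1

    per_relation = {rel: correct[rel] / total[rel] for rel in total}
    overall = sum(correct.values()) / sum(total.values()) if total else 0.0
    return overall, per_relation


# Toy data with placeholder relation labels and answer options.
sample_records = [
    {"relation": "left", "answer": "A"},
    {"relation": "left", "answer": "B"},
    {"relation": "above", "answer": "C"},
    {"relation": "above", "answer": "D"},
]
sample_predictions = ["A", "C", "C", "D"]

overall, per_relation = accuracy_by_relation(sample_records, sample_predictions)
print(overall, per_relation)
```

Reporting accuracy per relationship rather than a single number makes it easier to see whether a model fails on specific spatial relations.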

Strengths

Contains 3,640 QA pairs, providing a substantial evaluation set.
Covers 294 object categories, suggesting diversity in visual concepts.
Focuses on 6 specific spatial relationships, enabling targeted analysis.
Automatically derived from embodied scenes, which may help with scale and consistency.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
The benchmark's row count is known, but file formats and sample data are not documented.
Data may reflect bias inherent to the source embodied scenes and generation method.
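
Because field semantics must be inferred after download, a first pass over the records can at least list which fields exist and what value types they hold. The sketch below assumes the data has already been loaded into a list of dicts by whatever means fits the actual file format; the helper name `summarize_fields` and the example rows are hypothetical.

```python
def summarize_fields(records):
    """Map each field name to the sorted list of Python type names observed."""
    summary = {}
    for record in records:
        for name, value in record.items():
            summary.setdefault(name, set()).add(type(value).__name__)
    return {name: sorted(types) for name, types in summary.items()}


# Hypothetical rows; real field names and values are unknown until download.
example_rows = [
    {"question": "placeholder question", "answer": 0},
    {"question": "another placeholder", "answer": None},
]

field_summary = summarize_fields(example_rows)
print(field_summary)
```

Fields that show more than one type (for example a mix of integers and missing values) are good candidates for closer manual inspection.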

Provenance

Source: Hugging Face
Collection Method: Automatically derived from embodied scenes.
Time Range: not specified
Freshness: Last updated 2024-06-23 17:35:21; freshness should be verified.
Geography: not specified

License is unknown; terms of use must be verified before download.
