RoboPoint: 1.4 Million Image-QA Instances for Spatial Affordance Prediction

Name: RoboPoint: 1.4 Million Image-QA Instances for Spatial Affordance Prediction
Creator: wentao-yuan
Published: 2024-09-20T05:38:24
Keywords: arxiv:2406.10721, size_categories:1M<n<10M, library:webdataset, modality:text, library:mlcroissant, modality:image, WebDataset, library:datasets, region:us, license:apache-2.0

Description

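RoboPoint contains 1.432 million image-QA instances, released by wentao-yuan in 2024, for fine-tuning vision-language models (VLMs) on spatial affordance prediction. The collection combines 667K synthetic instances for object and free space referencing with 100K LVIS object detection samples and 150K GPT-generated instruction-following pairs. These named components sum to 917K; the remaining 515K instances fall in the fifth category, which this page does not itemize.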

Use Cases

Training models to identify navigable areas using the 320K free space reference instances
Enhancing spatial grounding in VLMs using the 347K synthetic object reference instances
Improving general visual instruction following via the 150K GPT-generated instances

Strengths

1.432 million total records across five distinct data categories
100,000 object detection instances sourced from the LVIS dataset
Apache 2.0 license allows for broad research and commercial application

Limitations

About 47% of the data (667K of 1.432M instances) is synthetic, which may cause a domain gap in real-world robotics
GPT-generated instruction-following instances may contain label noise or hallucinations
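
The counts quoted on this page can be cross-checked with a little arithmetic. The sketch below uses only the figures stated above (in thousands); the dictionary keys are descriptive labels chosen for illustration, not official subset names.

```python
# Instance counts in thousands, as quoted on this page.
named_counts = {
    "synthetic object reference": 347,
    "synthetic free space reference": 320,
    "LVIS object detection": 100,
    "GPT-generated instruction following": 150,
}
TOTAL = 1432  # 1.432 million instances overall

named_total = sum(named_counts.values())
unlisted = TOTAL - named_total  # instances in the category not itemized here

synthetic = (
    named_counts["synthetic object reference"]
    + named_counts["synthetic free space reference"]
)
synthetic_share = synthetic / TOTAL

print(f"named categories: {named_total}K")      # 917K
print(f"not itemized:     {unlisted}K")         # 515K
print(f"synthetic share:  {synthetic_share:.1%}")  # 46.6%
```

The four named categories account for 917K instances, so 515K belong to a category this page does not describe, and the synthetic share comes to about 46.6% of the total.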

Provenance

Source: wentao-yuan (arXiv:2406.10721)
Collection Method: synthetic data pipeline, LVIS extraction, and GPT-based instruction generation
Freshness: Last updated September 2024

Data is provided in WebDataset format (tar shards); the page tags list the webdataset, mlcroissant, and datasets libraries for loading it.
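
To illustrate what the WebDataset format implies for ingestion, here is a minimal sketch that reads a tar shard with the Python standard library only. It relies on the general WebDataset convention that files sharing a basename form one sample. The shard contents, the `jpg` and `json` field names, and the `question`/`answer` keys are hypothetical; this page does not state the actual RoboPoint shard layout. In practice the webdataset library handles this grouping.

```python
import io
import json
import tarfile


def iter_samples(fileobj):
    """Yield one dict per sample from a WebDataset-style tar shard.

    Files that share a basename (text before the first dot) form one
    sample; the remainder of the file name becomes the field key.
    """
    current_key, sample = None, {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            dirname, _, base = member.name.rpartition("/")
            stem, _, ext = base.partition(".")
            key = f"{dirname}/{stem}" if dirname else stem
            if key != current_key and sample:
                yield sample  # basename changed: previous sample is complete
                sample = {}
            current_key = key
            sample["__key__"] = key
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield sample


def make_demo_shard():
    """Build a tiny in-memory shard with made-up contents (illustration only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for i in range(2):
            qa = {"question": f"q{i}", "answer": f"a{i}"}  # hypothetical fields
            files = {
                f"{i:06d}.jpg": b"fake image bytes",
                f"{i:06d}.json": json.dumps(qa).encode(),
            }
            for name, data in files.items():
                info = tarfile.TarInfo(name)
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


samples = list(iter_samples(make_demo_shard()))
for s in samples:
    fields = sorted(k for k in s if k != "__key__")
    print(s["__key__"], fields, json.loads(s["json"]))
```

Each yielded dict holds raw bytes per field, so image decoding and JSON parsing are left to the caller; real shards may use different extensions or nesting.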

Quality Score

Overall: D36
Description: 39
Source: 36
Reputation: 39
Access: 22

Community

1.8K downloads

12 likes

0 views

Dataset Info

Author: wentao-yuan
Created: Sep 20, 2024
Updated: Sep 22, 2024
Last synced: Jun 4, 2026
