VLFeedback: 80,000 Multi-Modal Instructions with GPT-4V Preference Labels

Name: VLFeedback: 80,000 Multi-Modal Instructions with GPT-4V Preference Labels
Creator: MMInstruction
Published: 2023-11-08T15:46:04
Keywords: Size Categories: 10K<n<100K, Library: polars, Library: dask, Task Categories: Visual Question Answering, Modality: text, Library: mlcroissant, Modality: image, Library: datasets, Parquet, Region: US, arXiv: 2312.10665

Description

VLFeedback contains 80,000 multi-modal instructions and 320,000 model responses (four per instruction), annotated by GPT-4V for vision-language preference learning. Developed by MMInstruction in late 2023, the dataset aggregates instructions from diverse sources and pairs them with responses drawn from a pool of 12 different Large Vision-Language Models (LVLMs).

Use Cases

Training reward models for RLHF using the GPT-4V preference rankings
Benchmarking LVLM performance by comparing the four model responses provided for each instruction
Instruction fine-tuning of vision-language models to improve alignment with AI-generated preferences
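
As a rough illustration of the reward-model use case above, the sketch below turns the scored responses for one instruction into (chosen, rejected) pairs. The field names `text` and `score`, the helper `build_preference_pairs`, and the `min_gap` parameter are hypothetical and chosen for illustration; they are not the dataset's actual schema, which should be checked against the Parquet files.

```python
from itertools import combinations


def build_preference_pairs(instruction, responses, min_gap=0.0):
    """Convert scored responses for one instruction into preference pairs.

    `responses` is a list of dicts with hypothetical keys "text" and "score".
    Pairs whose score difference is at most `min_gap` are skipped as ties.
    """
    pairs = []
    for a, b in combinations(responses, 2):
        gap = a["score"] - b["score"]
        if abs(gap) <= min_gap:
            continue  # tie or near-tie: no usable preference signal
        chosen, rejected = (a, b) if gap > 0 else (b, a)
        pairs.append(
            {
                "prompt": instruction,
                "chosen": chosen["text"],
                "rejected": rejected["text"],
            }
        )
    return pairs


# Made-up scores for four responses to a single instruction.
example_responses = [
    {"text": "response A", "score": 4.5},
    {"text": "response B", "score": 3.0},
    {"text": "response C", "score": 3.0},
    {"text": "response D", "score": 1.5},
]
example_pairs = build_preference_pairs("Describe the image.", example_responses)
```

With four responses per instruction, at most six pairs are produced; tied scores reduce that number. Raising `min_gap` keeps only pairs with a clearer preference margin.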

Strengths

80,000 multi-modal instructions
Comparative data featuring responses from 12 distinct LVLMs
High-quality synthetic annotations provided by GPT-4V

Limitations

Reliance on synthetic GPT-4V annotations, which may carry model-specific biases that human-labeled ground truth would not
Potential for temporal staleness as the 12 source LVLMs are superseded by newer architectures

Provenance

Source: MMInstruction (Arxiv 2312.10665)
Collection Method: Synthetic annotation of model-generated responses using GPT-4V
Time Range: 2023
Freshness: Last updated October 2024; reflects model capabilities and instructions available as of late 2023.

Users should consult the associated paper for the specific list of 12 LVLMs used to generate the response pool; data is provided in Parquet format.

Quality Score

Overall: D37
Description: 36
Source: 41
Reputation: 41
Access: 22

Community

542 downloads

50 likes

0 views

Dataset Info

Author: MMInstruction
Created: Nov 8, 2023
Updated: Oct 17, 2024
Last synced: Jun 19, 2026
