VLFeedback is a vision-language preference learning dataset containing 80,000 multi-modal instructions and 320,000 model responses annotated by GPT-4V. Developed by MMInstruction in late 2023, it aggregates instructions from diverse sources and pairs them with responses generated by a pool of 12 different Large Vision-Language Models (LVLMs).
The specific 12 LVLMs used to generate the response pool are listed in the associated paper. The data is provided in Parquet format.