DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.


RLAIF-V: 83,132 Multimodal Preference Pairs for Vision-Language Model Alignment | DataSalon

Multimodal & LLM

Name: RLAIF-V: 83,132 Multimodal Preference Pairs for Vision-Language Model Alignment
Creator: unsloth
Published: 2024-09-25T07:41:44
Keywords: Vision Language, Rlaif, Preference Pairs, Large Scale, Multimodal Feedback, Multimodal

Available on 1 platform

Description

RLAIF-V-Dataset is a large-scale multimodal feedback dataset published by unsloth. It provides 83,132 preference pairs, with instructions collected from a diverse set of sources. The dataset was last updated on Hugging Face on 2024-09-26.

Use Cases

Training reward models for multimodal reinforcement learning from AI feedback (RLAIF) based on the preference pairs.
Fine-tuning vision-language models for improved alignment based on high-quality human or AI feedback.
Benchmarking the performance of multimodal large language models (MLLMs) on preference-based tasks.
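As an illustration of the reward-model use case, the sketch below computes a pairwise (Bradley-Terry style) preference loss from two scalar reward scores. This listing does not state which training objective the dataset was built for, so the choice of loss is an assumption, and the names `reward_chosen` and `reward_rejected` are hypothetical placeholders for whatever a reward model would output for the preferred and dispreferred response.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the preferred response scores higher than the
    dispreferred one, and grows as the ordering is violated.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(score_pairs):
    """Average the pairwise loss over (chosen, rejected) score tuples."""
    pairs = list(score_pairs)
    if not pairs:
        raise ValueError("score_pairs must not be empty")
    return sum(pairwise_preference_loss(c, r) for c, r in pairs) / len(pairs)


# Hypothetical scores: the first pair is ranked correctly, the second is not.
example_loss = mean_preference_loss([(2.0, 0.5), (0.1, 1.3)])
```

In practice the scores would come from a vision-language reward model evaluated on an image, an instruction, and each of the two responses; the plain floats here only show the shape of the objective.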

Strengths

Contains 83,132 preference pairs, indicating a substantial scale.
Described as providing high-quality feedback.
Instructions are collected from a diverse set of sources, suggesting variety.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported in the platform metadata; the 83,132 figure comes from the description and should be verified after download.
Description metadata is limited; actual data quality requires manual inspection after download.
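Because column-level documentation is absent, a first step after download could be to summarize the fields of a small sample of records. The sketch below is one way to do that with the standard library only. The sample rows and their field names (`question`, `chosen`, `rejected`, `origin`) are hypothetical and are not taken from the dataset; the real schema has to be read from the downloaded files.

```python
def summarize_fields(rows):
    """Summarize observed value types, missing counts, and one example per field.

    `rows` is any iterable of dict-like records, for example a handful of
    rows read from the downloaded dataset.
    """
    rows = list(rows)
    # Collect every field name seen in any row so that absent keys count as missing.
    all_fields = set()
    for row in rows:
        all_fields.update(row)

    summary = {}
    for field in sorted(all_fields):
        types, missing, example = set(), 0, None
        for row in rows:
            value = row.get(field)
            if value is None:
                missing += 1
                continue
            types.add(type(value).__name__)
            if example is None:
                example = value
        summary[field] = {"types": sorted(types), "missing": missing, "example": example}
    return summary


# Hypothetical sample; the field names are placeholders, not the real schema.
sample_rows = [
    {"question": "What is in the image?", "chosen": "A red bus.", "rejected": "A train.", "origin": None},
    {"question": "How many people?", "chosen": "Two.", "rejected": "Five.", "origin": "example-source"},
]
field_summary = summarize_fields(sample_rows)
```

A summary like this does not replace documentation, but it makes it easier to spot which fields hold the preferred and dispreferred responses and which fields are sparsely populated.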

Provenance

Source: unsloth
Collection Method: Instructions collected from a diverse set of sources; feedback likely generated via AI or human annotation.
Time Range: not specified
Freshness: Last updated 2024-09-26 01:39:43; freshness should be verified.
Geography: not specified
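One simple way to check freshness is to measure the gap between the dataset's last update and the date this listing was last synced. The sketch below uses the two dates shown on this page (last updated 2024-09-26 01:39:43, last synced Apr 18, 2026); the helper name `staleness_days` is hypothetical.

```python
from datetime import date, datetime


def staleness_days(updated: date, reference: date) -> int:
    """Return the number of whole calendar days between two dates."""
    return (reference - updated).days


# Dates as shown on this listing.
last_updated = datetime.fromisoformat("2024-09-26 01:39:43").date()
last_synced = date(2026, 4, 18)

age_in_days = staleness_days(last_updated, last_synced)
print(age_in_days)  # 569
```

A gap of this size suggests confirming on the hosting platform whether a newer revision exists before relying on the data.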

Quality Score

D38

Community

29 downloads

7 likes

0 views

Dataset Info

Author: unsloth
Created: Sep 25, 2024
Updated: Sep 26, 2024
Last synced: Apr 18, 2026
