RLAIF-V is a multimodal preference-alignment dataset of between 10,000 and 100,000 records, developed by OpenBMB to improve the trustworthiness of Multimodal Large Language Models (MLLMs). Its preference labels come from AI-generated feedback and are used to refine model responses. The dataset served as a core training component for the MiniCPM-V 4.5 model released in 2024.
The dataset is released under the CC BY-NC 4.0 license, which prohibits commercial use, including commercial redistribution. It is intended for MLLM trustworthiness and preference-alignment training.
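To make "preference-alignment record" concrete, here is a minimal Python sketch of what one such record might look like and how it could be turned into a chosen/rejected training pair. The field names (`image_path`, `question`, `chosen`, `rejected`), the `<image>` placeholder, and the helper `to_preference_pair` are illustrative assumptions, not the dataset's confirmed schema or any library's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceRecord:
    """One hypothetical multimodal preference record (field names are assumed)."""
    image_path: str  # reference to the image the question is about
    question: str    # prompt shown to the model
    chosen: str      # response preferred by the AI feedback
    rejected: str    # response judged less trustworthy


def to_preference_pair(record: PreferenceRecord) -> dict:
    """Convert a record into a generic prompt/chosen/rejected dict.

    This shape is a common convention for preference-optimization
    trainers, assumed here for illustration only.
    """
    if record.chosen == record.rejected:
        # A pair with identical responses carries no preference signal.
        raise ValueError("chosen and rejected responses must differ")
    return {
        "prompt": f"<image>\n{record.question}",  # "<image>" is an assumed placeholder
        "images": [record.image_path],
        "chosen": record.chosen,
        "rejected": record.rejected,
    }


# Made-up example values, not taken from the dataset.
example = PreferenceRecord(
    image_path="images/0001.jpg",
    question="How many people are in the picture?",
    chosen="There are two people standing near a bench.",
    rejected="There are five people playing football.",
)
pair = to_preference_pair(example)
```

In practice the records would be read from the dataset files rather than constructed by hand; the sketch only illustrates the pairing of a preferred and a dispreferred response for the same image and question.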