This dataset contains 400,000 human responses from 82,000 unique annotators evaluating text-to-image (T2I) model outputs. Feedback is categorized as preference, coherence, or alignment, supporting large-scale model ranking.
Use Cases
- Train reward models for RLHF using the preference and alignment labels
- Perform large-scale T2I model benchmarking based on the coherence and alignment categories
- Study human perception of AI-generated imagery using the 400,000 responses
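
As a rough illustration of the benchmarking use case, the sketch below ranks models by per-category win rate over pairwise votes. The record layout (`category`, `model_a`, `model_b`, `winner`) and the model names are assumptions made for this example, not the dataset's actual schema, so the field names would need to be mapped to the real columns.

```python
from collections import defaultdict

# Hypothetical vote records. The field names and model names are
# assumptions for illustration, not the dataset's actual schema.
votes = [
    {"category": "preference", "model_a": "model_x", "model_b": "model_y", "winner": "model_x"},
    {"category": "preference", "model_a": "model_x", "model_b": "model_y", "winner": "model_x"},
    {"category": "preference", "model_a": "model_x", "model_b": "model_z", "winner": "model_z"},
    {"category": "alignment", "model_a": "model_y", "model_b": "model_z", "winner": "model_y"},
    {"category": "coherence", "model_a": "model_x", "model_b": "model_y", "winner": "model_y"},
]


def win_rates(records):
    """Return {category: {model: fraction of its comparisons won}}."""
    wins = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        cat = rec["category"]
        # Both models in the pair took part in one comparison.
        for model in (rec["model_a"], rec["model_b"]):
            totals[cat][model] += 1
        wins[cat][rec["winner"]] += 1
    return {
        cat: {model: wins[cat][model] / n for model, n in models.items()}
        for cat, models in totals.items()
    }


def rank(rates):
    """Order models from highest to lowest win rate."""
    return sorted(rates, key=rates.get, reverse=True)


rates = win_rates(votes)
for category, per_model in rates.items():
    print(category, rank(per_model))
```

Win rate is the simplest possible aggregate; when models face opponents of uneven strength, a Bradley-Terry or Elo fit over the same votes is a common alternative.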
Strengths
- 400,000 human responses collected within a 48-hour window
- Feedback from 82,000 individual annotators, collected via the Rapidata Python API
- Three distinct evaluation labels: preference, coherence, and alignment