A curated collection of human preference datasets in three categories: fine-tuning, Reinforcement Learning from Human Feedback (RLHF), and evaluation. This repository indexes resources for training and benchmarking Large Language Models (LLMs) against human-labeled preferences.
Use Cases
- Identify datasets for RLHF to train reward models that predict human preferences
- Select fine-tuning resources to adjust model behavior using human-labeled data
- Locate evaluation sets to benchmark LLM performance against human-aligned metrics
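As a rough illustration of the reward-model use case, a model that predicts human preferences is often trained with a pairwise (Bradley-Terry style) loss over a preferred and a non-preferred response. The sketch below assumes each record carries a scalar reward score for both responses; the names `chosen_score` and `rejected_score` are hypothetical and are not taken from any dataset in this list.

```python
import math


def pairwise_preference_loss(chosen_score: float, rejected_score: float) -> float:
    """Negative log-likelihood that the chosen response beats the rejected one.

    Computes -log(sigmoid(chosen_score - rejected_score)) in a numerically
    stable way. Both arguments are hypothetical reward-model outputs.
    """
    margin = chosen_score - rejected_score
    if margin >= 0:
        # log(1 + exp(-margin)) is safe when margin is non-negative
        return math.log1p(math.exp(-margin))
    # For negative margins, rewrite to avoid overflow in exp(-margin)
    return -margin + math.log1p(math.exp(margin))


def mean_loss(pairs: list[tuple[float, float]]) -> float:
    """Average the pairwise loss over (chosen_score, rejected_score) pairs."""
    return sum(pairwise_preference_loss(c, r) for c, r in pairs) / len(pairs)
```

The loss shrinks as the reward model scores the human-preferred response further above the rejected one, and grows when the ranking is reversed.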
Strengths
- Categorizes datasets for Reinforcement Learning from Human Feedback (RLHF) workflows
- Includes resources specifically for Large Language Model (LLM) fine-tuning
- Collects datasets for model evaluation (eval) tasks in a curated list