Available on 1 platform
RationaleRM, released by Qwen in February 2026, provides between 10,000 and 100,000 records designed to align the reasoning processes of reward models with human judgments. The dataset focuses on rationale consistency as a way to distinguish frontier models and to detect deceptive alignment in text classification and question-answering tasks. It serves as a benchmark for evaluating whether a model's internal logic matches its final output.
Users should consult arXiv paper 2602.04649 for the definitions and methodology behind rationale consistency. The dataset is licensed under CC BY 4.0.