NVIDIA's Nemotron-Cascade-RM-Training dataset provides 81,808 samples for training reward models used in reinforcement learning from human feedback (RLHF). Its samples include prompts along with data source and category information. NVIDIA published the dataset in December 2025.
The full dataset description, including column details and the specific license, is hosted externally on the dataset's Hugging Face page.