This Reinforcement Learning from Human Feedback (RLHF) training dataset contains 45,882 samples. NVIDIA created it for language model alignment, and it was last updated in December 2025.
The full description and details are available only on the Hugging Face dataset page. The dataset is stated to be ready for commercial use.