PKU-SafeRLHF is a dataset for AI safety research, particularly for reducing harmful outputs from language models. It was created by the PKU-Alignment Team and was last updated in October 2024. The dataset includes single-dimension preference data, question-answer pairs, and prompts.
The dataset is licensed under Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0). It contains potentially offensive or harmful content, included for safety research purposes.