PKU-Alignment developed this dataset to support constrained value alignment through Safe Reinforcement Learning from Human Feedback (Safe RLHF). As of late 2024, it provides human-annotated preference data for large language models, targeting the balance between helpfulness and safety constraints.
Users should refer to the PKU-Alignment GitHub repository for specific data loading scripts and implementation details related to the 'Beaver' model series.
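As a rough illustration of what preference data that balances helpfulness against safety can look like, the sketch below models one annotated comparison and splits it into a separate preference pair per axis. It assumes each comparison carries one helpfulness label and one safety label. The class, the field names, and the helper functions are all hypothetical and are not taken from the dataset's schema; the repository remains the authority on the real format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceRecord:
    """One annotated comparison (hypothetical schema, not the dataset's real fields)."""
    prompt: str
    response_a: str
    response_b: str
    more_helpful: str  # "a" or "b": assumed helpfulness preference label
    safer: str         # "a" or "b": assumed safety preference label


def to_pairs(record: PreferenceRecord) -> dict:
    """Split one record into (chosen, rejected) pairs, one per annotation axis."""
    responses = {"a": record.response_a, "b": record.response_b}

    def pair(winner: str) -> tuple:
        loser = "b" if winner == "a" else "a"
        return (responses[winner], responses[loser])

    return {
        "helpfulness": pair(record.more_helpful),
        "safety": pair(record.safer),
    }


def labels_conflict(record: PreferenceRecord) -> bool:
    """True when the more helpful response is not the safer one."""
    return record.more_helpful != record.safer


# Placeholder strings only; no real dataset content is reproduced here.
example = PreferenceRecord(
    prompt="example prompt",
    response_a="response A",
    response_b="response B",
    more_helpful="a",
    safer="b",
)
pairs = to_pairs(example)
```

Keeping the two axes as separate pairs lets a helpfulness signal and a safety signal be handled independently, and records where the labels disagree are the cases in which the two goals pull in different directions.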