Available on 1 platform
TRL's Sentiment and Descriptiveness Preference Dataset originates from an early OpenAI paper on reinforcement learning from human feedback (RLHF). The data has been preprocessed into the standard preference format used in RLHF training, with prompt, chosen, and rejected fields. The dataset was last updated on the Hugging Face platform on 2024-04-09.
License information is unknown and should be verified before use.
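To make the prompt, chosen, rejected format concrete, here is a minimal Python sketch of what one record might look like and how it could be checked before training. It assumes each field is a plain string, which this listing does not confirm. The names PreferenceRecord and is_valid_record are hypothetical helpers, and the example row is invented for illustration rather than taken from the dataset.

```python
from typing import TypedDict


class PreferenceRecord(TypedDict):
    """One preference pair. Field names follow the description above;
    the plain-string types are an assumption."""
    prompt: str
    chosen: str
    rejected: str


REQUIRED_FIELDS = ("prompt", "chosen", "rejected")


def is_valid_record(record: dict) -> bool:
    """Return True if all three fields are present as non-empty strings."""
    return all(
        isinstance(record.get(field), str) and record[field] != ""
        for field in REQUIRED_FIELDS
    )


# Hypothetical row, invented for illustration (not real dataset content).
example: PreferenceRecord = {
    "prompt": "The movie opened with a long shot of the harbor.",
    "chosen": " It was a beautiful, hopeful start to the story.",
    "rejected": " Nothing about it worked.",
}

if __name__ == "__main__":
    print(is_valid_record(example))
```

A check like this can be run over every row after download to catch missing or empty fields early. If the actual dataset stores its fields in another shape, the type check would need to be adjusted.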