This dataset from unalignment, dated 2023-12-26, illustrates the use of direct preference optimization (DPO) to de-censor language models. It contains toxic and harmful text examples, many of which carry warnings or disclaimers.
Use requires acknowledging that the data contains toxic or harmful content and profanity. The license is listed as CC BY 4.0; the full terms are on the dataset page.