Unalignment Toxic DPO V0.2 Zh Cn is a multilingual dataset intended to illustrate the use of Direct Preference Optimization (DPO) for model unalignment. The dataset was created by tastypear and last updated on 2024-01-31. Its description states it contains highly toxic or harmful examples.
Usage restrictions from the original dataset apply. The Chinese translations are model-paraphrased and may not be accurate.