Ling-Coder-DPO is a 250,000-sample subset used for Direct Preference Optimization (DPO) training of the Ling-Coder Lite model. The dataset was created by inclusionAI and was last updated on Hugging Face on March 27, 2025. It belongs to a larger collection that also includes a supervised fine-tuning (SFT) subset of more than 5 million samples and a synthetic question-answering subset.
The license is unknown, so commercial use may be restricted.