A 2023 experimental dataset by rombodawg containing 650,000 lines of instruction-formatted data. Roughly 80% is coding instruction data and 20% is non-coding instruction data; the non-coding portion is intended to preserve logic and reasoning skills while a model is trained on code. It is a refined version of the LosslessMegaCodeTraining series.
The license is unknown; verify the terms of use before using the dataset.