This dataset contains over 15 million data points covering math, code, knowledge, and instruction following. Together they form the full set of core-domain supervised fine-tuning (SFT) data used for post-training the MiniCPM5-1B-SFT model. The dataset is a key representative of L3 refined data within the UltraData L0-L4 tiered data management framework. It was authored by openbmb and last updated on Hugging Face in May 2026.
The license is unknown, which may restrict commercial or research use.