Available on 1 platform
Modotte's MathX-20M dataset is a 20-million-instance corpus curated from public sources and augmented with synthetic data generated by both closed and open-source models. It is intended as a foundation for instruction tuning and fine-tuning, with a focus on mathematical reasoning. The dataset was last updated on February 10, 2026.
The platform tags list the license as Apache 2.0, but this should be verified on the dataset page before use.