Description
1,000 supervised-fine-tuning examples of math reasoning traces. Problems were sampled from a Nemotron math problem set, answered by DeepSeek V4 Pro with high reasoning effort, and reviewed for correctness by DeepSeek V4 Flash. The dataset was authored by blythet and last updated on April 30, 2026.
Use Cases
Fine-tuning language models on the reasoning traces for step-by-step math problem solving.
Studying chain-of-thought reasoning patterns in large language models using the high-effort traces.
Building benchmark sets from the reviewed problems to evaluate mathematical reasoning fidelity.
Training models to avoid pathological reasoning patterns such as looping or excessive backtracking, which the dataset's curation targets.
Strengths
Contains 1,000 high-signal examples specifically curated for supervised fine-tuning.
Reasoning traces were generated by DeepSeek V4 Pro with high reasoning effort.
Correctness was independently reviewed by DeepSeek V4 Flash against expected answers.
Limitations
The row count is not confirmed by metadata; the description states 1,000 examples, which should be verified after download.
Column-level documentation is absent; field semantics must be inferred after download.
Description metadata is limited; actual data quality requires manual inspection after download.
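Because the columns are undocumented, a first pass after download can simply count rows and list which fields appear and what types they hold. The following is a minimal sketch, not part of the dataset's tooling: it assumes the file is JSON Lines, and the helper names (summarize_records, load_jsonl) and the sample column names (problem, trace, answer) are hypothetical placeholders rather than the dataset's actual schema.

```python
import json
from collections import Counter, defaultdict


def summarize_records(records):
    """Return (row_count, {column: {python_type_name: count}})."""
    columns = defaultdict(Counter)
    row_count = 0
    for record in records:
        row_count += 1
        for key, value in record.items():
            columns[key][type(value).__name__] += 1
    return row_count, {name: dict(types) for name, types in columns.items()}


def load_jsonl(path):
    """Yield one dict per non-empty line of a JSON Lines file (assumed format)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


# Hypothetical sample rows; the real column names are not documented.
sample = [
    {"problem": "2 + 2 = ?", "trace": "Add the numbers.", "answer": "4"},
    {"problem": "3 * 3 = ?", "trace": None, "answer": "9"},
]
rows, schema = summarize_records(sample)
print(rows)
print(schema)
```

On the real file, summarize_records(load_jsonl(path)) would give the actual row count to compare against the stated 1,000, and a column whose type counts include NoneType points to missing values worth inspecting by hand.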
Provenance
Source
Problems sampled from a Nemotron math problem set, originally sourced from StackExchange-Math and AoPS.
Collection Method
Problems answered by DeepSeek V4 Pro with thinking enabled, then reviewed by DeepSeek V4 Flash.
Time Range
Not specified.
Freshness
Last updated 2026-04-30 23:38:35; freshness should be verified.
Geography
Not specified.
License
Unknown; restrictions should be verified before use.