NVIDIA released Nemotron-Math-Proofs-v1 in early 2025, a large-scale collection of approximately 580,000 natural-language proof problems. The dataset also includes 550,000 formalizations of problems as Lean 4 theorem statements and 900,000 model-generated reasoning traces intended for distilling mathematical reasoning.
The dataset is released under the CC BY-SA 4.0 license. Working with the formalization fields requires familiarity with the Lean 4 theorem prover.
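For readers new to Lean 4, the sketch below suggests what a formalized theorem statement might look like. It is purely illustrative and not taken from the dataset: the problem, the theorem name `sum_of_evens_is_even`, and the use of Mathlib are all assumptions made for this example. The `sorry` placeholder marks a statement whose proof has not been supplied.

```lean
import Mathlib

-- Hypothetical natural-language problem (not from the dataset):
--   "Show that the sum of two even integers is even."
-- A possible Lean 4 formalization of the statement, with the proof left open.
theorem sum_of_evens_is_even (a b : ℤ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  sorry
```

Reading a statement like this requires knowing how Lean expresses hypotheses (`ha`, `hb`) and the goal (`Even (a + b)`), which is the kind of familiarity the formalization fields assume.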