Nemotron-RL-Math-v2: Curated Mathematical Problems for Reinforcement Learning

Name: Nemotron-RL-Math-v2: Curated Mathematical Problems for Reinforcement Learning
Creator: nvidia
Published: 2026-06-01T22:00:43
Keywords: Mathematics, Problem Solving, Text, Rlvr, Reinforcement Learning

By nvidia. Updated 4d ago.

Available on 1 platform

Description

Nemotron-RL-Math-v2 is a small curated set of mathematical problems selected for reinforcement learning workflows. Created by NVIDIA, the dataset is designed for Reinforcement Learning from Verifiable Rewards (RLVR). Problems are sourced from AoPS and StackExchange-derived math data that was held out from the Nemotron-SFT-Math-v4 supervised fine-tuning (SFT) set.

Use Cases

Training RL agents on mathematical problem solving, using the dataset's verifiable answers to compute rewards.
Benchmarking RLVR methods on a curated problem set whose answers can be checked.
Fine-tuning language models for math with a reinforcement learning reward signal derived from answer verification.
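
As a rough illustration of what a reward derived from verifiable answers can look like, the sketch below scores a model completion against a reference answer. It is a minimal example under stated assumptions, not the dataset's or NVIDIA's actual verifier: it assumes the model writes its final answer inside \boxed{...}, that the reference answer is a plain string, and that simple numeric normalization is enough. The function names are hypothetical.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in the completion, if any.

    Assumption: final answers are boxed and contain no nested braces.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def normalize(answer: str) -> str:
    """Canonicalize simple numeric answers; leave everything else as text."""
    text = answer.strip().replace(" ", "")
    try:
        # "0.5" and "1/2" both become "1/2"
        return str(Fraction(text))
    except (ValueError, ZeroDivisionError):
        return text


def verifiable_reward(completion: str, reference: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the reference."""
    predicted = extract_final_answer(completion)
    if predicted is None:
        return 0.0
    return 1.0 if normalize(predicted) == normalize(reference) else 0.0


print(verifiable_reward("So the answer is \\boxed{0.5}", "1/2"))
```

A real verifier would usually need symbolic comparison for algebraic answers; exact string matching after normalization is only the simplest case.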

Strengths

Designed specifically for RL workflows that rely on verifiable answers (RLVR).
Curated from established sources like AoPS and StackExchange.
Created by NVIDIA, which suggests institutional backing.

Limitations

Row count is unknown, which may limit suitability assessment.
Column-level documentation is absent; field semantics must be inferred after download.
Description metadata is limited; actual data quality requires manual inspection after download.
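
Because field semantics have to be inferred after download, a small inspection pass over the loaded rows can help. The sketch below summarizes which fields appear, how often, and with what Python types. It assumes the rows have already been loaded as a list of dictionaries; the sample rows and their field names ("problem", "expected_answer", "source") are hypothetical placeholders, not the dataset's documented schema.

```python
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the types seen, the row count, and one example."""
    summary = defaultdict(lambda: {"types": set(), "count": 0, "example": None})
    for record in records:
        for key, value in record.items():
            entry = summary[key]
            entry["types"].add(type(value).__name__)
            entry["count"] += 1
            if entry["example"] is None:
                entry["example"] = value
    return dict(summary)


# Hypothetical rows for illustration only; the real columns are undocumented.
sample = [
    {"problem": "Compute 1 + 1.", "expected_answer": "2"},
    {"problem": "Solve x^2 = 4 for x > 0.", "expected_answer": "2",
     "source": "placeholder"},
]

for name, info in summarize_fields(sample).items():
    print(name, sorted(info["types"]), info["count"])
```

Fields that appear in only some rows, or with more than one type, are the ones most worth checking by hand.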

Provenance

Source: NVIDIA, with problems sourced from AoPS and StackExchange.
Collection Method: Curated selection from held-out math data.
Freshness: Last updated 2026-06-04 04:54:18; confirm against the source before relying on this date.

License is unknown; terms of use must be verified before application.

Quality Score: D38

Description: 42
Source: 39
Reputation: 37
Access: 26

Community

4 downloads, 1 like, 0 views

Dataset Info

Author: nvidia
Created: Jun 1, 2026
Updated: Jun 4, 2026
Last synced: Jun 8, 2026
