Description
Nemotron-RL-Science-v1 is a reinforcement learning dataset for training models on science reasoning tasks. Each example includes a problem, a reference answer, and an RL environment configuration for training with verifiable rewards. The dataset was created by NVIDIA and covers three science domains: Physics, Biology, and Chemistry.
Use Cases
Training a policy model with verifiable rewards based on the provided RL environment configurations.
Benchmarking reinforcement learning agents on science reasoning problems across physics, biology, and chemistry domains.
Developing answer-extraction methods using the provided templates for structured output.
Studying agent/verifier interactions in a controlled, reward-driven setup for scientific problem-solving.
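The verifiable-reward and answer-extraction use cases above can be illustrated with a minimal sketch. It assumes a hypothetical extraction template in which the model ends its response with a line of the form "Answer: <value>", and it scores by exact match after light normalization. The dataset's actual template and verifier logic are not documented here and may differ; the function names below are illustrative only.

```python
import re
from typing import Optional

# Hypothetical answer-extraction template: the response is expected to contain
# a line such as "Answer: <value>". The dataset's real template may differ.
ANSWER_PATTERN = re.compile(r"Answer:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE)


def extract_answer(response: str) -> Optional[str]:
    """Return the value of the last 'Answer: ...' line, or None if absent."""
    matches = ANSWER_PATTERN.findall(response)
    return matches[-1] if matches else None


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def verifiable_reward(response: str, reference: str) -> float:
    """Binary reward: 1.0 if the extracted answer matches the reference, else 0.0."""
    answer = extract_answer(response)
    if answer is None:
        return 0.0  # no extractable answer, so no reward
    return 1.0 if normalize(answer) == normalize(reference) else 0.0
```

Exact matching is the simplest possible verifier. Numeric or symbolic answers in physics and chemistry would likely need tolerance-based or unit-aware comparison, which this sketch does not attempt.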
Strengths
Includes verifiable RL environment configurations (agent prompt, verifier reference, answer-extraction template) for each problem.
Covers three distinct science domains: Physics, Biology, and Chemistry.
Provides both problems and reference answers for supervised or reward-based learning.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
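Because the columns and row count are undocumented, a first inspection pass after download might look like the sketch below. It assumes the data has been exported to a JSON Lines file, which is an assumption about the format rather than something this listing confirms. The field names in the usage note are hypothetical.

```python
import json
from collections import defaultdict


def load_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def summarize_fields(records):
    """Map each field name to the sorted Python type names observed for it."""
    types_seen = defaultdict(set)
    for record in records:
        for key, value in record.items():
            types_seen[key].add(type(value).__name__)
    return {key: sorted(names) for key, names in types_seen.items()}
```

For example, `summarize_fields([{"problem": "q", "answer": 1}])` reports one type per field, and `len(records)` on the loaded list gives the row count that the listing leaves unknown.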
Provenance
Source
NVIDIA
Freshness
Last updated 2026-06-04 06:46:39; freshness should be verified.
License is unknown; verify the terms of use before using the dataset.