Available on 1 platform
A collection of synthetic datasets designed for pretraining the NVIDIA Nemotron 3 family of large language models. The collection aims to improve model capabilities on specific tasks, including factual recall, moral scenarios, and diverse generative and multiple-choice questions. It was created by NVIDIA and last updated on the platform on June 4, 2026.
The license is unknown; verify the terms of use before using the dataset.