Description
NVIDIA's Nemotron-RL-instruction_following-structured_outputs dataset tests a model's ability to follow output formatting instructions under JSON schema constraints. Each problem consists of a document, an output formatting instruction (schema), and a question, with difficulty varied by instruction location, comprehensiveness, and schema complexity. The dataset was last updated on January 12, 2026.
Use Cases
Benchmarking LLM instruction-following on problems that combine a document, a schema instruction, and a question.
Training models to generate JSON outputs that conform to a given schema.
Evaluating model performance on tasks with varied instruction location and comprehensiveness.
Researching the impact of schema complexity on model output quality.
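For the benchmarking and evaluation use cases above, scoring usually comes down to two checks: does the model's output parse as JSON, and does the parsed value satisfy the schema. The following is a minimal sketch using only the Python standard library. It covers a small subset of JSON Schema (type, properties, required, additionalProperties, items) and is not the dataset's official scorer. The function names conforms and score_output and the example schema are illustrative assumptions, not taken from the dataset.

```python
import json

# Maps JSON Schema type names to Python types (a small subset of the spec).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def conforms(value, schema):
    """Return True if `value` satisfies the supported subset of `schema`."""
    expected = schema.get("type")
    if isinstance(expected, str) and expected in _TYPES:
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("number", "integer") and isinstance(value, bool):
            return False
        if not isinstance(value, _TYPES[expected]):
            return False
    if isinstance(value, dict):
        props = schema.get("properties", {})
        # Every required key must be present.
        if any(key not in value for key in schema.get("required", [])):
            return False
        # Reject undeclared keys only when the schema forbids them.
        if schema.get("additionalProperties") is False and set(value) - set(props):
            return False
        return all(conforms(value[k], sub) for k, sub in props.items() if k in value)
    if isinstance(value, list) and "items" in schema:
        return all(conforms(item, schema["items"]) for item in value)
    return True


def score_output(raw_output, schema):
    """Parse a model's raw text output and check it against the schema."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    return conforms(parsed, schema)


# Hypothetical schema and outputs, for illustration only.
example_schema = {
    "type": "object",
    "properties": {"title": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["title", "year"],
    "additionalProperties": False,
}
print(score_output('{"title": "Report", "year": 2024}', example_schema))
print(score_output('{"title": "Report"}', example_schema))
print(score_output('Here is the JSON: {"title": "Report"}', example_schema))
```

A full evaluation would more likely rely on a complete JSON Schema validator; the subset here is only meant to show the shape of the check.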
Strengths
Designed by NVIDIA, a major AI research institution.
Problems are structured with three components (document, instruction, question) to test specific capabilities.
Difficulty is systematically varied by factors like instruction location and schema complexity.
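One plausible reading of "instruction location" is the position of the schema instruction relative to the document and question in the prompt. The sketch below shows how the three components could be assembled under that assumption. The function build_prompt, its instruction_position parameter, and the two orderings are hypothetical; the dataset's actual prompt layout is not documented here.

```python
def build_prompt(document, instruction, question, instruction_position="before"):
    """Assemble one prompt, placing the schema instruction before or after the document."""
    if instruction_position == "before":
        parts = [instruction, document, question]
    elif instruction_position == "after":
        parts = [document, question, instruction]
    else:
        raise ValueError(f"unknown instruction_position: {instruction_position!r}")
    # Separate the components with a blank line.
    return "\n\n".join(parts)


# Hypothetical components, for illustration only.
prompt = build_prompt(
    document="Acme Corp reported revenue of 12 million USD in 2024.",
    instruction='Answer as JSON matching: {"type": "object", "required": ["revenue"]}',
    question="What was the revenue?",
    instruction_position="after",
)
print(prompt)
```

Comparing a model's schema conformance across the two orderings is one way to probe sensitivity to where the instruction appears.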
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and license are unknown, which makes it harder to assess suitability.
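Because column-level documentation is absent, a first inspection step after download is to list which fields appear and what value types they hold. The sketch below assumes the records have already been loaded into Python dictionaries by whatever reader fits the actual file format, which is unknown here. The function summarize_fields and the sample records are hypothetical and do not reflect the dataset's real columns.

```python
from collections import Counter, defaultdict


def summarize_fields(records):
    """Map each field name to a count of the Python type names seen for it."""
    summary = defaultdict(Counter)
    for record in records:
        for key, value in record.items():
            summary[key][type(value).__name__] += 1
    return {key: dict(counts) for key, counts in summary.items()}


# Hypothetical records, for illustration only; real field names must be
# read from the downloaded files.
sample_records = [
    {"document": "text A", "schema": {"type": "object"}, "question": "Q1"},
    {"document": "text B", "schema": {"type": "array"}, "question": None},
]
for field, type_counts in summarize_fields(sample_records).items():
    print(field, type_counts)
```

Fields with mixed types or frequent missing values are good candidates for closer manual inspection.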
Provenance
Source
NVIDIA
Collection Method
Likely synthetically generated or curated for model evaluation.
Time Range
Not specified
Freshness
Last updated 2026-01-12 23:38:23; freshness should be verified.
Geography
Not specified
License
Unknown; terms of use must be verified before use.