Description
Nemotron-SFT-SWE-v3 is a software engineering instruction-tuning dataset designed to advance the capabilities of LLMs on SWE-Bench style tasks. It includes agentic trajectories collected using a variety of agent harnesses, including the OpenHands, SWE-agent, and mini-SWE-agent frameworks. The dataset was created by NVIDIA Corporation on 2026-06-04 and is ready for commercial use.
Use Cases
Fine-tuning language models for software engineering tasks using the SWE-Bench style instruction data.
Training agentic models for code generation and problem-solving using the collected agent trajectories.
Benchmarking model performance on software engineering challenges, in line with the dataset's instruction-tuning focus.
Developing specialized coding assistants from the instruction-tuning examples, which are stated to be ready for commercial use.
Strengths
Designed specifically for advancing LLM capabilities on SWE-Bench style tasks.
Includes agentic trajectories collected using multiple frameworks: OpenHands, SWE-agent, and mini-SWE-agent.
Dataset is explicitly stated as ready for commercial use.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and data size are unknown, which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
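Because field semantics have to be inferred after download, a first inspection pass might tally which fields appear and which value types they hold. The following is a minimal Python sketch under stated assumptions: the file format is undocumented, so the JSON Lines reader is only one possibility, and the sample records (with fields `id`, `messages`, `resolved`) are hypothetical and not taken from the dataset. `summarize_fields` and `read_jsonl` are illustrative helper names.

```python
import json
from collections import Counter, defaultdict


def summarize_fields(records):
    """Return (row_count, {field: Counter of Python type names}).

    `records` is any iterable of dicts, one dict per row.
    """
    type_counts = defaultdict(Counter)
    row_count = 0
    for record in records:
        row_count += 1
        for field, value in record.items():
            type_counts[field][type(value).__name__] += 1
    return row_count, dict(type_counts)


def read_jsonl(path):
    """Yield one dict per non-empty line, ASSUMING a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


# Hypothetical records, invented for illustration only.
sample = [
    {"id": "a", "messages": [], "resolved": True},
    {"id": "b", "messages": [], "resolved": None},
]
rows, fields = summarize_fields(sample)
for name, counts in fields.items():
    print(name, dict(counts))
```

If the downloaded files do turn out to be JSON Lines, `summarize_fields(read_jsonl(path))` would give the row count and a per-field type tally in a single streaming pass. Other formats would need a different reader, but the same summary function applies to any iterable of dicts.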
Provenance
Source
NVIDIA Corporation
Collection Method
Agentic trajectories collected using the OpenHands, SWE-agent, and mini-SWE-agent frameworks.
Time Range
Dataset creation date is 2026-06-04.
Freshness
Last updated 2026-06-06 00:24:01; freshness should be verified.
Geography
Not specified.
License is unknown and should be verified before use.