AmanPriyanshu's tool-reasoning-sft-RESEARCH-explorations dataset contains 149,025 multi-turn code exploration agent trajectories converted into a strict reasoning and tool-call format. The data is derived from random-small-github-repositories and random-python-github-repositories, and each trajectory represents an agent navigating a GitHub repository with terminal commands to locate a target file. The dataset was last updated on 2026-04-02.
Use Cases
- Training agent models for sequential decision-making based on multi-turn reasoning trajectories.
- Fine-tuning models for tool-call generation based on validated FSM transitions.
- Researching agent behavior in code exploration tasks based on terminal command sequences.
- Benchmarking agent performance in navigating GitHub repositories to locate target files.
Strengths
- Contains 149,025 multi-turn agent trajectories.
- Data has been cleaned, stripped, and converted into a strict reasoning + tool-call format.
- Trajectories include validated FSM transitions.
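The dataset description does not spell out the state machine behind the "validated FSM transitions". As a rough illustration of what such a check can look like, the sketch below validates a sequence of turn roles against an allowed-transition table. The role names (`user`, `reasoning`, `tool_call`, `tool_output`, `answer`) and the table itself are assumptions made for this example, not the dataset's actual schema.

```python
# Hypothetical transition table: the role names and allowed moves are
# assumptions for illustration, not taken from the dataset.
ALLOWED = {
    "user": {"reasoning"},
    "reasoning": {"tool_call", "answer"},
    "tool_call": {"tool_output"},
    "tool_output": {"reasoning"},
    "answer": {"user"},
}


def first_invalid_transition(roles, start="user"):
    """Return the index of the first role that breaks the table, or None.

    An empty sequence, or one that does not begin with `start`, is
    reported as invalid at index 0.
    """
    if not roles or roles[0] != start:
        return 0
    for i in range(1, len(roles)):
        if roles[i] not in ALLOWED.get(roles[i - 1], set()):
            return i
    return None


def is_valid_trajectory(roles):
    """True when every consecutive pair of roles is an allowed transition."""
    return first_invalid_transition(roles) is None
```

Once the real role or state labels are known, only the `ALLOWED` table should need to change; the traversal logic stays the same.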
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- The description reports 149,025 trajectories, but the row count is not confirmed in the metadata, so it should be verified after download.
- Description metadata is limited; actual data quality requires manual inspection after download.
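Because field semantics have to be inferred from the data itself, a first pass over a handful of rows can help. The sketch below is one possible approach: `summarize_fields` reports which Python types appear under each field name, along with one sample value. The helper names are hypothetical. The optional `peek_remote` function assumes the Hugging Face `datasets` library is installed and that a `train` split exists, neither of which is confirmed by the description.

```python
from collections import defaultdict


def summarize_fields(rows):
    """Map each field name to the sorted type names seen and one sample value."""
    types = defaultdict(set)
    samples = {}
    for row in rows:
        for key, value in row.items():
            types[key].add(type(value).__name__)
            samples.setdefault(key, value)  # keep the first value seen
    return {key: (sorted(types[key]), samples[key]) for key in types}


def peek_remote(repo_id, n=5, split="train"):
    """Stream the first n rows from the Hub and summarize their fields.

    Assumes `pip install datasets` and that `split` exists for `repo_id`.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    stream = load_dataset(repo_id, split=split, streaming=True)
    return summarize_fields(stream.take(n))
```

Assuming the repository ID follows the usual `author/name` pattern, a call such as `peek_remote("AmanPriyanshu/tool-reasoning-sft-RESEARCH-explorations")` would stream a few rows without downloading the full dataset.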
Provenance
- Source: Derived from AmanPriyanshu/random-small-github-repositories and AmanPriyanshu/random-python-github-repositories.
- Collection Method: Agent search sessions navigating GitHub repositories using terminal commands.
- Time Range: Not specified.
- Freshness: Last updated 2026-04-02 20:08:26.
- Geography: Not specified.