Name: SkillRL-SFT-Data: Instruction-Output Pairs for Agent Training
Creator: Jianwen
Published: 2026-04-03T06:41:33
Keywords: Skill Hierarchy, Text, Reinforcement Learning, Decision Making, Instruction Following

Description

Instruction-output pairs for training base agent policies in interactive decision-making environments. The dataset was created by Jianwen for the SkillRL research paper and was last updated on 2026-04-12. Each example pairs a structured instruction with the corresponding expert action.

Use Cases

Supervised fine-tuning of language models for agent control based on the provided instruction-output pairs.
Training base policies for agents operating in ALFWorld, WebShop, and Search environments.
Research on hierarchical skill learning and skill transfer using the retrieved skill context from the SkillBank.
Benchmarking agent performance on instruction-following tasks in interactive decision-making settings.
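
As a rough illustration of the supervised fine-tuning use case, the sketch below turns one instruction-output pair into a prompt/completion pair. The field names `instruction` and `output`, the prompt template, and the sample record are all assumptions made for illustration; the column layout is not documented, so adjust them after inspecting the downloaded data.

```python
from typing import Dict

# Hypothetical field names: the real column names are undocumented.
INSTRUCTION_KEY = "instruction"
OUTPUT_KEY = "output"


def to_sft_pair(record: Dict[str, str]) -> Dict[str, str]:
    """Convert one record into a prompt/completion pair for SFT."""
    instruction = record[INSTRUCTION_KEY].strip()
    action = record[OUTPUT_KEY].strip()
    # Assumed template; the instruction is expected to already carry
    # the retrieved skill context, so nothing extra is added here.
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    return {"prompt": prompt, "completion": action}


# Invented placeholder record, not taken from the dataset.
example = {
    "instruction": " Task: put a mug on the desk. ",
    "output": "go to desk 1\n",
}
pair = to_sft_pair(example)
print(pair["prompt"] + pair["completion"])
```

Keeping the prompt and completion separate makes it easier to mask the prompt tokens out of the loss, which is a common choice in SFT pipelines.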

Strengths

Dataset is explicitly designed for and used in a published research paper (SkillRL).
Covers three distinct interactive decision-making environments: ALFWorld, WebShop, and Search.
Each example contains a structured instruction with retrieved skill context and an expert action as output.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which makes it harder to assess suitability before download.
Description metadata is limited; actual data quality requires manual inspection after download.
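
Because the row count and field semantics are undocumented, a first inspection pass after download might look like the sketch below. It assumes the data can be read as JSON Lines (one JSON object per line), which is not confirmed by the source; the sample rows and field names are invented placeholders.

```python
import json
from collections import Counter, defaultdict
from typing import Dict, Iterable


def summarize_records(lines: Iterable[str]) -> Dict[str, object]:
    """Count rows and tally the value types seen for each field."""
    n_rows = 0
    field_types = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        n_rows += 1
        for key, value in record.items():
            field_types[key][type(value).__name__] += 1
    return {
        "rows": n_rows,
        "fields": {key: dict(counts) for key, counts in field_types.items()},
    }


# Invented placeholder rows, not taken from the dataset.
sample = [
    '{"instruction": "find a red shirt", "output": "search[red shirt]"}',
    '{"instruction": "open the drawer", "output": "open drawer 1"}',
]
summary = summarize_records(sample)
print(summary)
```

A field whose type tally is smaller than the row count is missing from some records, which is worth checking before training.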

Provenance

Source: Hugging Face
Collection Method: Created for the paper 'SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning'.
Freshness: Last updated 2026-04-12 04:51:35; freshness should be verified.

License is unknown; users should verify licensing terms before use.
