Name: SkillRL SFT AlfWorld Prompt Completion: 7,486 Agent Trajectories
Creator: yananchen
Published: 2026-04-07T19:12:40
Keywords: Agent Evolution, Language Model Finetuning, Skill Rl, Text, Reinforcement Learning

Description

7,486 prompt-completion pairs for supervised fine-tuning (SFT) of language models, derived from the SkillRL paper on evolving agents via recursive skill-augmented reinforcement learning. The dataset covers 237 distinct AlfWorld tasks and was published by yananchen, with the last recorded update in April 2026. In every row, the output action appears in the admissible action list, so the data contains only valid agent actions.

Use Cases

Fine-tuning language models for agent control on AlfWorld task completions.
Training models with recursive skill augmentation, following the methodology of the SkillRL paper.
Benchmarking language model performance on sequential decision-making across the 237 distinct parsed tasks.
Studying action validity in agent trajectories, given that every output action matches the admissible action list.
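
For the benchmarking use case, one reasonable precaution is to split rows by task so that no task appears in both the training and the evaluation data. The sketch below assumes each row is a dict with a `task` field; that field name is hypothetical, since the dataset's columns are undocumented, and the hash-based split is an illustrative choice rather than anything prescribed by the SkillRL paper.

```python
import hashlib


def split_by_task(rows, test_fraction=0.2, task_key="task"):
    """Split rows into (train, test) so that each task lands in exactly one side.

    `task_key` is a hypothetical column name; adjust it after inspecting the data.
    """
    train, test = [], []
    for row in rows:
        # Hash the task name so the assignment is deterministic across runs.
        digest = hashlib.sha256(row[task_key].encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)
        (test if bucket < test_fraction else train).append(row)
    return train, test


# Toy rows for illustration only; they are not taken from the dataset.
example_rows = [
    {"task": f"task-{i % 10}", "prompt": "...", "completion": "..."}
    for i in range(50)
]
train_rows, test_rows = split_by_task(example_rows)
```

Because the split is keyed on the task name, the realized test share only approximates `test_fraction` when the number of tasks is small.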

Strengths

All 7,486 rows contain actions that exactly match the admissible action list, indicating high internal consistency.
The dataset is structured around 237 distinct tasks, providing variety in the training scenarios.
Contiguous runs of the same task have a mean length of 15.0 rows, suggesting that rows are grouped into multi-step sequences of related actions suited to skill learning.
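
These statistics can be rechecked after download. A minimal sketch, assuming each row exposes a task identifier, the output action, and the admissible action list under the hypothetical keys `task`, `action`, and `admissible` (the real column names are undocumented, and the values may need to be parsed out of the prompt text first):

```python
from itertools import groupby
from statistics import mean


def summarize(rows):
    """Compute row count, distinct tasks, admissible-action rate, and mean run length.

    The keys 'task', 'action', and 'admissible' are assumptions for illustration.
    """
    valid = sum(1 for r in rows if r["action"] in r["admissible"])
    # groupby only merges adjacent rows, so each group is one contiguous run of a task.
    run_lengths = [
        sum(1 for _ in group) for _, group in groupby(rows, key=lambda r: r["task"])
    ]
    return {
        "rows": len(rows),
        "distinct_tasks": len({r["task"] for r in rows}),
        "admissible_rate": valid / len(rows) if rows else 0.0,
        "mean_run_length": mean(run_lengths) if run_lengths else 0.0,
    }


# Toy rows for illustration only; they are not taken from the dataset.
toy_rows = [
    {"task": "put apple in fridge", "action": "go to fridge 1", "admissible": ["go to fridge 1", "look"]},
    {"task": "put apple in fridge", "action": "open fridge 1", "admissible": ["open fridge 1", "look"]},
    {"task": "heat mug", "action": "go to microwave 1", "admissible": ["go to microwave 1", "look"]},
    {"task": "heat mug", "action": "look", "admissible": ["look"]},
    {"task": "heat mug", "action": "fly away", "admissible": ["look"]},  # deliberately inadmissible
    {"task": "put apple in fridge", "action": "look", "admissible": ["look"]},
]
stats = summarize(toy_rows)
```

On the real data, an `admissible_rate` of 1.0 and a `mean_run_length` near 15.0 would agree with the figures listed above.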

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
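
Since field semantics have to be inferred, a quick first pass is to list each column with the value types it holds and one example value. The helper below is a hypothetical sketch that works on any list of row dicts; it makes no assumption about the dataset's actual column names.

```python
def describe_columns(rows, sample_size=100):
    """Return {column: {"types": set of type names, "example": first value seen}}."""
    summary = {}
    for row in rows[:sample_size]:
        for name, value in row.items():
            info = summary.setdefault(name, {"types": set(), "example": value})
            info["types"].add(type(value).__name__)
    return summary


# Toy rows for illustration only; the column names here are invented.
sample = [
    {"prompt": "You are in a kitchen.", "completion": "look", "step": 0},
    {"prompt": "You see a fridge.", "completion": "open fridge 1", "step": None},
]
columns = describe_columns(sample)
```

A column that mixes several types, such as `step` in the toy rows, is a hint that missing values or inconsistent encoding need attention before training.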

Provenance

Source: the paper "SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning".
Collection Method: Likely generated from agent trajectories in the AlfWorld simulation environment.
Freshness: Last updated 2026-04-10 01:13:19; verify against the source before relying on it.

License: Unknown, which may restrict usage.
