WebChain is a large-scale dataset of real-world web interaction trajectories for training and evaluating GUI and web agents. It contains 31,725 trajectories and 317,993 steps spanning 428 unique domains. Its core contribution is Triple Alignment, a supervision method for spatial grounding and long-horizon planning. The dataset was created by computer-use-agent-Lab and is documented in a paper from 2026.
Use Cases
- Training GUI agents on human-annotated web interaction trajectories
- Evaluating long-horizon planning using the dataset's structural and visual context alignment
- Developing models for spatial action grounding with Triple Alignment supervision
- Benchmarking web agent performance across 428 unique domains
Strengths
- Large scale with 31,725 annotated trajectories and 317,993 individual steps
- Broad coverage across 428 unique domains
- Provides Triple Alignment supervision for both spatial grounding and long-horizon planning
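As a rough sense of scale, the published totals can be turned into simple averages. The sketch below uses only the three figures quoted above; the function and key names are illustrative and not part of the dataset, and the averages say nothing about how steps or trajectories are actually distributed.

```python
# Totals quoted in the dataset description.
TRAJECTORIES = 31_725
STEPS = 317_993
DOMAINS = 428


def scale_summary(trajectories: int, steps: int, domains: int) -> dict[str, float]:
    """Return simple averages derived from the published totals."""
    return {
        "steps_per_trajectory": steps / trajectories,
        "trajectories_per_domain": trajectories / domains,
    }


summary = scale_summary(TRAJECTORIES, STEPS, DOMAINS)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
# steps_per_trajectory: 10.02
# trajectories_per_domain: 74.12
```

On average this is about 10 steps per trajectory and about 74 trajectories per domain.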
Limitations
- Column-level documentation is absent; field semantics must be inferred after download
- Row count of the published files is not reported (only trajectory and step totals are given), which may limit suitability assessment
- Data may reflect temporal or source bias inherent to the collection method
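Because field semantics are undocumented, a first step after download is usually to list which fields occur and what value types they carry. The sketch below shows one way to do that over records already loaded as Python dictionaries. The sample records and their field names (`trajectory_id`, `step_index`, `action`) are hypothetical placeholders, not the dataset's actual columns.

```python
from collections import defaultdict
from typing import Any, Iterable


def infer_schema(records: Iterable[dict[str, Any]]) -> dict[str, set[str]]:
    """Map each field name to the set of Python type names observed for it."""
    schema: dict[str, set[str]] = defaultdict(set)
    for record in records:
        for field, value in record.items():
            schema[field].add(type(value).__name__)
    return dict(schema)


# Hypothetical records for illustration only; the real field names are undocumented.
sample = [
    {"trajectory_id": "t0", "step_index": 0, "action": "click"},
    {"trajectory_id": "t0", "step_index": 1, "action": None},
]

for field, types in sorted(infer_schema(sample).items()):
    print(field, sorted(types))
```

A field that reports more than one type (here `action` is sometimes `None`) is a hint that it is optional or mixed, which is worth confirming before training on it.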
Provenance
- Source: computer-use-agent-Lab
- Collection Method: Human-annotated collection of real-world web interaction trajectories
- Time Range: not specified
- Freshness: Last updated 2026-04-14 07:10:24
- Geography: not specified