Available on 1 platform
A benchmark dataset for evaluating OpenClaw-style end-to-end agents on data analysis tasks. Every task is grounded in real-world data and has a single objective gold answer. The dataset was created by GTML-LAB and was last updated on April 30, 2026.
The license is unknown; verify the terms of use before using the dataset.