Available on 1 platform
ATBench-Claw provides a benchmark for evaluating safety in executable AI agent trajectories, focusing on critical decision points before actions like file deletion or code execution. Created by AI45Research, this dataset is an extension of ATBench and a companion to the AgentDoG diagnostic framework. It was last updated in March 2026.
The full description and data details are hosted externally on the Hugging Face dataset page; see that page for complete documentation and access.