Name: Agent Trajectory Safety Benchmark With Fine-Grained Annotations
Creator: AI45Research
Published: 2026-01-24T13:19:22
Keywords: library:polars, AI Safety, Trajectory Benchmark, Agent Safety, size_categories:n<1K, modality:text, Tool Use, arxiv:2601.18491, library:mlcroissant, library:datasets, Benchmark, library:pandas, Text, region:us, Multi-Turn Interaction, JSON, Agent, license:apache-2.0

Description

ATBench is a trajectory-level benchmark dataset for evaluating agent safety in realistic, long-horizon interactions. It contains 500 annotated execution trajectories, split evenly between safe and unsafe examples (250 each). Trajectories average 8.97 turns, and the dataset covers 1,575 unique tools. The benchmark was created by AI45Research and was last updated in January 2026.

Use Cases

Classify trajectories as safe or unsafe using features such as turn count and tool usage patterns.
Analyze the relationship between specific tool calls within a trajectory and the resulting safety annotation.
Benchmark the performance of safety classifiers on the provided 250 safe and 250 unsafe trajectory examples.
Study patterns in multi-turn interactions; trajectories average 8.97 turns each.
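
The use cases above can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's documented interface: the field names "turns", "tools", and "label", the four sample records, and the rule-based toy_classifier are all assumptions made up for the example, since the actual column names are not specified here.

```python
from collections import Counter
from statistics import mean

# Hypothetical records. The keys "turns", "tools", and "label" are assumed
# for illustration and may not match the real ATBench schema.
trajectories = [
    {"turns": 6, "tools": ["search", "read_file"], "label": "safe"},
    {"turns": 12, "tools": ["search", "send_email"], "label": "unsafe"},
    {"turns": 9, "tools": ["read_file"], "label": "safe"},
    {"turns": 11, "tools": ["delete_file", "shell"], "label": "unsafe"},
]


def summarize(records):
    """Return (average turn count, number of distinct tools)."""
    avg_turns = mean(r["turns"] for r in records)
    unique_tools = {tool for r in records for tool in r["tools"]}
    return avg_turns, len(unique_tools)


def tool_label_counts(records):
    """For each tool, count how many trajectories of each label call it."""
    counts = {}
    for r in records:
        for tool in set(r["tools"]):
            counts.setdefault(tool, Counter())[r["label"]] += 1
    return counts


def toy_classifier(record, risky_tools=frozenset({"delete_file", "shell"})):
    """Placeholder rule: flag a trajectory if it calls any 'risky' tool."""
    return "unsafe" if risky_tools & set(record["tools"]) else "safe"


def binary_metrics(y_true, y_pred, positive="unsafe"):
    """Accuracy, precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


avg_turns, n_tools = summarize(trajectories)
per_tool = tool_label_counts(trajectories)
labels = [r["label"] for r in trajectories]
predictions = [toy_classifier(r) for r in trajectories]
metrics = binary_metrics(labels, predictions)
print(avg_turns, n_tools, metrics)
```

Because the benchmark is balanced (250 safe, 250 unsafe), accuracy is a reasonable headline number, but reporting precision and recall on the unsafe class separately shows whether a classifier errs toward missed risks or false alarms.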

Strengths

Contains 500 annotated trajectories for a balanced initial evaluation.
Provides fine-grained, taxonomy-grounded safety annotations for precise analysis.
Includes 1,575 unique tools, offering diversity in agent action space.

Limitations

Limited to 500 total trajectories, which may be insufficient for training large models.
The geographic and application domain bias of the trajectories is unknown.
Relies on annotations, which may contain subjective judgments or label noise.
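
To put the 500-trajectory limit in context for evaluation, the sketch below computes a Wilson score interval for an accuracy measured on 500 examples. The figure of 450 correct predictions is a hypothetical input chosen for illustration, not a reported result, and the function name wilson_interval is likewise introduced here.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Hypothetical outcome: 450 of the 500 trajectories classified correctly.
low, high = wilson_interval(450, 500)
print(f"accuracy 0.900, 95% interval [{low:.3f}, {high:.3f}]")
```

At this sample size the interval spans several percentage points, so small differences between two classifiers scored on the benchmark may not be statistically meaningful.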

Provenance

Source: AI45Research via Hugging Face.
Collection Method: Annotated execution trajectories from agent interactions; specific collection method not detailed.
Freshness: Last updated 2026-01-27.

The full description is available on the dataset's Hugging Face page. Column names and data formats are not specified here; the keyword tags indicate an Apache 2.0 license.
