Available on 1 platform
Aegis AI released a benchmark in 2026 containing 2,288 multi-step agent trajectories for evaluating AI-agent governance verifiers. It includes 513 hand-authored gold-standard trajectories and 1,775 provenance-flagged augmented examples. The dataset is designed to score whether a verifier catches drift inside an agent's trajectory, not whether a prompt is harmful.
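To make the trajectory-level framing concrete, here is a minimal sketch of how a verifier might be scored on catching drift inside a trajectory. The dataset's actual schema and scoring protocol are not described here, so every name below (`Trajectory`, `drift_step`, `is_gold`, `drift_recall`, and the toy `keyword_verifier`) is a hypothetical stand-in rather than the benchmark's real format.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Trajectory:
    # All field names are assumptions for illustration, not the dataset's schema.
    steps: Sequence[str]        # ordered agent actions
    drift_step: Optional[int]   # index where drift begins; None if the trajectory is clean
    is_gold: bool               # True for hand-authored, False for augmented


# A verifier reads the steps and returns the index it flags, or None if it flags nothing.
Verifier = Callable[[Sequence[str]], Optional[int]]


def drift_recall(trajectories: Sequence[Trajectory], verifier: Verifier) -> float:
    """Fraction of drifting trajectories where the verifier flags a step
    at or after the labeled drift onset."""
    drifting = [t for t in trajectories if t.drift_step is not None]
    if not drifting:
        return 0.0
    caught = 0
    for t in drifting:
        flagged = verifier(t.steps)
        if flagged is not None and flagged >= t.drift_step:
            caught += 1
    return caught / len(drifting)


def keyword_verifier(steps: Sequence[str]) -> Optional[int]:
    """Toy verifier: flags the first step that mentions 'delete'."""
    for i, step in enumerate(steps):
        if "delete" in step:
            return i
    return None


sample = [
    Trajectory(["read file", "summarize", "delete logs"], drift_step=2, is_gold=True),
    Trajectory(["read file", "email report"], drift_step=None, is_gold=True),
    Trajectory(["search", "export contacts"], drift_step=1, is_gold=False),
]

recall = drift_recall(sample, keyword_verifier)
gold_only = [t for t in sample if t.is_gold]
```

Scoring the gold subset separately from the augmented subset (as with `gold_only` above) is one way to keep results on hand-authored trajectories distinct from results on provenance-flagged ones.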
The license is unknown; verify the terms before use.