Available on 1 platform
SODA evaluates how conversation depth affects agent safety across 16 tool-use environments with 80 scenarios. Each task places a harmful request at a controlled depth from D=0 to D=20, preceded by regular agentic tasks. The dataset was created by author 'cesun' and was last updated on 2026-06-12.
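The depth-controlled placement described above can be sketched as follows. This is a minimal illustration, not the dataset's actual schema or loader: the `Turn` class, the `build_conversation` function, and the reading of depth D as "number of regular tasks preceding the harmful request" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """Hypothetical container for one user request (not the SODA schema)."""
    role: str
    content: str


def build_conversation(regular_tasks: List[str], harmful_request: str, depth: int) -> List[Turn]:
    """Place the harmful request after `depth` regular tasks.

    Assumption: depth D means D regular agentic tasks precede the request,
    so D=0 puts the harmful request first.
    """
    if depth < 0 or depth > len(regular_tasks):
        raise ValueError("depth must be between 0 and the number of regular tasks")
    turns = [Turn("user", task) for task in regular_tasks[:depth]]
    turns.append(Turn("user", harmful_request))
    return turns
```

Under these assumptions, calling `build_conversation(tasks, request, 0)` yields a conversation containing only the harmful request, while larger depths prepend that many regular tasks.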
The license is unknown; verify the terms of use before using the dataset.