SightAct-Bench is a synthetic benchmark of 14 task families for evaluating the safety of browser agents powered by vision-language models. The benchmark, whose authors are listed as "SightAct, Bench", is hosted on Harvard Dataverse and was last updated on 2026-05-04. It tests whether agents safely handle requests for task-relevant sensitive information when a visually suspicious interaction is embedded in the workflow.
The license is unknown; check the licensing terms before use.