Available on 1 platform
CentificAIResearch's benchmark evaluates the safety and robustness of email classification systems under adversarial conditions. It consists of two complementary datasets: one assesses the classification accuracy of models, and the other assesses the reliability of LLM-based graders. The dataset was last updated on June 22, 2026.
The license is unknown; verify the terms of use before using the dataset.