A multi-domain reasoning dataset built to improve frontier models by exposing their failures and turning expert grading into a training signal. The dataset pairs self-contained tasks with weighted rubrics across three domains: Computer Science, Data Science, and Chemistry. It was created by TuringEnterprises and last updated on 2026-06-16.
The license is unknown; verify the terms of use before using the dataset.