SciConBench is a large-scale, live benchmark for evaluating AI agents on open-domain scientific conclusion synthesis. The dataset, created by hayoungjung, focuses on the long-horizon task of retrieving and assessing evidence from the open web to produce expert-level scientific conclusions. It was last updated on June 11, 2026.
According to the dataset description, access gating was added to prevent bots from scraping the data, so access may be restricted.