Available on 1 platform
AstralBench is a curated collection of 50 mathematical problems, selected from sources such as IMO AnswerBench, Project Euler, and Putnam, for benchmarking AI model performance. The dataset was created by nguyen599 and last updated on 2026-03-25; it covers diverse topics and difficulty levels. Reported model accuracy on these problems currently ranges from 5% to 30%.
The license is unknown; verify the terms of use before using the dataset.