FINAL Bench is a functional metacognitive reasoning benchmark for large language models, released by FINAL-Bench in early 2026 and containing fewer than 1,000 records. Rather than scoring only final-answer accuracy, it measures a model's ability to identify gaps in its own knowledge and to recover from its errors.
The dataset is provided in JSON format and is compatible with the Hugging Face datasets library, pandas, and polars.
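As a rough illustration of the pandas route, the sketch below reads a JSON file of records into a DataFrame. The file name `sample.json`, the field names `id` and `prompt`, and the helper `load_records` are all hypothetical stand-ins, since the listing does not describe the dataset's actual schema or file layout. The example also assumes the file holds a single top-level JSON array of objects.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_records(path):
    """Read a JSON file containing a top-level array of objects into a DataFrame.

    Hypothetical helper for illustration; not part of any published loader.
    """
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    return pd.DataFrame(records)


# Stand-in data: these field names are assumptions, not the real schema.
sample_records = [
    {"id": "demo-001", "prompt": "placeholder question A"},
    {"id": "demo-002", "prompt": "placeholder question B"},
]

# Write the stand-in records to a temporary file, then load them back.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample.json"
    sample_path.write_text(json.dumps(sample_records), encoding="utf-8")
    frame = load_records(sample_path)

print(frame.shape)
```

With the real file, only the path passed to `load_records` would change. If the data turns out to be line-delimited JSON rather than a single array, `pd.read_json(path, lines=True)` would be the closer fit.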