Available on 1 platform
A benchmark dataset from Kaggle for evaluating the cognitive calibration of large language models, that is, how well a model's stated confidence matches its actual accuracy. It likely contains tasks or prompts designed to measure that alignment. Its size, structure, and creation details are not documented here and should be verified after download.
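To make "confidence aligns with accuracy" concrete, here is a minimal sketch of expected calibration error (ECE), one common way to score calibration. It is an illustration only: the dataset's actual fields and scoring method are unknown, and the function name, the equal-width binning, and the sample values below are all assumptions.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between mean confidence and accuracy per bin.

    Hypothetical helper: assumes each model answer comes with a confidence
    in [0, 1] and a flag saying whether the answer was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    # Group predictions into equal-width confidence bins.
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence values must lie in [0, 1]")
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes its |confidence - accuracy| gap, weighted by size.
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Made-up values for illustration; not taken from the dataset.
sample_confidences = [0.8, 0.8, 0.2, 0.2]
sample_correct = [True, False, False, False]
sample_ece = expected_calibration_error(sample_confidences, sample_correct)
```

A score of 0 means stated confidence matches observed accuracy in every bin; larger values indicate over- or under-confidence. The bin count is a free parameter, and results on small samples can shift noticeably with it.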
License restrictions are unknown; review terms before use.