Meta Cognitive Calibration Benchmark for Large Language Models | DataSalon

Neuroscience

Available on 1 platform

Description

A benchmark dataset hosted on Kaggle for evaluating the metacognitive calibration of large language models (LLMs), that is, how well a model's stated confidence aligns with its actual accuracy. The dataset likely contains tasks or prompts designed to measure this alignment. Its size, structure, and creation details are not documented in the listing and should be verified after download.

Use Cases

Benchmarking LLM confidence calibration on reasoning tasks (inferred from domain, verify after download)
Analyzing the relationship between model uncertainty and answer correctness (inferred from domain, verify after download)
Training or fine-tuning models for improved self-assessment capabilities (inferred from domain, verify after download)
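
One common way to quantify the confidence-accuracy alignment behind these use cases is expected calibration error (ECE). The following is a minimal sketch, not the benchmark's own scoring method. It assumes each record provides a confidence in [0, 1] and a correctness flag, which this listing does not confirm. The function name and the sample values are hypothetical.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE: weighted mean of |accuracy - mean confidence| per bin."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    # Per bin: [sum of confidences, number of correct answers, count]
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError(f"confidence out of range: {conf}")
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += conf
        bins[idx][1] += int(ok)
        bins[idx][2] += 1

    total = len(confidences)
    ece = 0.0
    for conf_sum, n_correct, count in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        gap = abs(n_correct / count - conf_sum / count)
        ece += (count / total) * gap
    return ece


# Hypothetical values for illustration only; not taken from the dataset.
sample_conf = [0.9, 0.9, 0.6, 0.6]
sample_ok = [True, False, True, True]
print(round(expected_calibration_error(sample_conf, sample_ok), 3))
```

A value near 0 means stated confidence tracks accuracy closely; larger values indicate over- or under-confidence. The result depends on the number of bins, so comparisons between models should use the same `n_bins`.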

Strengths

Published on Kaggle, a major platform for data science resources.
Platform tags indicate a focus on LLM benchmarking and cognitive science.

Limitations

Metadata is minimal; actual content requires verification after download.
Row count, column definitions, and file formats are unknown.
License, author, and last update information are unavailable.
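
Because the listing gives no row count, column definitions, or file format, a reasonable first step after download is to inspect the files directly. The sketch below uses only the Python standard library and assumes the download contains at least one UTF-8 encoded CSV file with a header row, which is unconfirmed. The function names `summarize_csv` and `summarize_directory` are hypothetical.

```python
import csv
from pathlib import Path


def summarize_csv(path: Path) -> dict:
    """Return the column names and data-row count of one CSV file."""
    # Assumption: UTF-8 encoding and a header in the first row.
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return {"file": path.name, "columns": header, "rows": row_count}


def summarize_directory(root: Path) -> list[dict]:
    """Summarize every *.csv file found under root, searching recursively."""
    return [summarize_csv(p) for p in sorted(root.rglob("*.csv"))]
```

Calling `summarize_directory` on the folder where the dataset was extracted returns one dictionary per CSV file. If the result is empty, the dataset probably ships in another format (for example JSON or Parquet) and needs a different reader.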

Provenance

Source: Kaggle
Collection Method: Unknown
Time Range: Unknown
Freshness: Last update date is unknown; freshness unverified.
Geography: Unknown

License restrictions are unknown; review terms before use.

Tags: Text Tabular LLM Benchmark AI Evaluation Benchmark Cognitive Science

Quality Score

D16
