Name: BonaFide: Ground-Truth Faithfulness Labels for Chain-of-Thought Reasoning
Creator: yoavgurarieh
Published: 2026-05-13T06:48:40
Keywords: NLP Evaluation, Chain of Thought, Benchmark, Text, Faithfulness Evaluation

Description

A dataset containing ground-truth faithfulness labels for chains of thought (CoTs), intended for evaluating CoT faithfulness metrics. The dataset was created by yoavgurarieh and last updated on May 16, 2026. It is constructed from tasks whose outputs reveal which intermediate computations must have produced them; CoTs are then labeled against those computations.

Use Cases

Benchmarking chain-of-thought faithfulness metrics based on the provided ground-truth labels.
Training or fine-tuning models to detect unfaithful reasoning based on the labeled CoT data.
Analyzing the impact of misleading hints on reasoning faithfulness in the dataset's diversionary setting.
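
The benchmarking use case can be sketched as follows: a faithfulness metric assigns each CoT a score, and the scores are compared against the ground-truth labels. The example below is a minimal sketch, not the authors' evaluation protocol. It assumes binary labels (1 = faithful, 0 = unfaithful) and uses AUROC as the agreement measure; both are assumptions, and the toy labels and scores are invented for illustration rather than taken from the dataset.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a faithful CoT outscores an unfaithful one.

    Ties count as half a win. Assumes labels are 1 (faithful) or
    0 (unfaithful); this encoding is hypothetical, not documented.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: ground-truth labels and a metric's scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]

print(round(auroc(labels, scores), 3))  # 0.889
```

A value near 1.0 means the metric ranks faithful CoTs above unfaithful ones, while 0.5 is chance level. The pairwise loop is quadratic in the number of examples, which is fine for a sketch but worth replacing with a rank-based computation on large splits.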

Strengths

Provides ground-truth labels specifically for evaluating chain-of-thought faithfulness.
Constructed using a methodology where task outputs reveal required intermediate computations.
Includes a diversionary setting with misleading hints to test robustness.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
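
Because column documentation and row count are unavailable, a first inspection pass after download might look like the sketch below. It works on any iterable of dict-like records and reports the row count, the value types seen per column, and how many values are None. The field names "cot", "label", and "hint" are hypothetical placeholders, not the dataset's actual schema.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping, Tuple


def summarize_columns(
    rows: Iterable[Mapping[str, Any]],
) -> Tuple[int, Dict[str, Dict[str, Any]]]:
    """Return (row_count, per-column summary) for a sequence of records."""
    summary: Dict[str, Dict[str, Any]] = defaultdict(
        lambda: {"types": set(), "missing": 0, "example": None}
    )
    total = 0
    for row in rows:
        total += 1
        for name, value in row.items():
            info = summary[name]
            if value is None:
                info["missing"] += 1  # count explicit None values only
                continue
            info["types"].add(type(value).__name__)
            if info["example"] is None:
                info["example"] = value  # keep the first non-None value
    return total, dict(summary)


# Hypothetical records; the real field names must be read from the data.
rows = [
    {"cot": "Step 1: ...", "label": 1, "hint": None},
    {"cot": "Step 1: ...", "label": 0, "hint": "The answer is B."},
    {"cot": "Step 1: ...", "label": 1, "hint": None},
]

total, report = summarize_columns(rows)
print(total)
for name, info in report.items():
    print(name, sorted(info["types"]), "missing:", info["missing"])
```

The same function can be pointed at the downloaded records once they are loaded, which gives a quick view of the schema before any field semantics are assumed.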

Provenance

Source: Hugging Face
Collection Method: Constructed from tasks whose outputs reveal required intermediate computations, with CoTs labeled against those computations.
Time Range: Not specified
Freshness: Last updated 2026-05-16 08:39:03; freshness should be verified.
Geography: Not specified

License is unknown; restrictions should be verified before use.
