Description
Honesty-Scratchpad-600x is a dataset containing 600 high-quality synthetic examples. It is designed to train language models, particularly smaller ones with 1B to 8B parameters, to become more truthful, recognize knowledge boundaries, and use an internal verification scratchpad before answering. The dataset was created by Aadeshisdoingsomething and was last updated on June 3, 2026.
Use Cases
Fine-tuning language models for improved truthfulness using the structured verification examples.
Training models to recognize and admit knowledge boundaries, following the dataset's core mechanic.
Teaching models to run an internal verification routine before answering, following the described query-and-verification structure.
Strengths
Contains 600 high-quality synthetic examples.
Specifically designed for training smaller language models in the 1B to 8B parameter range.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not confirmed by the dataset metadata; the figure of 600 examples comes from the description and should be checked after download.
The dataset is synthetic, which may limit its applicability to real-world knowledge verification.
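Because the columns are undocumented and the row count is unconfirmed, a first inspection pass after download can help. The following is a minimal sketch, not part of the dataset's tooling: `summarize_columns` is a hypothetical helper, and the field names `query`, `scratchpad`, and `answer` in the sample rows are assumptions made for illustration only, since the real schema is not documented here.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def summarize_columns(rows: Iterable[Mapping[str, Any]]) -> dict:
    """Count rows and record, per column, the value types seen and mean text length."""
    row_count = 0
    type_counts: dict[str, Counter] = {}
    text_lengths: dict[str, list[int]] = {}
    for row in rows:
        row_count += 1
        for name, value in row.items():
            # Tally the Python type of every value seen in this column.
            type_counts.setdefault(name, Counter())[type(value).__name__] += 1
            if isinstance(value, str):
                text_lengths.setdefault(name, []).append(len(value))
    columns = {}
    for name, counts in type_counts.items():
        lengths = text_lengths.get(name, [])
        columns[name] = {
            "types": dict(counts),
            "present_in": sum(counts.values()),  # rows that contain this column
            "mean_chars": sum(lengths) / len(lengths) if lengths else None,
        }
    return {"row_count": row_count, "columns": columns}


# Hypothetical rows for illustration only; the real field names are undocumented.
sample_rows = [
    {"query": "Who won the 2031 World Cup?",
     "scratchpad": "I have no data on 2031.",
     "answer": "I don't know."},
    {"query": "What is 2 + 2?",
     "scratchpad": "Basic arithmetic.",
     "answer": "4"},
]

summary = summarize_columns(sample_rows)
print("rows:", summary["row_count"])
for name, info in summary["columns"].items():
    print(name, info)
```

On the real data, the same helper could be given the rows of a split loaded with the Hugging Face `datasets` library, which yields each row as a dictionary. Comparing the reported `row_count` against the stated 600 examples would settle the row-count question, and the per-column type and length summary gives a starting point for guessing which field holds the query, the scratchpad, and the final answer.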
Provenance
Source
Aadeshisdoingsomething on Hugging Face.
Collection Method
Synthetically generated.
Freshness
Last updated 2026-06-03 22:58:52; freshness should be verified.
License
License is unknown; terms of use must be verified before the dataset is used.